building vision-language models on solid foundations with masked distillation - cvpr 2024
Published 1 month ago • 9 plays • Length 4:56Download video MP4
Download video MP3
Similar videos
-
7:49
[cvpr 2024] asymmetric masked distillation for pre-training small foundation models
-
5:01
[cvpr 2024] partdistill: 3d shape part segmentation by vision-language model distillation
-
4:57
lov: language models as black-box optimizers for vision-language models (cvpr 2024)
-
3:54:28
mllm series tutorial @ cvpr 2024
-
5:00
[cvpr 2024] unsamflow: unsupervised optical flow guided by segment anything model
-
50:19
[cvpr24 vision foundation model tutorial] large multimodal models by chunyuan li
-
5:07
[cvpr 2024] eclipse: efficient continual learning in panoptic segmentationwith visual prompt tuning
-
4:55
improved zero-shot classification by adapting vlms with text descriptions [cvpr 2024]
-
5:01
[cvpr 2024 oral] memsam: taming segment anything model for echocardiography video segmentation
-
6:51
diffsci (cvpr 2024)
-
5:58
one-shot open affordance learning with foundation models (cvpr 2024)
-
5:00
visual program distillation (5 min intro for cvpr 2024)
-
5:19
[cvpr 2024] question aware vision transformer for multimodal reasoning
-
6:05
mosaic-sdf for 3d generative models (cvpr 2024)
-
5:35
[cvpr 2024] ba-sam: scalable bias-mode attention mask for segment anything model
-
9:00
dense vision transformer compression with few samples | cvpr 2024
-
2:15
[cvpr 2024] dreamvideo: composing your dream videos with customized subject and motion
-
5:00
cvpr 2024 | inseg
-
6:20
cvpr 2024 paper: argue: attribute-guided prompt tuning for vision-language models (tian et al.)
-
5:02
label propagation for zero-shot classification with vision-language models - cvpr2024