VLM 5
- [Paper Review] SigLIP
- [Paper Review] 🦩 Flamingo: a Visual Language Model for Few-Shot Learning
- [Paper Review with Code] ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
- [Paper Review] Transfusion
- [Paper Review] REACT: Learning Customized Visual Models with Retrieval-Augmented Knowledge