[Paper Review] DiffSplat
[Paper Review] DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation. Chenguo Lin,...
[Paper Review] ResCLIP: Residual Attention for Training-free Dense Vision-language Inference. Yuhang Yang, Jinhong Deng...
[Paper Review] Escaping Plato’s Cave: Towards the Alignment of 3D and Text Latent Spaces. Souhail Hadgi, Luca Moschella, And...
[Paper Review] AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling. Jun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhan...
[Paper Review] SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis. Hyojun Go, Byeongjun Par...
[Paper Review] UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics. Xi Chen, Zhifei Zhan...
[Paper Review] Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs. Shengbang Tong, Zhuang Liu, Yuexiang Zhai, Y...
[Paper Review] CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion. Kai He, Chin-Hsuan Wu, Igo...
[Paper Review] Flow Matching in Latent Space. Quan Dao, Hao Phung. CVPR 2023 [Arxiv] [Github] Flow Matching is a generative modeling algorithm that is relatively easy to train compared to diffusion while still delivering strong performance...
[Paper Review] Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model. Chunting Zhou, Lil...