SNCE: Geometry-Aware Supervision for Scalable Discrete Image Generation • Paper • 2603.15150 • Published 3 days ago
Lavida-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation • Paper • 2509.19244 • Published Sep 23, 2025 • 12
Towards Visual Text Grounding of Multimodal Large Language Model • Paper • 2504.04974 • Published Apr 7, 2025 • 17
LRM: Large Reconstruction Model for Single Image to 3D • Paper • 2311.04400 • Published Nov 8, 2023 • 52