LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories Paper • 2604.15311 • Published 9 days ago • 12
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published 12 days ago • 70
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Paper • 2502.18364 • Published Feb 25, 2025 • 36
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing Paper • 2502.17258 • Published Feb 24, 2025 • 79
CountGD_Multi-Modal_Open-World_Counting: Count objects in images using text and example boxes • Running on T4 • Featured • 122