Shi Minglei

MingleiShi

AI & ML interests

None yet

Recent Activity

updated a model 19 days ago

KlingTeam/SVG-T2I

upvoted a paper 19 days ago

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

authored a paper 22 days ago

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Organizations

upvoted a paper 19 days ago

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

Paper • 2512.12675 • Published 22 days ago • 40

upvoted a paper 22 days ago

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Paper • 2512.11749 • Published 24 days ago • 38

upvoted a paper 3 months ago

Latent Diffusion Model without Variational Autoencoder

Paper • 2510.15301 • Published Oct 17, 2025 • 49

upvoted 2 papers 6 months ago

NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining

Paper • 2507.14119 • Published Jul 18, 2025 • 58

Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning

Paper • 2507.16746 • Published Jul 22, 2025 • 35

upvoted a paper 7 months ago

UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting

Paper • 2506.09952 • Published Jun 11, 2025 • 6

upvoted 3 papers 8 months ago

upvoted 3 papers 9 months ago

Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

Paper • 2503.24379 • Published Mar 31, 2025 • 76

Wan: Open and Advanced Large-Scale Video Generative Models

Paper • 2503.20314 • Published Mar 26, 2025 • 56

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26, 2025 • 168

upvoted 4 papers 10 months ago

Position: Interactive Generative Video as Next-Generation Game Engine

Paper • 2503.17359 • Published Mar 21, 2025 • 61

DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

Paper • 2503.14487 • Published Mar 18, 2025 • 28

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published Mar 14, 2025 • 146

MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Paper • 2503.07365 • Published Mar 10, 2025 • 61

upvoted 2 papers 11 months ago

ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

Paper • 2502.09696 • Published Feb 13, 2025 • 43

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 433

upvoted a collection about 1 year ago

Daily Papers

Collection

1 item • Updated Oct 26, 2023 • 82

upvoted a paper over 1 year ago

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27, 2024 • 96
