MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale • Paper • 2604.04771 • Published 2 days ago • 89
The Trinity of Consistency as a Defining Principle for General World Models • Paper • 2602.23152 • Published Feb 26 • 201
AICC: Parse HTML Finer, Make Models Better -- A 7.3T AI-Ready Corpus Built by a Model-Based HTML Parser • Paper • 2511.16397 • Published Nov 20, 2025 • 11
Intern-S1: A Scientific Multimodal Foundation Model • Paper • 2508.15763 • Published Aug 21, 2025 • 273