E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning — arXiv:2409.06679 — Published Sep 10, 2024
CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models — arXiv:2410.06741 — Published Oct 9, 2024
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions — arXiv:2410.06577 — Published Oct 9, 2024
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs — arXiv:2503.05139 — Published Mar 7, 2025
Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM — arXiv:2503.17793 — Published Mar 22, 2025
CAKE: Cascading and Adaptive KV Cache Eviction with Layer Preferences — arXiv:2503.12491 — Published Mar 16, 2025
Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks — arXiv:2505.16901 — Published May 22, 2025
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts — arXiv:2508.07785 — Published Aug 11, 2025
MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs — arXiv:2508.05257 — Published Aug 7, 2025
MultiEdit: Advancing Instruction-based Image Editing on Diverse and Challenging Tasks — arXiv:2509.14638 — Published Sep 18, 2025
dInfer: An Efficient Inference Framework for Diffusion Language Models — arXiv:2510.08666 — Published Oct 9, 2025
CreditDecoding: Accelerating Parallel Decoding in Diffusion Large Language Models with Trace Credits — arXiv:2510.06133 — Published Oct 7, 2025
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models — arXiv:2511.23319 — Published Nov 2025
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows — arXiv:2512.05150 — Published Dec 2025
LLaDA2.0: Scaling Up Diffusion Language Models to 100B — arXiv:2512.15745 — Published Dec 2025
Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code — arXiv:2311.07989 — Published Nov 14, 2023
MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning — arXiv:2311.02303 — Published Nov 4, 2023