arxiv:2602.04705
Junyuan Shang
sjy1203
AI & ML interests
NLP
Recent Activity
authored a paper about 20 hours ago
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
authored a paper about 20 hours ago
NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
authored a paper about 20 hours ago
ERNIE 5.0 Technical Report
Organizations
None yet