Tony Congqian Wang
TonyCWang
AI & ML interests
None yet
Recent Activity
upvoted an article about 2 months ago
The Optimal Architecture for Small Language Models
upvoted a paper 3 months ago
TiDAR: Think in Diffusion, Talk in Autoregression
upvoted an article 3 months ago
Why Did MiniMax M2 End Up as a Full Attention Model?
Organizations
None yet