Andrea Pierleoni
andreapie
AI & ML interests
None yet
Organizations
None yet
LLM Training
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper • 2501.09686 • Published • 41
Optimizing Large Language Model Training Using FP4 Quantization
Paper • 2501.17116 • Published • 36
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Paper • 2502.02508 • Published • 22
On Teacher Hacking in Language Model Distillation
Paper • 2502.02671 • Published • 18
LLM Inference
models
0
None public yet
datasets
0
None public yet