Nandan Kumar Jha
High-Dimensional Learning Dynamics
A Random Matrix Theory Perspective on the Learning Dynamics of Multi-head Latent Attention
Uses random-matrix tools to analyze how multi-head latent attention evolves during training, revealing capacity bottlenecks and representation-geometry shifts.
Nandan Kumar Jha, Brandon Reagen