NerVE: Nonlinear Eigenspectrum Dynamics in LLM Feed-Forward Networks

Abstract

Feed-forward networks are a major source of nonlinear transformation in modern language models, yet their internal representation dynamics are often summarized only through loss or activation statistics. NerVE introduces an eigenspectrum-based framework for analyzing how nonlinearities reshape representation geometry across transformer feed-forward layers. By tracking spectral entropy, participation ratio, effective rank, and divergence-based measures before and after nonlinear transformations, the framework reveals how representation capacity is amplified, compressed, or reorganized across layers and scales.
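The abstract names several spectral measures without defining them. As a rough illustration only, the sketch below computes them from the eigenvalues of an activation covariance matrix, using common textbook definitions: Shannon entropy of the normalized eigenvalues, participation ratio as (Σλ)² / Σλ², and effective rank as the exponential of the spectral entropy. These definitions, the use of Jensen-Shannon divergence as the divergence-based measure, the ReLU nonlinearity, and every function name are assumptions made for this example; the paper's own formulations may differ.

```python
import numpy as np


def eigenspectrum(acts):
    """Normalized covariance eigenvalues of activations (rows = tokens, cols = features)."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (acts.shape[0] - 1)
    eigvals = np.linalg.eigvalsh(cov)        # symmetric matrix, so real eigenvalues
    eigvals = np.clip(eigvals, 0.0, None)    # remove tiny negative values from round-off
    return eigvals / eigvals.sum()           # normalize so the spectrum sums to 1


def spectral_entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a normalized spectrum."""
    p = p[p > eps]
    return float(-(p * np.log(p)).sum())


def participation_ratio(p):
    """(sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    return float(p.sum() ** 2 / (p ** 2).sum())


def effective_rank(p):
    """Exponential of the spectral entropy (assumed definition)."""
    return float(np.exp(spectral_entropy(p)))


def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two spectra, each sorted in descending order."""
    p, q = np.sort(p)[::-1], np.sort(q)[::-1]
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > eps
        return (a[mask] * np.log(a[mask] / b[mask])).sum()

    return float(0.5 * kl(p, m) + 0.5 * kl(q, m))


# Hypothetical usage: compare spectra before and after a ReLU on random data.
rng = np.random.default_rng(0)
pre = rng.standard_normal((512, 16))         # stand-in for pre-activation features
post = np.maximum(pre, 0.0)                  # stand-in nonlinearity (ReLU)
p_pre, p_post = eigenspectrum(pre), eigenspectrum(post)
print("effective rank (pre, post):", effective_rank(p_pre), effective_rank(p_post))
print("participation ratio (pre, post):", participation_ratio(p_pre), participation_ratio(p_post))
print("JS divergence:", js_divergence(p_pre, p_post))
```

Under these definitions, a perfectly flat spectrum over d directions gives a participation ratio and effective rank of d, while a spectrum concentrated in one direction gives 1, so both can be read as a soft count of the dimensions in use.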

Publication
International Conference on Learning Representations 2026
Nandan Kumar Jha
Ph.D., New York University · Representation Learning, Scaling Laws, and High-Dimensional Learning Dynamics

I study nonlinear representation dynamics in large language models, focusing on how nonlinearities, architecture, and optimization jointly shape representational geometry, scaling behavior, and usable computational capacity.
