Projects
Nonlinear feature learning in shallow neural networks
- Neural networks (NNs) are widely used, but the reasons behind their success remain an active area of research. I am interested in how the first few steps of gradient descent lead to feature learning. I have looked at the distributions of singular values of the gradients and of the updated (inner) weight matrix for various parametrizations (see here) of two-layer NNs in the proportional scaling regime. This paper is a good starting point for understanding this line of research. More recently, I began working with Rishi Sonthalia and Guido Montúfar on a follow-up to this paper. We are studying low-rank structure in the gradients of the training loss for two-layer NNs and aim to characterize the hidden features.
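
The kind of computation involved can be sketched as follows: form the gradient of the training loss with respect to the inner weight matrix of a two-layer network and inspect its singular value spectrum. This is a minimal NumPy sketch with placeholder random data, a ReLU activation, squared loss, and arbitrarily chosen sizes; it is not the paper's setup, just an illustration of the object being studied.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n samples, d inputs, m hidden units
# (in the proportional regime, n, d, m grow together).
n, d, m = 512, 256, 256

X = rng.standard_normal((n, d)) / np.sqrt(d)  # inputs
y = rng.standard_normal(n)                    # targets (placeholder data)

W = rng.standard_normal((m, d))               # inner (first-layer) weights
a = rng.standard_normal(m) / np.sqrt(m)       # outer weights

# Two-layer net f(x) = a^T relu(W x), mean squared loss.
Z = X @ W.T                 # pre-activations, shape (n, m)
H = np.maximum(Z, 0.0)      # hidden-layer features
resid = H @ a - y           # residuals, shape (n,)

# Gradient of the loss w.r.t. W:
# dL/dW[j, k] = (1/n) * sum_i resid[i] * a[j] * 1[Z[i, j] > 0] * X[i, k]
G = ((resid[:, None] * (Z > 0.0)) * a[None, :]).T @ X / n

# Singular value spectrum of the gradient: a few large "spike"
# singular values above the bulk would indicate low-rank structure.
s = np.linalg.svd(G, compute_uv=False)
print(s[:5] / s[0])  # leading singular values, normalized
```

In practice one would compare this spectrum at initialization against the spectrum after one or a few gradient steps, and across different parametrizations, to see how the spikes emerge.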