I am a PhD candidate at the Center for Cybersecurity, New York University (NYU), advised by Prof. Brandon Reagen. My research lies at the intersection of deep learning and applied cryptography (homomorphic encryption and multiparty computation), with a focus on cryptographically secure privacy-preserving machine learning (PPML). As part of the DPRIVE project, I develop novel architectures and algorithms to optimize neural network computations on encrypted data.
In the early stages of my PhD, I led the design of nonlinear-efficient CNNs, introducing ReLU-optimization techniques (DeepReDuce, ICML'21) and methods for redesigning existing CNNs for private inference efficiency (DeepReShape, TMLR'24), including a family of architectures called HybReNets.
My current research focuses on making private LLM inference more practical through architectural optimizations and algorithmic innovations. Specifically, we examine the functional role of nonlinearities from an information-theoretic perspective and develop the AERO framework, which designs nonlinearity-reduced architectures with entropy-guided attention mechanisms. Our preliminary findings have been accepted to PPAI@AAAI'25 and ATTRIB@NeurIPS'24.
Recent talks: We presented our work Entropy and Private Language Models at the NYU CILVR Seminar, and Entropy-Guided Attention for Private LLMs on the AI Fireside Chat.
Besides research, I have served as an (invited) reviewer for NeurIPS (2023, 2024), ICML (2024, 2025), ICLR (2024, 2025), TMLR (2025), AISTATS (2025), CVPR (2024, 2025), ICCV (2025), and AAAI (2025).
I am currently on the job market, graduating in Fall 2025, and seeking research scientist roles at the intersection of LLM science, architectural optimization, and privacy-preserving AI. Feel free to reach out!
Ph.D. in Privacy-preserving Deep Learning, 2020 - present
New York University
M.Tech. (Research Assistant) in Computer Science and Engineering, 2017 - 2020
Indian Institute of Technology Hyderabad
B.Tech. in Electronics and Communication Engineering, 2009 - 2013
National Institute of Technology Surat
We introduce an information-theoretic framework to characterize the role of nonlinearities in decoder-only language models, laying a principled foundation for optimizing transformer architectures tailored to the demands of Private Inference (PI). By leveraging Shannon's entropy as a quantitative measure, we uncover the previously unexplored dual significance of nonlinearities: beyond ensuring training stability, they are crucial for maintaining attention head diversity. Specifically, we find that their removal triggers two critical failure modes: entropy collapse in deeper layers, which destabilizes training, and entropic overload in earlier layers, which leads to under-utilization of Multi-Head Attention's (MHA) representational capacity. We propose an entropy-guided attention mechanism paired with a novel entropy regularization technique to mitigate entropic overload. Additionally, we explore inference-efficient alternatives to layer normalization for preventing entropy collapse and stabilizing the training of LLMs with reduced nonlinearities. Our study bridges the gap between information theory and architectural design, establishing entropy dynamics as a principled guide for developing efficient PI architectures.
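To make the entropy diagnostic concrete, here is a minimal NumPy sketch of the kind of measurement described above: computing the Shannon entropy of each attention head's softmax rows. The function name, array shapes, and thresholds are illustrative assumptions, not the paper's implementation; it only shows how near-zero entropy corresponds to collapse-like behavior and near-maximal entropy (log of the sequence length) to overload-like, uniform attention.

```python
import numpy as np

def attention_entropy(attn, eps=1e-12):
    """Mean Shannon entropy (nats) per attention head.

    attn: array of shape (heads, seq_len, seq_len) whose last axis
    holds softmax-normalized attention distributions. Interpretation
    (illustrative): values near 0 resemble entropy collapse; values
    near log(seq_len) resemble entropic overload (uniform attention).
    """
    p = np.clip(attn, eps, 1.0)                  # avoid log(0)
    row_entropy = -(p * np.log(p)).sum(axis=-1)  # (heads, seq_len)
    return row_entropy.mean(axis=-1)             # (heads,)

# Toy example: one sharply peaked head vs. one uniform head.
seq = 8
peaked = np.full((seq, seq), 1e-6)
np.fill_diagonal(peaked, 1.0)
peaked /= peaked.sum(axis=-1, keepdims=True)     # renormalize rows
uniform = np.full((seq, seq), 1.0 / seq)
h = attention_entropy(np.stack([peaked, uniform]))
# h[0] is near 0; h[1] is near log(8) ≈ 2.08
```

In practice such per-head statistics would be gathered over a batch of real attention maps layer by layer; the toy distributions here just mark the two extremes of the entropy range.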
DeepReDuce is a set of optimizations for the judicious removal of ReLUs to reduce private inference latency, leveraging the heterogeneity of ReLUs in classical networks. DeepReDuce strategically drops ReLUs, reducing their count by up to 4.9× (on CIFAR-100) and 5.7× (on TinyImageNet) for ResNet18 with no loss in accuracy. Compared to the state of the art for private inference, DeepReDuce improves accuracy by up to 3.5% (iso-ReLU) and reduces ReLU count by up to 3.5× (iso-accuracy).