Hello and welcome to my occasionally updated academic website.
Over the last few years, the main focus of my research has been to understand exactly why large deep neural networks generalize to unseen data. This is puzzling because experiments show that these networks have enough capacity to fit random datasets (that is, they can “memorize” datasets), so a natural question is why they do not simply memorize their training set. I think we now have a good answer to this mystery. Please see this page for an overview.
I am currently the Chief Science Advisor at Skovinen and a Technical Advisor at Parameter Ventures. Previously, I worked at Google Research, where, in addition to the research described above, I initiated several projects at the intersection of machine learning and chip design, and founded and built out a cross-functional team to develop new product concepts as part of the Kernel New Product Innovation program.
Before Google, I was a Senior Vice President at Two Sigma, a leading quantitative investment manager, where I founded one of the first successful deep-learning-based alpha research groups on Wall Street and led a team that built one of the earliest end-to-end FPGA-based systems for general-purpose, ultra-low-latency trading. Prior to that, I was a Research Scientist at Intel, where I worked on microarchitectural performance analysis and formal verification for on-chip networks.
I did my undergraduate studies at IIT Bombay and got my PhD from UC Berkeley, working with Prof. Robert Brayton and Dr. Alan Mishchenko on logic synthesis and formal verification.