Sanae Lotfi

PhD student at New York University

I am a PhD student at NYU advised by Professor Andrew Gordon Wilson. I work on the foundations of deep learning and I am currently interested in large language and diffusion models. My goal is to understand and quantify generalization in deep learning, and use this understanding to build more robust and reliable models.

My PhD research has been recognized with an ICML Outstanding Paper Award and is generously supported by the Microsoft Research PhD Fellowship, the DeepMind Fellowship, and the Meta AI Mentorship Program. I was recently named a Rising Star in Machine Learning by the University of Maryland Center for Machine Learning.

Prior to NYU, I obtained a master’s degree in applied mathematics from Polytechnique Montreal, where I was fortunate to work with Professors Andrea Lodi and Dominique Orban on designing stochastic first- and second-order algorithms with compelling theoretical and empirical properties for machine learning and large-scale optimization. This work received the Best Master’s Thesis Award. I also hold a master’s degree in general engineering and applied mathematics from CentraleSupélec.

In 2022-2023, I was a Visiting Researcher in the Fundamental AI Research (FAIR) group at Meta AI, where I was fortunate to work with Brandon Amos. In summer 2022, I was an Applied Scientist Intern at Amazon, working with Bernie Wang and Richard Kurle.


You can contact me at sl8160@nyu.edu.

CV, Google Scholar, LinkedIn, Twitter, Github


Publications

Non-Vacuous Generalization Bounds for Large Language Models
Sanae Lotfi*, Marc Finzi*, Yilun Kuang*, Tim G. J. Rudner, Micah Goldblum, Andrew Gordon Wilson
International Conference on Machine Learning (ICML), 2024

Mitigating Augmentation Bias with Input-Dependent Distributions over Augmentations
Sanae Lotfi, Tim G. J. Rudner, Brandon Amos, Andrew Gordon Wilson
Under review.

Bayesian Model Selection, the Marginal Likelihood, and Generalization (Extended Paper)
Sanae Lotfi, Pavel Izmailov, Gregory Benton, Micah Goldblum, Andrew Gordon Wilson
Journal of Machine Learning Research (JMLR), 2023
Best Papers Track
[arxiv, code]

PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization
Sanae Lotfi*, Marc Finzi*, Sanyam Kapoor*, Andres Potapczynski*, Micah Goldblum, Andrew Gordon Wilson
Neural Information Processing Systems (NeurIPS), 2022
[arxiv, code]

Bayesian Model Selection, the Marginal Likelihood, and Generalization
Sanae Lotfi, Pavel Izmailov, Gregory Benton, Micah Goldblum, Andrew Gordon Wilson
International Conference on Machine Learning (ICML), 2022
Long oral presentation, top 2% submissions
Outstanding Paper Award
[arxiv, code, poster, talk, slides]

Dangers of Bayesian Model Averaging under Covariate Shift
Pavel Izmailov, Patrick Nicholson, Sanae Lotfi, Andrew Gordon Wilson
Neural Information Processing Systems (NeurIPS), 2021
[arxiv, code, poster]

Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling
Gregory W. Benton, Wesley J. Maddox, Sanae Lotfi, Andrew Gordon Wilson
International Conference on Machine Learning (ICML), 2021
Spotlight presentation
[arxiv, code, slides]

Evaluating Approximate Inference in Bayesian Deep Learning
Andrew Gordon Wilson, Sanae Lotfi, Sharad Vikram, Matthew D. Hoffman, Yarin Gal, Yingzhen Li, Melanie F. Pradier, Andrew Foong, Sebastian Farquhar, Pavel Izmailov
NeurIPS Competition and Demonstration Track, Proceedings of Machine Learning Research (PMLR), 2021
[pmlr, code, website]

Adaptive First- and Second-Order Algorithms for Large-Scale Machine Learning
Sanae Lotfi, Tiphaine Bonniot de Ruisselet, Dominique Orban, Andrea Lodi
Annual Conference on Machine Learning, Optimization, and Data Science (LOD)
Oral presentation
[arxiv]

Stochastic Damped L-BFGS with Controlled Norm of the Hessian Approximation
Sanae Lotfi, Tiphaine B. de Ruisselet, Dominique Orban, Andrea Lodi
SIAM Conference on Optimization, 2021
Oral presentation
NeurIPS Optimization for Machine Learning Workshop, 2020
Spotlight presentation
[arxiv]

Stochastic First and Second Order Optimization Methods for Machine Learning
Sanae Lotfi
Master’s Thesis, 2020
Best Thesis Award in Applied Mathematics at Polytechnique Montreal
Polytechnique Montreal

* denotes equal contribution.



Selected Talks

Are the Marginal Likelihood and PAC-Bayes Bounds the Right Proxies for Generalization?

Non-Vacuous Generalization Bounds for Large Language Models

Bayesian Model Selection, the Marginal Likelihood, and Generalization

Robustness of Deep Learning Models to Distribution Shift

Adaptive First- and Second-Order Algorithms for Large-Scale Machine Learning


Surveys

Understanding the Generalization of Deep Neural Networks through PAC-Bayes Bounds
Andres Potapczynski, Sanae Lotfi, Anthony Chen, Chris Ick
Mathematics of Deep Learning, CS-GA 3033, Spring 2022

Causal Representation Learning
Sanae Lotfi, Taro Makino, Lily Zhang
Inference and Representation, DS-GA 1005, Fall 2021