Michelle Lynn Gill

Michelle Lynn Gill, Ph.D.

I am currently an R&D Manager and the Tech Lead for Clara Discovery at NVIDIA. I lead a growing team of researchers and engineers who use data, deep learning, and HPC to develop tools that accelerate drug discovery.

Previously, I was a Senior Machine Learning Engineer and Data Scientist at BenevolentAI and a member of Arthur Palmer's research group, where I developed and applied nuclear magnetic resonance (NMR) spin relaxation experiments to understand how enzyme dynamics are critical to biological function.

This website serves as my professional CV and has been formatted to export as a PDF. Safari or Firefox is required to preserve hyperlinks.


Ph.D., Molecular Biophysics & Biochemistry, 2003–2006
Yale University, New Haven, CT
Thesis: Development of ²⁰⁵Tl NMR methods for the direct study of monovalent metal ions and ligands in nucleic acids THESIS DEFENSE
Advisors: J. Patrick Loria and Scott Strobel

M.Phil., Molecular Biophysics & Biochemistry, 2001–2003
Yale University, New Haven, CT

B.S., Biochemistry, 1997-2001
Highest Distinction and Honors (Summa Cum Laude)
University of Kansas, Lawrence, KS


Research & Development Manager, 2022-Present
Tech Lead, BioNeMo, 2019-Present
Senior AI and Deep Learning Scientist, Cheminformatics and Proteomics, 2019-Present
Senior Deep Learning Consultant, 2017-2018
I serve as the R&D Manager and Tech Lead for Clara Discovery, NVIDIA's platform for accelerating the drug discovery process through deep learning, molecular dynamics, and HPC. This team develops BioNeMo, which provides researchers with the ability to pre-train and fine-tune large language models for cheminformatics and proteomics tasks. My responsibilities include managing applied research, product cycle planning, and external collaborations with researchers in pharma, biotech, and academia.

Earlier work focused on proteomics, including development of deep learning models to predict peptide spectral matches (PSMs) in proteomics sequencing with >95% F1. I also led a team that used GCNNs to predict molecular properties and finished 33rd in a Kaggle competition.

As a deep learning consultant, I assisted clients in the pharmaceutical and materials science space in utilizing deep learning for strategic advantage. I helped develop proof-of-concept deep learning experiments and pipelines to validate the approach and to identify the technology stack and engineering architecture for solution deployment.

Senior Machine Learning Engineer, 2019
Senior Data Scientist, 2018-2019
Utilized scientific and machine learning methods to assist with all stages of the drug discovery process, from target identification to chemistry validation. Select focuses of my work have included using matrix factorization and graph convolutional neural networks (GCNNs) to determine the importance of drug mechanisms in knowledge graphs, and deep learning (3D CNNs) and cheminformatics methods to predict ligand pose and affinity within a target.

Senior Data Scientist, 2016-2017
Designed and created Spark machine learning and NLP curriculum using self-made Docker containers. Conducted corporate trainings focused on Python and Spark. Developed a 12-week machine learning course for an F100 company. Co-instructed 12-week data science bootcamps. Developed and conducted a take-home coding exercise to assist with interview preparation.

Scientist, 2014-2016
National Cancer Institute, National Institutes of Health
Developed parallelized, compressed sensing methods for reconstruction of non-uniformly sampled NMR data.

Postdoctoral Research Fellow, 2008-2014
Columbia University, Department of Biochemistry and Molecular Biophysics
Part of a collaboration that demonstrated conformational selection is critical in the highly concerted mechanism of the DNA repair enzyme AlkB. Developed multiple quantum NMR spin relaxation experiments for quantifying the slow timescale (microsecond to millisecond) motions of methyl sidechains.
Advisor: Professor Arthur G. Palmer, III

Consultant, 2006–2007
The Boston Consulting Group
Worked with clients in the finance and pharmaceutical sectors to streamline organizational structure and identify novel investment opportunities. I was part of the case team that won the 2007 Global BCG Strategy Olympics for our work with a pharmaceutical client.


Exploring Molecular Space and Accelerating Drug Discovery on the GPU with Clara Discovery
Michelle Gill
Gates Foundation Grand Challenges: Applications of Artificial Intelligence in Machine Learning, Invited Talk, November 10, 2021, Virtual

Accelerating Drug Discovery with Clara Discovery and MegaMolBART SLIDES
Michelle Gill
4th RSC-BMCS / RSC-CICAG Artificial Intelligence in Chemistry, Invited Talk, September 27, 2021, Virtual

Accelerating Drug Discovery with Clara Discovery's MegaMolBART VIDEO
Michelle Gill
Demo, April 12, 2021

Real Time, GPU-Accelerated Analysis and Visualization in the Life Sciences SPEAKERS ABSTRACT SLIDES
Michelle Gill and Avantika Lal
Ken Kennedy Institute Data Science Conference, Invited Keynote, October 26-27, 2020, Virtual

Artificial intelligence driven drug discovery SLIDES
NYC R Conference, Invited Presentation, May 10, 2019, New York, NY

Panel: Careers in data science PROGRAM
Tri-Institutional Career Symposium, April 9, 2019, New York, NY
Memorial Sloan Kettering Cancer Center, The Rockefeller University and Weill Cornell Medicine

Machine learning for target identification and lead optimization in drug discovery ABSTRACT
Alix Lacoste and Michelle Gill
New York Area Group for Informatics and Modeling, Invited Presentation, February 26, 2019, New York, NY

Accelerating the journey from data to medicine ABSTRACT
Amir Saffari, Dan Neil, Alix Lacoste, and Michelle Gill
NeurIPS, Expo Talk, 2018, Montreal, Canada

Artificial intelligence as a catalyst for scientific discovery ABSTRACT
JupyterCon, Invited Keynote, 2018, New York, NY

From structural biology to AI: a holistic approach to studying molecular machines SLIDES
Brookhaven National Laboratory, Invited Presentation, 2018, Upton, NY

Efficient image search and identification: the making of Wine-O.AI SLIDES VIDEO CODE
SciPy Conference, Selected Presentation, 2017, Austin, TX

Learning from text: natural language processing with python SLIDES CODE
ODSC East, Tutorial, 2017, Boston, MA


COVID-19 Biohackathon WEBSITE CODE
April 5-11, 2020
A biohackathon focused on developing tools for COVID-19 research and drug discovery. I was part of a team that developed annotations for 3D protein structures. My contribution was an automated method for calculating and visualizing solvent accessibility on families of protein structures.

themodernscientist WEBSITE CODE
My personal blog discusses data visualization, optimization, automation, and various other computationally related interests. Posts frequently incorporate Jupyter notebooks and shell scripts. The blog itself is created using Pelican, a Python-based static blogging engine, and the website is hosted on GitHub.

Developed and currently maintain a program, called NESTA-NMR, that enables the acquisition of experimental data to be completed up to 100X faster. This project uses a compressed sensing algorithm, called NESTA. The program is written in C, highly parallelized, and best-in-class for speed and optimization accuracy. Associated scientific (GSL and FFTW) and parallelization (pthreads) libraries were used. I also built the website using Flask and created the documentation using Sphinx.

Honors & Awards


Judge, Preliminary and Final Rounds
NYC STEM Fair, Feb 18 - March 27, 2022
Evaluated preliminary and final round submissions in the biochemistry track.

Program Chair
PyData NYC, 2018
Responsible for conference content, including recruitment of proposal reviewers, solicitation and aggregation of proposal feedback, selection of presentations, notification of selected speakers, and scheduling of talks.

Machine Learning Symposium Co-Chair
SciPy Conference, 2018
Co-chair of the machine learning / deep learning symposium at SciPy 2018. Responsible for recruiting reviewers, reviewing proposals, selecting talks, and running the symposium.

Proposal Reviewer
JupyterCon, 2018
Responsible for reviewing and scoring proposal submissions.

Journal of Open Source Software (JOSS), 2018–2019
Reviewed submissions to JOSS within my area of expertise as needed.


NIH Postdoctoral Research Fellowship (F32 GM089047), 2009–2012
Global BCG Strategy Olympics, Winning Team, 2007
NSF Graduate Research Fellowship, 2002–2006
Outstanding Graduate Teaching Assistant, 2003
Barry M. Goldwater Scholar, 2000–2001
Outstanding Undergraduate Honors Research Thesis, 2001
Kansas Board of Regents Full Tuition Merit Scholarship, 1997–2001


Deep Learning at NVIDIA (with Michelle Gill) AUDIO
DataFramed, DataCamp, Podcast Interview, 2018
Interview about my deep learning work at NVIDIA

How to Keep Your Job Regardless of AI ARTICLE
International Business Times, 2017
Profile of my transition from biophysicist to data scientist


Sevgen, E., Moller, J., Lange, A., Parker, J., Quigley, S., Mayer, J., Srivastava, P., Gayatri, S., Hosfield, D., Korshunova, M., Livne, M., Gill, M.L., Ranganathan, R., Costa, A.B., Ferguson, A.L. DOI PDF
ProT-VAE: Protein transformer variational autoencoder for functional protein design
arXiv, 2023

Reidenbach, D., Livne, M., Illango, R.K., Gill, M.L., Israeli, J.I. DOI PDF
Improving small molecule generation using mutual information machine
arXiv, 2022

Gill, M.L. DOI
The rise of the machines in chemistry, Invited Prospectus
Magnetic Resonance in Chemistry, 2022, 60, 1044-1051

Strauss, M.T., Bludau, I., Zeng, W.F., Voytik, E., Ammar, C., Schessner, J., Illango, R., Gill, M.L., Meier, F., Willems, S., Mann, M. DOI PDF
AlphaPept, a modern and open framework for MS-based proteomics
bioRxiv, 2021

Gill, M.L., Hsu, A., Palmer, A.G. ChemRxiv DOI PDF
Detection of chemical exchange in methyl groups of macromolecules
Journal of Biomolecular NMR, 2019, 73, 443-450
Poster presented at 57th Experimental Nuclear Magnetic Resonance Conference, 2016, Pittsburgh, PA POSTER

Tong, M., Pelton, J., Gill, M.L., Zhang, W., Picart, F., Seeliger, M. DOI PDF
Survey of solution dynamics in Src kinase reveals cross talk between the ligand binding and regulatory sites
Nature Communications, 2017, 8, 2160

Gill, M.L., Byrd, R.A., Palmer, A.G. DOI PDF
Dynamics of GCN4 facilitate DNA interaction: a model-free analysis of an intrinsically disordered region
Physical Chemistry Chemical Physics, 2016, 18, 5839–5849
Posters presented at International Conference of Magnetic Resonance in Biological Sciences, 2014, Dallas, TX POSTER
and 57th Experimental Nuclear Magnetic Resonance Conference, 2016, Pittsburgh, PA POSTER

Gill, M.L., Sun, S., Li, Y., Byrd, R.A.
NESTA-NMR: efficient and quantitative processing of multidimensional NUS NMR data
Poster presented at 57th Experimental Nuclear Magnetic Resonance Conference, 2016, Pittsburgh, PA POSTER

*Sun, S., *Gill, M.L., Li, Y., Huang, M., Byrd, R.A. DOI PDF WEBSITE
Efficient and generalized processing of multidimensional NUS NMR Data: the NESTA algorithm and comparison of regularization terms
Journal of Biomolecular NMR, 2015, 62, 105–117
* Authors contributed equally
Poster presented at 56th Experimental Nuclear Magnetic Resonance Conference, 2015, Monterey, CA POSTER

Gill, M.L., Byrd, R.A. DOI PDF
Dynamic activation of apoptosis: conformational ensembles of cIAP1 are linked to a spring-loaded mechanism
Nature Structural & Molecular Biology, 2014, 21, 1022–1023

Gill, M.L., Palmer, A.G. DOI PDF
Local isotropic diffusion approximation for coupled internal and overall molecular motions in NMR spin relaxation
Journal of Physical Chemistry, Series B, 2014, 118, 11120–11128

Ergel, B., Gill, M.L., Brown, L., Yu, B., Palmer, A.G., Hunt, J.F. DOI PDF
Protein dynamics control the progression and efficiency of the catalytic reaction cycle of AlkB
Journal of Biological Chemistry, 2014, 289, 29584–29601
Poster presented at International Conference of Magnetic Resonance in Biological Sciences, 2012, Lyon, France POSTER

Gill, M.L. and Palmer, A.G. DOI PDF
Multiplet-filtered and gradient-selected zero-quantum TROSY experiments for ¹³C¹H₃ methyl groups in proteins
Journal of Biomolecular NMR, 2011, 51, 245–251
Poster presented at 52nd Experimental Nuclear Magnetic Resonance Conference, 2011, Monterey, CA POSTER

Ramsey, J.D., Gill, M.L., Kamerzell, T.J., Price, E.S., Joshi, S.B., Bishop, S.M., Oliver, C.N., Middaugh, C.R. DOI PDF
Using empirical phase diagrams to understand the role of intramolecular dynamics in immunoglobulin G stability
Journal of Pharmaceutical Sciences, 2009, 98, 2432–2447

Gill, M.L., Strobel, S.A., and Loria, J.P. DOI PDF
Crystallization and characterization of the thallium form of the Oxytricha nova G-quadruplex
Nucleic Acids Research, 2006, 34, 4506–4514

Gill, M.L., Strobel, S.A., and Loria, J.P. DOI PDF
²⁰⁵Tl NMR methods for the study of monovalent metal binding sites in nucleic acids
Journal of the American Chemical Society, 2005, 127, 16723–16732
Poster and selected oral presentation at 46th Experimental Nuclear Magnetic Resonance Conference, 2005, Providence, RI

Beach, H., Cole, R., Gill, M.L., and Loria, J.P. DOI PDF
Conservation of µs-ms enzyme motions in the apo- and substrate-mimicked state
Journal of the American Chemical Society, 2005, 127, 9167–9176

Adams, P.L., Stahley, M.R., Gill, M.L., Kosek, A.B., Wang, J., and Strobel, S.A. DOI PDF
Crystal structure of a group I intron splicing intermediate
RNA, 2004, 12, 1867–1887

Wiethoff, C.M., Gill, M.L., Koe, G.S., Koe, J.G., and Middaugh, C.R. DOI PDF
A fluorescence study of the structure and accessibility of plasmid DNA condensed with cationic gene delivery vehicles
Journal of Pharmaceutical Sciences, 2003, 92, 1272–1285

Wiethoff, C.M., Gill, M.L., Koe, G.S., Koe, J.G., and Middaugh, C.R. DOI PDF
The structural organization of cationic lipid-DNA complexes
Journal of Biological Chemistry, 2002, 277, 44980–44987

Silchenko, S., *Sippel, M.L., Kuchment, O., Benson, D.R., Mauk, A.G., Altuve, A., and Rivera, M. DOI PDF
Hemin is kinetically trapped in cytochrome b5 from rat outer mitochondrial membrane
Biochemical and Biophysical Research Communications, 2000, 273, 467–472
* M.L. Gill is formerly M.L. Sippel

Contact Me