I am currently a Senior Data Scientist at BenevolentAI. I use data and machine learning to develop tools that facilitate all stages of the drug discovery process.

Previously I was a Senior Deep Learning Consultant at NVIDIA and a member of Arthur Palmer's research group where I developed and applied nuclear magnetic resonance (NMR) spin relaxation experiments to understand how enzyme dynamics are critical to biological function.

Ph.D., Molecular Biophysics & Biochemistry, 2003–2006
Yale University, New Haven, CT
Thesis: Development of 205Tl NMR methods for the direct study of monovalent metal ions and ligands in nucleic acids PDF SLIDES
Advisors: J. Patrick Loria and Scott Strobel

M.Phil., Molecular Biophysics & Biochemistry, 2001–2003
Yale University, New Haven, CT

B.S., Biochemistry, 1997–2001
Highest Distinction and Honors (Summa Cum Laude)
University of Kansas, Lawrence, KS


Senior Data Scientist, 2018–Present
BenevolentAI
Use scientific and machine learning methods to assist with all stages of the drug discovery process, from target identification to chemistry validation. Select focuses of my work have included using matrix factorization and graph convolutional neural networks (GCNNs) to determine the importance of drug mechanisms in knowledge graphs, and using deep learning (3D CNNs) and chemoinformatics methods to predict ligand pose and affinity within a target.

Senior Deep Learning Consultant, 2017–2018
NVIDIA
Assisted clients in the pharmaceutical and materials science space in utilizing deep learning for strategic advantage. I helped develop proof-of-concept deep learning experiments and pipelines to validate the approach and to identify the technology stack and engineering architecture for solution deployment.

Senior Data Scientist, 2016–2017
Designed and created a Spark machine learning and NLP curriculum using self-made Docker containers. Conducted corporate trainings focused on Python and Spark. Developed a 12-week machine learning course for a Fortune 100 company. Co-instructed 12-week data science bootcamps. Developed and conducted a take-home coding exercise to assist with interview preparation.

Scientist, 2014–2016
National Cancer Institute, National Institutes of Health
Developed parallelized, compressed sensing methods for reconstruction of non-uniformly sampled NMR data.

Postdoctoral Research Fellow, 2008–2014
Columbia University, Department of Biochemistry and Molecular Biophysics
Part of a collaboration that demonstrated that conformational selection is critical in the highly concerted mechanism of the DNA repair enzyme AlkB. Developed multiple-quantum NMR spin relaxation experiments for quantifying the slow timescale (microsecond to millisecond) motions of methyl side chains.
Advisor: Professor Arthur G. Palmer, III

Consultant, 2006–2007
The Boston Consulting Group
Worked with clients in the finance and pharmaceutical sectors to streamline organizational structure and identify novel investment opportunities. I was part of the case team that won the 2007 Global BCG Strategy Olympics for our work with a pharmaceutical client.


Forthcoming: Artificial intelligence driven drug discovery SCHEDULE
NYC R Conference, Invited Presentation, May 10, 2019, New York, New York

Panel: Careers in data science PROGRAM
Tri-Institutional Career Symposium, April 9, 2019, New York, New York
Memorial Sloan Kettering Cancer Center, The Rockefeller University and Weill Cornell Medicine

Machine learning for target identification and lead optimization in drug discovery ABSTRACT
Alix Lacoste and Michelle Gill
New York Area Group for Informatics and Modeling, Invited Presentation, February 26, 2019, New York, New York

Accelerating the journey from data to medicine ABSTRACT
Amir Saffari, Dan Neil, Alix Lacoste, and Michelle Gill
NeurIPS, Expo Talk, 2018, Montreal, Canada

Artificial intelligence as a catalyst for scientific discovery SLIDES VIDEO ABSTRACT
JupyterCon, Invited Keynote, 2018, New York, NY

From structural biology to AI: a holistic approach to studying molecular machines SLIDES
Brookhaven National Laboratory, Invited Presentation, 2018, Upton, NY

Efficient image search and identification: the making of Wine-O.AI SLIDES VIDEO CODE
SciPy Conference, Selected Presentation, 2017, Austin, TX

Learning from text: natural language processing with Python SLIDES CODE
ODSC East, Tutorial, 2017, Boston, MA


themodernscientist WEBSITE CODE
My personal blog discusses data visualization, optimization, automation, and various other computational interests. Posts frequently incorporate Jupyter notebooks and shell scripts. The blog itself is created using Pelican, a Python-based static blogging engine, and the website is hosted on GitHub.

Developed and currently maintain NESTA-NMR, a program that enables the acquisition of experimental data to be completed up to 100X faster. The project uses a compressed sensing algorithm called NESTA. The program is written in C, is highly parallelized, and is best-in-class for speed and optimization accuracy. It relies on scientific libraries (GSL and FFTW) and a parallelization library (pthreads). I also built the website using Flask and created the documentation using Sphinx.

A library for performing least squares regression. It attempts to seamlessly incorporate this task in a Pandas-focused workflow. Input data are expected in dataframes, and multiple regressions can be performed using functionality similar to Pandas groupby.
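The workflow this library targets can be illustrated with plain Pandas and NumPy. The sketch below is not the library's API: the `fit_line` helper, the column names, and the data are hypothetical, and `numpy.polyfit` stands in for the library's own fitting routine. It performs one straight-line least squares fit per group and collects the parameters in a dataframe.

```python
import numpy as np
import pandas as pd


def fit_line(group: pd.DataFrame) -> pd.Series:
    """Least squares fit of y = slope * x + intercept for one group."""
    # For deg=1, polyfit returns the coefficients as [slope, intercept].
    slope, intercept = np.polyfit(group["x"], group["y"], deg=1)
    return pd.Series({"slope": slope, "intercept": intercept})


# Hypothetical data: two samples, each lying on a different straight line.
data = pd.DataFrame({
    "sample": ["a"] * 4 + ["b"] * 4,
    "x": [0.0, 1.0, 2.0, 3.0] * 2,
    "y": [1.0, 3.0, 5.0, 7.0, 0.0, -1.0, -2.0, -3.0],
})

# One regression per group; rows are group labels, columns are fit parameters.
fits = pd.DataFrame({
    name: fit_line(group) for name, group in data.groupby("sample")
}).T

print(fits)
```

Iterating over the groupby object keeps each fit independent, so the same pattern extends to other models by swapping out the body of `fit_line`.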

MFOutParser CODE
A Python library developed during my postdoctoral research that parses a particularly challenging text format and performs preliminary analysis on the resulting data using Pandas.

Deep Learning at NVIDIA (with Michelle Gill) AUDIO
DataFramed, DataCamp, Podcast Interview, 2018
Interview about my deep learning work at NVIDIA

How to Keep Your Job Regardless of AI ARTICLE
International Business Times, 2017
Profile of my transition from biophysicist to data scientist


Program Chair
PyData NYC, 2018
Responsible for conference content, including recruitment of proposal reviewers, solicitation and aggregation of proposal feedback, selection of presentations, notification of selected speakers, and scheduling of talks.

Machine Learning Symposium Co-Chair
SciPy Conference, 2018
Co-chair of the machine learning / deep learning symposium at SciPy 2018. Responsible for recruiting reviewers, reviewing proposals, selecting talks, and running the symposium.

Proposal Reviewer
JupyterCon, 2018
Responsible for reviewing and scoring proposal submissions.

Journal of Open Source Software (JOSS), 2018
Review submissions to JOSS within my area of expertise as needed.


NIH Postdoctoral Research Fellowship (F32 GM089047), 2009–2012
Global BCG Strategy Olympics, Winning Team, 2007
NSF Graduate Research Fellowship, 2002–2006
Outstanding Graduate Teaching Assistant, 2003
Barry M. Goldwater Scholar, 2000–2001
Outstanding Undergraduate Honors Research Thesis, 2001
Kansas Board of Regents Full Tuition Merit Scholarship, 1997–2001


Gill, M.L., Hsu, A., Palmer, A.G. ChemRxiv PDF
Detection of chemical exchange in methyl groups of macromolecules
Journal of Biomolecular NMR, 2019, in press
Poster presented at 57th Experimental Nuclear Magnetic Resonance Conference, 2016, Pittsburgh, PA POSTER

Tong, M., Pelton, J., Gill, M.L., Zhang, W., Picart, F., Seeliger, M. DOI PDF
Survey of solution dynamics in Src kinase reveals cross talk between the ligand binding and regulatory sites
Nature Communications, 2017, 8, 2160

Gill, M.L., Byrd, R.A., Palmer, A.G. DOI PDF
Dynamics of GCN4 facilitate DNA interaction: a model-free analysis of an intrinsically disordered region
Physical Chemistry Chemical Physics, 2016, 18, 5839–5849
Posters presented at International Conference of Magnetic Resonance in Biological Sciences, 2014, Dallas, TX POSTER
and 57th Experimental Nuclear Magnetic Resonance Conference, 2016, Pittsburgh, PA POSTER

Gill, M.L., Sun, S., Li, Y., Byrd, R.A.
NESTA-NMR: efficient and quantitative processing of multidimensional NUS NMR data
Poster presented at 57th Experimental Nuclear Magnetic Resonance Conference, 2016, Pittsburgh, PA POSTER

*Sun, S., *Gill, M.L., Li, Y., Huang, M., Byrd, R.A. DOI PDF WEBSITE
Efficient and generalized processing of multidimensional NUS NMR Data: the NESTA algorithm and comparison of regularization terms
Journal of Biomolecular NMR, 2015, 62, 105–117
* Authors contributed equally
Poster presented at 56th Experimental Nuclear Magnetic Resonance Conference, 2015, Monterey, CA POSTER

Gill, M.L., Byrd, R.A. DOI PDF
Dynamic activation of apoptosis: conformational ensembles of cIAP1 are linked to a spring-loaded mechanism
Nature Structural & Molecular Biology, 2014, 21, 1022–1023

Gill, M.L., Palmer, A.G. DOI PDF
Local isotropic diffusion approximation for coupled internal and overall molecular motions in NMR spin relaxation
Journal of Physical Chemistry B, 2014, 118, 11120–11128

Ergel, B., Gill, M.L., Brown, L., Yu, B., Palmer, A.G., Hunt, J.F. DOI PDF
Protein dynamics control the progression and efficiency of the catalytic reaction cycle of AlkB
Journal of Biological Chemistry, 2014, 289, 29584–29601
Poster presented at International Conference of Magnetic Resonance in Biological Sciences, 2012, Lyon, France POSTER

Gill, M.L. and Palmer, A.G. DOI PDF
Multiplet-filtered and gradient-selected zero-quantum TROSY experiments for 13C1H3 methyl groups in proteins
Journal of Biomolecular NMR, 2011, 51, 245–251
Poster presented at 52nd Experimental Nuclear Magnetic Resonance Conference, 2011, Monterey, CA POSTER

Ramsey, J.D., Gill, M.L., Kamerzell, T.J., Price, E.S., Joshi, S.B., Bishop, S.M., Oliver, C.N., Middaugh, C.R. DOI PDF
Using empirical phase diagrams to understand the role of intramolecular dynamics in immunoglobulin G stability
Journal of Pharmaceutical Sciences, 2009, 98, 2432–2447

Gill, M.L., Strobel, S.A., and Loria, J.P. DOI PDF
Crystallization and characterization of the thallium form of the Oxytricha nova G-quadruplex
Nucleic Acids Research, 2006, 34, 4506–4514

Gill, M.L., Strobel, S.A., and Loria, J.P. DOI PDF
205Tl NMR methods for the study of monovalent metal binding sites in nucleic acids
Journal of the American Chemical Society, 2005, 127, 16723–16732
Poster and selected oral presentation at 46th Experimental Nuclear Magnetic Resonance Conference, 2005, Providence, RI

Beach, H., Cole, R., Gill, M.L., and Loria, J.P. DOI PDF
Conservation of µs-ms enzyme motions in the apo- and substrate-mimicked state
Journal of the American Chemical Society, 2005, 127, 9167–9176

Adams, P.L., Stahley, M.R., Gill, M.L., Kosek, A.B., Wang, J., and Strobel, S.A. DOI PDF
Crystal structure of a group I intron splicing intermediate
RNA, 2004, 10, 1867–1887

Wiethoff, C.M., Gill, M.L., Koe, G.S., Koe, J.G., and Middaugh, C.R. DOI PDF
A fluorescence study of the structure and accessibility of plasmid DNA condensed with cationic gene delivery vehicles
Journal of Pharmaceutical Sciences, 2003, 92, 1272–1285

Wiethoff, C.M., Gill, M.L., Koe, G.S., Koe, J.G., and Middaugh, C.R. DOI PDF
The structural organization of cationic lipid-DNA complexes
Journal of Biological Chemistry, 2002, 277, 44980–44987

Silchenko, S., *Sippel, M.L., Kuchment, O., Benson, D.R., Mauk, A.G., Altuve, A., and Rivera, M. DOI PDF
Hemin is kinetically trapped in cytochrome b5 from rat outer mitochondrial membrane
Biochemical and Biophysical Research Communications, 2000, 273, 467–472
* M.L. Gill was formerly M.L. Sippel

