AI/ML Scientist – Protein Foundation Models

Manifold Bio · Boston · 3w ago
MID · Research & Development
Market Rate: Biochemists and Biophysicists
25th percentile: $70K
Median: $107K
75th percentile: $144K

BLS 2024 data (national)

Description

<div class="content-intro"><p><strong>Manifold Bio builds AI models for protein therapeutic design, trained on proprietary experimental data generated at unprecedented scale.</strong> Our <em>in vivo</em>-centric discovery platform produces millions of experimentally validated protein designs per campaign, creating the datasets that make our models possible and our approach uniquely powerful. We combine high-throughput protein engineering with computational design to create antibody-like drugs and other biologics. Our world-class team of protein engineers, biologists, and computational scientists is working together to aim the platform at therapeutic opportunities where precise targeting is the key to overcoming clinical challenges.</p></div><p><strong>Position</strong></p> <p>Manifold's AI team is actively training protein foundation models on our proprietary experimental datasets. Our generative antibody design model, mBER, has already demonstrated controllable <em>de novo</em> binder design across multiple million-scale screening campaigns, and the team is now scaling foundation model capabilities to push well beyond current performance. We are looking for an AI/ML Scientist to join this effort. You will work alongside our existing model training team to accelerate the development of foundation models fine-tuned on Manifold's data, bringing additional depth in pretraining methodology, architecture development, and large-scale training. Your work will directly improve mBER's design capabilities and unlock new modeling paradigms for the broader team.
You'll own foundation model projects end-to-end, from architecture selection and training infrastructure to evaluation against real experimental outcomes, while contributing to the team's shared research agenda.</p> <p><strong>Responsibilities</strong></p> <ul> <li>Advance the team's ongoing foundation model training efforts: pretraining, fine-tuning, and evaluating folding, docking, language, and generative design models on Manifold's proprietary experimental data</li> <li>Bring depth in training methodology, architecture selection, and optimization to complement the existing team's expertise</li> <li>Develop and scale training pipelines for distributed, multi-GPU and multi-node training runs</li> <li>Integrate foundation model outputs into mBER to improve binder design success rates and enable new design capabilities</li> <li>Design and execute ML experiments with clear hypotheses, rigorous evaluation frameworks, and systematic analysis</li> <li>Establish best practices for mixed-precision training, gradient checkpointing, and computational efficiency at scale</li> <li>Produce clear documentation and analysis supporting architecture and training decisions</li> </ul> <p><strong>Required Qualifications</strong></p> <ul> <li>Demonstrated experience pretraining and/or fine-tuning protein foundation models (folding, docking, language models, or generative design) with published or otherwise demonstrable results</li> <li>Strong familiarity with the AlphaFold architecture and training methodology</li> <li>2+ years of hands-on experience with PyTorch and/or JAX for deep learning</li> <li>Experience with large-scale model training: distributed training, multi-GPU/multi-node setups, mixed precision, gradient checkpointing</li> <li>Solid understanding of deep learning architectures (transformers, attention mechanisms, diffusion/flow matching) and optimization techniques</li> <li>Experience working with protein structure data (PDB, mmCIF) and/or protein sequence datasets</li> 
<li>Strong statistical analysis and experimental design skills</li> <li>Proficiency in the Python scientific computing stack (NumPy, Pandas, scikit-learn)</li> <li>Self-directed researcher who can balance guidance with independence</li> <li>Excellent written and verbal communication skills for cross-functional collaboration</li> </ul> <p><strong>Preferred Qualifications</strong></p> <ul> <li>Experience with protein generative design methods (e.g., RFdiffusion, ProteinMPNN, flow matching approaches)</li> <li>Experience with protein language models (e.g., ESM family)</li> <li>Published research in computational biology, protein design, or structural biology</li> <li>Experience training on proprietary or domain-specific biological datasets</li> <li>Familiarity with Ray for distributed computing</li> <li>Experience with Kubernetes (EKS) and cloud computing platforms (AWS)</li> <li>Knowledge of protein engineering, directed evolution, or structural biology wet lab techniques</li> <li>Experience working with agentic AI coding tools for fast, parallelized execution of modeling experiments</li> <li>Previous biotech/pharma industry experience</li> </ul> <p><strong>This Role Might Be Perfect For You If</strong></p> <ul> <li>You have deep experience training protein foundation models and want to apply that expertise to some of the richest proprietary experimental datasets in the field</li> <li>You're excited about pushing beyond public model performance by leveraging unique, large-scale <em>in vivo</em> screening data</li> <li>You thrive in high-ownership roles where you can drive research direction while collaborating with a tight-knit, world-class team</li> <li>You want your models to directly impact real drug discovery programs</li> </ul> <p><strong>If you're excited to train the next generation of protein foundation models on uniquely powerful experimental data, please reach out to careers@manifold.bio.</strong></p><div class="content-conclusion"><p><em style="color: rgb(34, 
34, 34); font-family: Arial, Helvetica, sans-serif; font-weight: 400; letter-spacing: normal; text-align: start; white-space: normal; word-spacing: 0px; text-decoration: none;"><em><span style="font-weight: 400;">We value different experiences and ways of thinking and believe the most talented teams are built by bringing together people of diverse cultures, genders, and backgrounds.</span></em></em></p></div>
Manifold Bio


BIOTECHNOLOGY

Measurement-driven drug design

Location: MA - Boston
Open Jobs: 6
Oncology