Full-Time

Senior Machine Learning Scientist

Posted on 10/31/2025

Tahoe Therapeutics

11-50 employees

AI-powered in vivo drug discovery platform

No salary listed

Toronto, ON, Canada; San Bruno, CA, USA

Hybrid

Hybrid role requiring access to the South San Francisco, CA or Toronto, ON office; relocation to the Bay Area or Greater Toronto Area is encouraged.

Category
AI & Machine Learning
Requirements
  • PhD or equivalent practical experience in a technical field
  • A proven track record of developing and applying deep learning methods, including experience with modern architectures such as transformers, state-space models, graph neural networks or diffusion-based generative models
  • Proficiency with modern ML frameworks (e.g., PyTorch, JAX, or TensorFlow) and core scientific computing libraries (e.g., NumPy, SciPy, Pandas)
  • A genuine enthusiasm for applying cutting-edge ML research to real-world biological problems and a bias towards action
Responsibilities
  • Develop and apply machine learning techniques to build multi-modal foundation models that bridge the chemical and biological domains, i.e., models that integrate chemical structure, target protein sequence, and whole-transcriptome scRNA-seq
  • Stay at the forefront of ML and computational biology research and rapidly adapt state-of-the-art techniques to our problems and datasets
  • Collaborate with our team of biologists and engineers in cross-functional pods to test novel ML-driven hypotheses
Desired Qualifications
  • Prior experience with ML applied to problems in biology or chemistry
  • Familiarity with multimodal modeling, contrastive learning or self-supervised learning
  • Experience with large-scale distributed ML techniques (e.g., FSDP, TP, dMoE, flash attention)

Tahoe Therapeutics engineers AI-powered models of human cells to improve drug design, focusing on oncology and the RAS network. Its Mosaic platform generates in vivo, single-cell perturbation data by pooling cells from hundreds of diverse patient models into a single experiment, like a mosaic tumor in a mouse. The resulting Tahoe-100M dataset (100 million data points, 60,000 drug–cell interactions) trains AI models to identify novel drug targets and candidates. The company pursues internal drug development and strategic partnerships, including a 2026 joint venture with Alloy Therapeutics to advance antibody-drug conjugates and share larger datasets to accelerate discovery.

Company Size

11-50

Company Stage

Series A

Total Funding

$42M

Headquarters

San Francisco, California

Founded

2022

Simplify's Take

What believers are saying

  • Parse Biosciences GigaLab generates 300 million single-cell profiles, expanding foundational datasets.
  • January 2026 Alloy Therapeutics joint venture develops antibody-drug conjugates for difficult cancers.
  • $30M round, with backers including Mubadala Capital, funds a push toward one billion single-cell datapoints mapping one million drug-patient interactions.

What critics are saying

  • Parse Biosciences failure halts 300M-cell dataset, delaying AI training 12-24 months.
  • Alloy joint venture fails clinical translation, invalidating virtual cell predictions in 18-36 months.
  • Genentech and Roche outspend Tahoe, commoditizing in vivo data advantage in 18-36 months.

What makes Tahoe Therapeutics unique

  • Mosaic platform pools cells from hundreds of patient models into single mouse experiments for scalable in vivo data.
  • Tahoe-100M dataset maps 60,000 drug-cell interactions, 50 times larger than public perturbation data.
  • AI models trained on gigascale in vivo atlas uncover novel RAS network targets undetectable in vitro.

Benefits

Unlimited Paid Time Off

Health Insurance

Vision Insurance

Dental Insurance

Growth & Insights and Company News

Headcount

6 month growth

9%

1 year growth

0%

2 year growth

0%
General
Dec 12th, 2025
Tahoe Therapeutics Selects Parse Biosciences' GigaLab to Generate 300 Million Single Cell Profiles for Large-Scale Perturbation Atlas

Tahoe Therapeutics selects Parse Biosciences' GigaLab to generate 300 million single cell profiles for large-scale perturbation atlas.
(Business Wire India) Parse Biosciences, a leading provider of accessible and scalable single cell sequencing solutions, today announced that Tahoe Therapeutics has selected Parse's GigaLab to generate data for its upcoming 300M cell project. Tahoe will use its proprietary Mosaic technology to generate samples consisting of 300M cells from large arrays of disease models, genetically or chemically perturbed. Under this agreement, Parse will apply its Evercode(TM) chemistry and high-throughput automation to these samples to deliver the largest perturbation-focused single cell dataset ever produced.
Parse's GigaLab, designed specifically for million- to hundred-million-cell projects, has rapidly become an industry benchmark for large-scale, reproducible single cell data generation. The facility integrates high-capacity liquid handling, standardized workflows, and end-to-end QC to enable dataset sizes previously unattainable for most research organizations. The Tahoe project represents one of the largest sequencing initiatives undertaken at GigaLab to date and underscores the increasing industrial demand for massive single cell atlases capable of training modern AI systems.
"This collaboration demonstrates how scalable single cell technology can meet the demands of modern drug discovery," said Charlie Roco, PhD, Chief Technology Officer and Co-founder at Parse Biosciences. "By combining our GigaLab platform with Tahoe's perturbation engine, we are enabling a dataset that can power the next generation of AI models, changing how therapies are discovered."
The 300-million-cell project expands Tahoe Therapeutics' lead in building foundational perturbation datasets that capture how drugs, targets, and disease contexts interact across diverse biological systems. These high-dimensional datasets are made uniquely possible by Tahoe's Mosaic platform and form the next-generation foundational datasets that power virtual cell models, enabling more accurate predictions of therapeutic response, mechanism of action, and patient variability.
Today's drug discovery pipelines are constrained by datasets that lack drug discovery focused perturbations and biological diversity needed to train predictive AI systems. By contrast, Tahoe's foundational datasets will incorporate:
These features allow Tahoe to identify unexpected drug/cell interactions, uncover new biology, and explore therapeutic avenues that traditional in vitro systems often miss.
"Scaling single cell perturbation data is essential for building AI models that understand human biology," said Johnny Yu, CSO of Tahoe Therapeutics. "Leveraging Parse's GigaLab, we're able to perform single cell sequencing on samples generated from our Mosaic technology at unprecedented depth and diversity, moving us closer to scale of foundational datasets that power our virtual cell models, which can predict therapeutic outcomes across patients and diseases."
With this agreement, Parse and Tahoe together highlight a new reality in the life sciences: the next generation of therapeutic discovery requires data at a scale that only a small number of expert entities, equipped with highly automated, industrial-grade platforms, can produce. By delivering sample preparation and sequencing for this dataset through its GigaLab, Parse continues to set the standard for what is technically feasible in large-scale single cell biology, while enabling innovators like Tahoe to build AI systems rooted in far richer biological context than was previously possible.
About Tahoe Therapeutics
Tahoe Therapeutics is building AI-powered models of the human cell to design better drugs for more patients. Its technology platform generates large-scale, perturbative single-cell datasets that enable a new generation of biological foundation models. Based in South San Francisco, Tahoe was founded by a team of scientists and technologists advancing the frontiers of drug discovery, genomics, and machine learning. Learn more at tahoebio.ai.
About Parse Biosciences
Parse Biosciences is a global life sciences company whose mission is to accelerate progress in human health and scientific research. Empowering researchers to perform single cell sequencing with unprecedented scale and ease, its pioneering approach has enabled groundbreaking discoveries in cancer treatment, tissue repair, stem cell therapy, kidney and liver disease, brain development, and the immune system. With technology developed at the University of Washington by co-founders Alex Rosenberg and Charles Roco, Parse has raised over $100 million in capital and is used by approximately 3,000 customers across the world. Its growing portfolio of products includes Evercode(TM) Whole Transcriptome, Evercode(TM) TCR, Evercode(TM) BCR, Gene Select, and a solution for data analysis, Trailmaker(TM). Parse Biosciences is based in Seattle's vibrant South Lake Union district, where it recently expanded into a new headquarters and state-of-the-art laboratory.

WAYA Media
Aug 12th, 2025
Mubadala Capital Joins USD 30M Round for US-Based Tahoe Therapeutics

Tahoe Therapeutics raised USD 30M, with backers including Amplify Partners, Databricks Ventures, Mubadala Capital, and other major investors.

Business Wire
Aug 11th, 2025
Tahoe Therapeutics Raises $30M for AI Dataset

Tahoe Therapeutics has secured $30 million in funding to create the world's largest dataset for training AI models of human cells. The initiative aims to generate one billion single-cell datapoints and map one million drug-patient interactions, facilitating the discovery of new precision medicines for cancer and other diseases. Tahoe plans to partner with a single entity to share the data and expedite its application.

BiopharmaTrend
Feb 25th, 2025
Vevo Therapeutics Open-Sources Largest Single-Cell Dataset with Arc Institute

Vevo Therapeutics has officially released Tahoe-100M, described as the world's largest single-cell dataset, in collaboration with the Arc Institute.

BioSpace
Dec 5th, 2024
Vevo Therapeutics Partners with the Parse Biosciences GigaLab to Generate 100M Cell Atlas for AI Powered Drug Discovery

Vevo Therapeutics partners with the Parse Biosciences GigaLab to generate 100M cell atlas for AI powered drug discovery.
