Full-Time

Director/Senior Director of Data Engineering

Posted on 6/20/2025

Xaira Therapeutics

201-500 employees

AI-powered protein design for therapeutics

Compensation Overview

$227k - $295k/yr

+ Bonus + Equity

No H1B Sponsorship

San Bruno, CA, USA

In Person

Category
Data & Analytics
Required Skills
Data Science
Git
Docker
Requirements
  • Track record of building and leading bioinformatics engineering, data science, data engineering, or scientific computing teams in a biotech or pharmaceutical setting.
  • Demonstrated ability to work collaboratively in a multidisciplinary setting.
  • Strong oral and written communication skills.
  • Experience developing and implementing robust computational pipelines to enable the generation of biological insights.
  • Experience with large scale data engineering in the life sciences space.
  • Extensive experience in data integration, pipeline engineering, and managing complex scientific data ecosystems, including integrating and managing laboratory data using Benchling or similar systems.
  • Experience with data lake/data warehouse and metadata tools.
  • Experience using GitHub and Docker (or equivalents) for reproducible software development and deployment.
  • Understanding of data governance, reproducibility, and provenance practices.
Responsibilities
  • Mentor and lead a high-performing data engineering team, fostering collaboration, technical excellence, and continuous improvement.
  • Serve as a liaison between technical teams, stakeholders, and leadership, leading cross-functional collaborations.
  • Promote and embed best practices in data engineering and governance across the organization, bringing industrial development and engineering practices to biological software development.
  • Oversee design, development, and management of scalable scientific data infrastructure that spans from the lab to cloud-based data infrastructure and analytical applications.
  • Develop data strategy and lead integration efforts with laboratory information management systems (e.g., Benchling), ensuring efficient data capture and automated data accessioning.
  • Develop and lead efforts to convert internally and externally generated data into data repositories for training and validation of AI models.
  • Implement and maintain data governance standards, ensuring quality, interoperability, provenance, and accessibility.
  • Support AI, bioinformatics, and computational biology teams by building robust engineering platforms, including those deployed via Nextflow and other workflow systems.
  • Establish best practices for continuous integration and continuous delivery (CI/CD) to ensure reproducibility and consistency.
  • Collaborate closely with stakeholders to translate complex data into clear, actionable insights, including the creation and management of analytical dashboards and interactive applications.
Desired Qualifications
  • Experience with data cataloging, findability, and improving accessibility of information in complex organizations.
  • Knowledge of regulatory compliance and handling of sensitive data.
  • Familiarity with AWS and Terraform for cloud infrastructure management.
  • PhD in bioinformatics, computational biology, computer science, or a related field with 10+ years of relevant experience, or MS with 12+ years of experience.

Xaira Therapeutics uses artificial intelligence to transform drug discovery and development. It builds a pipeline of protein- and antibody-based therapeutics by combining generative AI models from the Institute for Protein Design (IPD), notably RFdiffusion and RFantibody, with large, multidimensional biological data. The company designs novel proteins and antibodies from scratch to target difficult, previously undruggable biology, then validates and evolves these molecules through iterative data generation that improves its biological foundation models. Unlike many entrants that offer only research services or platforms, Xaira aims to own and advance its own therapeutic pipeline across multiple modalities. The goal is to deliver effective new medicines faster and with a lower chance of failure by integrating AI-driven design, expansive data, and expert leadership.

Company Size

201-500

Company Stage

Late Stage VC

Total Funding

$1B

Headquarters

Brisbane, California

Founded

2023

Simplify's Take

What believers are saying

  • Rachel Lane closed Genentech and Sanofi partnerships at Belharra, now drives Xaira deals.
  • Focus on inflammatory diseases aligns with high-priority pharma therapeutic targets.
  • 73,075 sq ft San Francisco expansion by July 2025 supports rapid scaling.

What critics are saying

  • Isomorphic Labs' $600M Series A in March 2025 erodes Xaira's first-mover edge.
  • Xaira's $1B burn rate triggers liquidity crisis without Series B by mid-2027.
  • Public X-Atlas release lets Recursion reverse-engineer models within 12 months.

What makes Xaira Therapeutics unique

  • Xaira leverages RFdiffusion and RFantibody from David Baker's IPD for de novo protein design.
  • X-Cell model with 4.9 billion parameters predicts perturbations on 25.6 million transcriptomes.
  • X-Atlas/Pisces dataset is the largest public genome-wide CRISPRi Perturb-seq resource.

Benefits

Flexible Work Hours

Stock Options

Company Equity

Growth & Insights and Company News

Headcount

6 month growth

-1%

1 year growth

0%

2 year growth

5%
Business Wire
Mar 26th, 2026
Xaira Therapeutics appoints Rachel Lane as SVP to lead business development and AI drug discovery partnerships

Xaira Therapeutics has appointed Rachel Lane as Senior Vice President of Business Development and Operations. She will oversee business development strategy and drive partnerships integrating machine learning with therapeutic development. Lane brings over a decade of biotech leadership experience as an operator and investor. Most recently, she served as Chief Business Officer at Belharra Therapeutics, where she closed major platform partnerships with Genentech and Sanofi. She previously worked at Versant Ventures and holds a PhD in Molecular Genetics and Cell Biology. This month, Xaira launched X-Cell, its virtual cell model trained on 25.6 million perturbed single-cell transcriptomes. The 4.9 billion-parameter model predicts outcomes of genetic perturbations, marking progress towards transforming drug discovery into a predictive engineering discipline.

Endpoints News
Mar 18th, 2026
With its first AI model release, Xaira tests scaling laws in virtual cells

Biotech's best-funded AI startup detailed its first AI model on Tuesday, giving a glimpse at what it's been doing since launching out of stealth nearly...

Fierce Biotech
Mar 17th, 2026
Xaira COO reveals $1B AI biotech targets inflammatory and immunological diseases

Xaira Therapeutics' Chief Operating Officer Jeff Jonker has revealed the AI-driven biotech is focusing its research and development efforts on inflammatory and immunological science. The disclosure marks the first detailed insight into how the company is deploying its capital since raising $1 billion in 2024. The secretive biotech has remained largely quiet about its operations since the fundraise. Jonker's comments to Fierce Biotech provide the first public glimpse into Xaira's strategic priorities as it applies artificial intelligence to drug discovery in areas the pharmaceutical industry considers high-priority therapeutic targets.

Business Wire
Mar 17th, 2026
Xaira Therapeutics launches X-Cell virtual cell model trained on 25.6M single-cell transcriptomes

Xaira Therapeutics has launched X-Cell, its first virtual cell model, trained on X-Atlas/Pisces, the largest genome-wide CRISPRi Perturb-seq dataset ever reported. The dataset contains 25.6 million perturbed single-cell transcriptomes across seven cellular contexts, more than three times larger than Xaira's previous dataset. X-Cell uses a diffusion language model architecture with 4.9 billion parameters, making it the largest causal perturbation model to date. Unlike traditional autoregressive models, it iteratively refines predictions by progressively replacing control gene expression values with perturbed values, improving accuracy with each step. The model demonstrates state-of-the-art performance in predicting genetic perturbation outcomes, including unseen experiments across different cell types and laboratories. Xaira is making a subset of the dataset and model available to the scientific community.

Business Wire
Jul 9th, 2025
Xaira Therapeutics Announces the Appointment of Jeff Jonker as President and Chief Operating Officer

INACTIVE