Full-Time

Data Scientist

Knowledge Graphs

Mithrl

11-50 employees

Generative AI-powered bioinformatics analysis pipelines

Compensation Overview

$150k - $200k/yr

San Francisco, CA, USA

In Person

Category
Data & Analytics (2)
Requirements
  • Strong experience in data science, bioinformatics, computational biology, or a related field
  • Experience working with biological knowledgebases, public datasets, or ontology-driven systems
  • Familiarity with graph data structures, relationship modeling, and knowledge graph concepts
  • Experience harmonizing heterogeneous biological datasets and mapping variable IDs across sources
  • Proficiency in Python and scientific computing libraries
  • Ability to build ingestion pipelines for structured or semi-structured biological data
  • Strong understanding of metadata standards, biological ontologies, and domain logic
  • Ability to translate complex biological information into structured, machine-readable representations
  • Excellent communication skills and comfort collaborating across engineering and scientific teams
Responsibilities
  • Ingest, harmonize, and version high-value public biological datasets such as CellxGene, Gemma, ARCHS4, ENCODE, GTEx, and TCGA
  • Ingest well-maintained, peer-reviewed knowledgebases including OpenTargets, HPA, and similar resources
  • Build automated pipelines to curate and expand relationships inside the knowledge graph
  • Define and evolve schemas for node types, relationships, metadata rules, and ontology alignment
  • Harmonize variable IDs and metadata fields across all imported sources to create a unified knowledge layer
  • Build and maintain versioning, change tracking, and provenance systems for all data and relationships
  • Develop the framework that allows users to build custom knowledge graphs from the analyses they run inside Mithrl
  • Build features that allow users to explore, query, and interact with their graphs
  • Work closely with ML engineers, bioinformatics teams, and discovery application teams to ensure the knowledge graph supports downstream reasoning and analysis
  • Validate the correctness, completeness, and integrity of the knowledge graph across releases
Desired Qualifications
  • Experience with graph databases or graph query languages
  • Experience with KG curation, link prediction, relationship extraction, or graph-based ML
  • Familiarity with multimodal data integration
  • Previous work on biological or chemical knowledge graphs
  • Experience with public consortia such as ENCODE, GTEx, TCGA, or ChEMBL
  • Prior experience in a techbio startup or scientific software environment

Mithrl provides data-processing services for bioinformatics using Generative AI to speed up the creation of analysis pipelines and the generation of comprehensive reports. Its platform enables clients to obtain ready-to-run pipelines and accompanying white papers or reports through a subscription or order-based model, with pricing tailored to each project. The product works by applying Generative AI and automated workflows to ingest client data, design and run analysis pipelines, and produce structured reports or white papers, all within a transparent, client-controlled environment. Mithrl differentiates itself by focusing on rapid, customized delivery for a niche bioinformatics market, offering tailored content and insights while prioritizing data transparency and avoiding third-party data sharing. The company's goal is to help clients achieve faster, reliable data analysis and documentation through accessible, contract-based services that fit their specific needs.

Company Size

11-50

Company Stage

Seed

Total Funding

$4M

Headquarters

San Francisco, California

Founded

2023

Simplify's Take

What believers are saying

  • Pharma adoption accelerating: biomarker discovery completed in 15 minutes versus traditional months-long workflows.
  • Quality control automation prevents expensive downstream errors, creating sticky, mission-critical workflows in drug discovery.
  • Vertical AI in life sciences attracting institutional capital; $4M seed round signals strong investor confidence in domain-specific solutions.

What critics are saying

  • Open-source competitors Galaxy and Nextflow offer free no-code NGS pipelines, capturing 70% of biotech workflows.
  • Illumina-owned Basepair dominates RNA-seq with integrated hardware bundles, undercutting standalone platforms by 40%.
  • OpenAI o1-pro enables custom RNA-seq pipelines via ChatGPT for $20/month, commoditizing Mithrl's core value proposition.

What makes Mithrl unique

  • Converts months of bioinformatics analysis into minutes with automatic data cleaning and literature integration.
  • Autonomous AI Co-Scientist agents collaborate with researchers, catching errors like mislabeled samples before costly mistakes.
  • No-code platform eliminates need for specialized bioinformatics engineers, democratizing NGS data analysis for pharma teams.

Benefits

Health Insurance

Dental Insurance

Vision Insurance

401(k) Retirement Plan

Remote Work Options

Hybrid Work Options

Flexible Work Hours

Paid Vacation

Paid Holidays

Wellness Program

Mental Health Support

Conference Attendance Budget

Professional Development Budget

Stock Options

Company Equity

Phone/Internet Stipend

Home Office Stipend

Growth & Insights and Company News

Headcount

6 month growth

-13%

1 year growth

-20%

2 year growth

-5%
Mithrl
Mar 28th, 2026
An Agentic AI Scientific Decision Engine for Omics: Bridging Core Lab and End User.

In service of service labs

Last month, the Mithrl team headed to Pittsburgh for ABRF 2026, the annual meeting of the Association of Biomolecular Resource Facilities. ABRF is where the scientists who run core research facilities come together: the genomics labs, proteomics centers, and shared resource teams that power discovery across academia and biopharma. What makes it special is that it sits at the intersection of cutting-edge science and real-world execution. The people in the room aren't just talking about new technologies; they're the ones actually running the experiments.

Scientific poster

Mithrl Inc. was honored to be selected for a scientific poster, An Agentic AI Scientific Decision Engine for Omics: Bridging Core Lab and End User, authored and presented by Ada Shaw, PhD, Scientific Partnerships Lead at Mithrl.

Poster background: challenges

  • Data complexity ≠ clearer outcomes for investigators
  • Bioinformatics & biology expertise rarely coexist
  • AI outputs: hard to verify, reproduce, or operationalize

Solution: Mithrl Scientific Decision Engine

Raw Reads to Defensible Decisions; Transparent · Accurate · Reproducible

Mithrl is a Scientific Decision Engine that compresses the discovery loop for pharma R&D. Mithrl may also be applied to service lab operations, to deliver data to clients in a ready-for-insights mode, with no coding or configuration required. With Mithrl's enterprise-ready agentic AI systems, bench scientists and computational biologists can accelerate analyses and contextualize results with rigorous biological insights. Mithrl enables researchers to discover new biomarkers, identify new therapeutic targets, and understand mechanisms of action, moving from raw data to reproducible insights and defensible IP opportunities.

Conversations and community continue

Download the scientific poster from the recent ABRF conference in Pittsburgh, PA. Connect with Mithrl Inc. to discuss the Mithrl Scientific Decision Engine and how your discovery or services lab could gain a competitive edge in speed and in the depth of actionable insights it delivers.

FinSMEs
Nov 15th, 2024
Mithrl Inc. raises $4M Seed Funding

Mithrl, a San Francisco-based AI platform provider for scientific research, raised $4M in Seed funding led by Bonfire Ventures. The funds will be used to expand their go-to-market team and enhance their platform, which aids pharmaceutical and biotech firms in accelerating drug discovery. The platform allows rapid RNA sequencing data analysis without coding, enabling scientists to validate findings.