Summer 2026

Bioinformatics Machine Learning Intern

RefinedScience

Data-driven precision medicine for trials

Compensation Overview

$34 - $38/hr

United States

Hybrid

Category
Biology & Biotech
Required Skills
LLM
Python
R
Machine Learning
Requirements
  • Current Ph.D. candidate in Bioinformatics, Computational Biology, Computer Science, Biostatistics, or a related quantitative field
  • Single-cell omics experience: Demonstrated ability to process, analyze, and interpret single-cell data (scRNA-seq, scATAC-seq, CITE-seq, or spatial transcriptomics) using frameworks such as Scanpy/scverse, Seurat, or Bioconductor
  • Machine learning expertise: Applied experience developing and evaluating ML/deep learning models on biological data, including neural network architectures (GNNs, transformers, autoencoders), model selection and benchmarking, and integration of ML approaches into analytical workflows
  • Programming proficiency: Python and/or R for data analysis, statistical modeling, and visualization
  • Statistical foundation: Understanding of statistical methods for biological data (hypothesis testing, differential expression, multiple testing correction, clustering)
  • Strong problem-solving skills and ability to communicate complex insights effectively
Responsibilities
  • Analyze single-cell and multiomics datasets to extract biological insights supporting precision medicine and drug development programs
  • Apply and evaluate machine learning and deep learning approaches to single-cell data for tasks such as cell type classification, biomarker discovery, and patient stratification
  • Explore and prototype generative AI and LLM-based approaches to accelerate biological data interpretation and scientific workflows
  • Collaborate with scientists, clinicians, and data scientists to design and execute data-driven research projects
  • Document and optimize computational workflows following reproducible research best practices
  • Present findings through technical reports, visualizations, and presentations to cross-functional teams
Desired Qualifications
  • Experience with deep learning frameworks (PyTorch, TensorFlow, JAX)
  • Familiarity with graph neural networks, attention mechanisms, or transformer architectures applied to biological data
  • Experience with ML experiment tracking and reproducibility (MLflow, Weights & Biases)
  • Exposure to representation learning, variational autoencoders, or contrastive learning methods
  • Familiarity with scikit-learn, XGBoost, or similar ML libraries
  • Interest in or experience with LLMs, RAG systems, or agentic AI tooling
  • Experience with multimodal single-cell integration (Seurat WNN, scvi-tools/MultiVI/totalVI, Muon)
  • Familiarity with spatial transcriptomics analysis (Squidpy, cell2location, nf-core/spatialvi)
  • Experience with cell-cell communication inference (CellChat, NicheNet, LIANA)
  • Knowledge of drug-gene interaction resources (CMap/LINCS, OpenTargets, ChEMBL)
  • Familiarity with Linux/Unix CLI and version control (Git/GitHub)
  • Experience with containerization (Docker, Singularity) and environment management (conda, venv)
  • Exposure to cloud computing platforms (GCP preferred)
  • Familiarity with workflow managers (Nextflow, Snakemake)
  • Adherence to best practices for conducting reproducible computational research

RefinedScience builds a data platform that combines clinical and biological data to support precision drug development and clinical trial optimization using AI and machine learning. It curates high-fidelity data and runs computational analyses to identify which patients are most likely to respond and how to design trials, delivering data-driven insights as a service to pharmaceutical companies and healthcare institutions. Unlike providers that only offer datasets, RefinedScience integrates rich data with advanced analytics to give actionable, patient-focused guidance. Its goal is to improve patient outcomes and shorten drug development timelines by increasing trial success and speeding the path from discovery to approval.

Company Size

N/A

Company Stage

N/A

Total Funding

N/A

Headquarters

Aurora, Colorado

Founded

2019

Simplify's Take

What believers are saying

  • Fitzsimons membership grants unique campus data access for drug development.
  • Hospital partnerships replicate datasets across hematology, oncology, immunology.
  • Precision analytics cut drug timelines from 10 years to 3-4 years.

What critics are saying

  • OncoVerity spinout diverts AML pharma partnerships and IP value now.
  • Tempus's $400M funding captures larger oncology trial contracts immediately.
  • NVIDIA's free BioNeMo models commoditize single-cell analytics this year.

What makes RefinedScience unique

  • RefinedScience integrates live-stream hospital feeds with single-cell AML omics data.
  • Led by Clay Smith, M.D., it delivers single-cell treatment response insights.
  • Spun from CU Anschutz, it expands datasets to solid tumors and autoimmunity.

Benefits

Health Insurance

Dental Insurance

Vision Insurance

Life Insurance

Disability Insurance

Health Savings Account/Flexible Spending Account

Paid Vacation

Paid Holidays

Paid Sick Leave

401(k) Retirement Plan

Company News

Business Wire
Nov 22nd, 2024
OncoVerity Secures Extended Series A to Advance Cusatuzumab in Newly Diagnosed AML

Alongside financing, first patients dosed in OV-AML-1231, a Phase 2 randomized controlled trial of cusatuzumab in newly diagnosed AML