Full-Time

Senior Software Engineer

Experiment Insights, Weights & Biases

Weights & Biases

201-500 employees

MLOps platform for experiments, datasets, models

Compensation Overview

$139k - $204k/yr

+ Discretionary Bonus + Equity Awards

San Francisco, CA, USA + 4 more

More locations: Livingston, NJ, USA | New York, NY, USA | Bellevue, WA, USA | Sunnyvale, CA, USA

Hybrid

Hybrid role; remote work may be considered for candidates located more than 30 miles from an office.

Category
Software Engineering
Required Skills
Python
JavaScript
React.js
Java
GraphQL
TypeScript
Go
C/C++
Requirements
  • 3–5+ years of experience building user-facing software, with a strong focus on frontend development
  • Comfortable working across the stack, including backend services, APIs, and data models when product needs require it
  • Experience working in ambiguous problem spaces where requirements, schemas, and abstractions evolve over time
  • Strong product instincts and ability to reason from user goals to technical design decisions
  • Familiarity with modern web technologies such as React, TypeScript, and GraphQL
  • Solid understanding of programming fundamentals and experience with at least one backend or systems language (e.g. Go, Python, Java, C++, JavaScript)
  • Bias toward action, iteration, and learning through building, with the judgment to avoid over-engineering too early
Responsibilities
  • Rapidly prototype, iterate, and ship new analysis workflows for Experiment Insights, balancing experimentation with long-term product quality
  • Design frontend experiences in parallel with backend schemas and APIs, evolving data models as user needs become clearer
  • Work across the stack to implement end-to-end features, including adding or modifying logging formats, APIs, and data access patterns
  • Make pragmatic decisions about how user-logged data should be stored, queried, and rendered to support new analytical questions
  • Collaborate closely with product, design, and customer-facing teams to validate ideas early and incorporate real user feedback
  • Improve reliability, performance, and scalability when working with large, complex, or high-volume datasets
  • Contribute clean, testable, and reusable code; participate in code reviews and help establish patterns that reduce one-off solutions
Desired Qualifications
  • Experience evolving or versioning data schemas in production systems
  • Experience designing APIs to support exploratory or analytical workflows
  • Familiarity with large-scale data systems, time-series data, media, or domain-specific file formats
  • Exposure to ML, robotics, simulation, or scientific tooling
  • You enjoy working in ambiguous problem spaces, quickly iterating on ideas, and refining both user experiences and underlying data models as you learn what users actually need
  • You’re comfortable thinking end-to-end about products — from how data is logged and structured, to the APIs that expose it, to the frontend experiences that help users make sense of it

Weights & Biases provides a developer-first MLOps platform that helps ML teams manage the full machine learning lifecycle. Its core tools include Experiments for tracking and comparing runs, Sweeps for automated hyperparameter optimization, and Artifacts for versioning datasets and models to ensure reproducibility. The platform integrates with TensorFlow, PyTorch, and Keras, enabling researchers to work within familiar frameworks and collaborate across teams. Its goal is to help practitioners reproduce results, optimize models efficiently, and manage production ML at scale, from development to deployment.
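The three core tools named above fit together in a few lines of Python. Below is a minimal sketch of that workflow, assuming a configured `wandb` login; the project name, metric names, and file path are illustrative placeholders, not values from this posting. The Sweeps config helper is dependency-free so it can be inspected on its own.

```python
# Minimal sketch of the W&B workflow: track a run with Experiments, define a
# Sweeps configuration for hyperparameter search, and version a model file as
# an Artifact. Project names and file paths are illustrative placeholders.

def sweep_config(lr_min: float, lr_max: float) -> dict:
    """Build a Sweeps config for Bayesian search over the learning rate."""
    return {
        "method": "bayes",
        "metric": {"name": "val_loss", "goal": "minimize"},
        "parameters": {"learning_rate": {"min": lr_min, "max": lr_max}},
    }

def train_and_track() -> None:
    import wandb  # imported lazily so the config helper above stays dependency-free

    run = wandb.init(project="demo-project", config={"learning_rate": 1e-3})
    for step in range(3):
        run.log({"val_loss": 1.0 / (step + 1)})   # Experiments: per-step metric tracking
    artifact = wandb.Artifact("model-weights", type="model")
    artifact.add_file("model.pt")                 # Artifacts: version the trained weights
    run.log_artifact(artifact)
    run.finish()
```

Logging metrics per step is what powers run comparison in the Experiments UI, and versioning the weights as an Artifact is what makes a result reproducible later.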

Company Size

201-500

Company Stage

Acquired

Total Funding

$2B

Headquarters

San Francisco, California

Founded

2017

Simplify's Take

What believers are saying

  • CoreWeave's $1.7B acquisition in May 2025 integrates compute with W&B's MLOps platform.
  • LG CNS MOU on March 19, 2025, expands agentic AI in South Korea enterprise market.
  • 700,000 practitioners and customers like OpenAI, NVIDIA drive network effects and adoption.

What critics are saying

  • CoreWeave GPU shortages block W&B Inference scaling within 6-12 months.
  • Databricks MLflow and Hugging Face poach enterprises due to acquisition uncertainty in 12 months.
  • Post-IPO pricing hikes erode free-tier users to MLflow and Kubeflow in 12-24 months.

What makes Weights & Biases unique

  • W&B LEET provides open-source terminal UI for real-time ML training monitoring since November 2025.
  • W&B Inference serves LoRA fine-tuned models on CoreWeave GPUs via OpenAI-compatible API.
  • LLM Evaluation Jobs automate benchmark evaluations on model checkpoints with managed GPUs.

Benefits

Medical, dental, and vision insurance - 100% paid for by CoreWeave

Company-paid Life Insurance

Voluntary supplemental life insurance

Short and long-term disability insurance

Flexible Spending Account

Health Savings Account

Tuition Reimbursement

Ability to Participate in Employee Stock Purchase Program (ESPP)

Mental Wellness Benefits through Spring Health

Family-Forming support provided by Carrot

Paid Parental Leave

Flexible, full-service childcare support with Kinside

401(k) with a generous employer match

Flexible PTO

Catered lunch each day in our office and data center locations

A casual work environment

A work culture focused on innovative disruption

Hybrid Work Options

Remote Work Options

Growth & Insights and Company News

Headcount

6 month growth

-2%

1 year growth

-2%

2 year growth

-3%
Weights & Biases
Nov 1st, 2025
Product newsletter: Updates and new features for November 2025

From LLM Evaluation Jobs in W&B Models to a new terminal UI, here's what Weights & Biases shipped in November.

Welcome to the November 2025 edition of the Weights & Biases newsletter. Last month, Weights & Biases launched a brand new terminal UI, brought support for LoRA fine-tuned models to W&B Inference, announced LLM Evaluation Jobs for W&B Models, and more.

Introducing W&B LEET: A new terminal UI for Weights & Biases. The W&B Lightweight Experiment Exploration Tool (LEET) is a fast terminal interface for watching your ML training runs, including stats, metrics, and system health. It reads and visualizes live Weights & Biases log files to provide a fast, customizable, browser-free experience in real time. LEET presents an interactive, three-pane dashboard that updates live as your run progresses:

  • Run overview (left): configuration, environment, and summary metadata
  • Metrics grid (center): live charts for tracked metrics
  • System metrics (right): CPU, GPU, and memory consumption

W&B LEET is open source and available to users on W&B SDK version 0.23.0 or later, and you can launch it directly from the terminal where your W&B run is active. For a full list of LEET shortcuts and hotkeys, hit h or ? to toggle the help screen.

Bring your LoRA to serve fine-tuned models on W&B Inference. W&B Inference now lets you serve custom fine-tuned models on fully managed CoreWeave GPU clusters without managing infrastructure. Use its OpenAI-compatible Chat Completions API and reference your LoRA artifact in W&B Models along with the base model name; the service dynamically loads your adapter onto a preloaded base model at request time and returns the response. It takes just a few lines of code on your local machine to deploy your LoRA adapter on W&B Inference. Check out the sample notebook to get started or the launch blog post for more information.
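Because the endpoint is OpenAI-compatible, a request against a served LoRA adapter can look like a standard Chat Completions call. The sketch below uses only the standard library; the endpoint URL and the model-string format combining base model and artifact URI are illustrative assumptions, not documented values, so consult the W&B Inference docs for the real scheme.

```python
# Hypothetical sketch of calling a LoRA adapter through W&B Inference's
# OpenAI-compatible Chat Completions endpoint, standard library only.
# The URL and the model-string format below are illustrative assumptions.
import json
import urllib.request

INFERENCE_URL = "https://api.inference.wandb.ai/v1/chat/completions"  # assumed endpoint

def chat_request(model_ref: str, prompt: str) -> dict:
    """Build a Chat Completions request body referencing the LoRA artifact."""
    return {
        # model_ref is assumed to combine the base model name with the W&B
        # artifact URI for the LoRA weights, e.g. "<base-model>:<artifact-uri>".
        "model": model_ref,
        "messages": [{"role": "user", "content": prompt}],
    }

def send(body: dict, api_key: str) -> dict:
    """POST the request with a bearer token and return the parsed JSON response."""
    req = urllib.request.Request(
        INFERENCE_URL,
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Any OpenAI-compatible client SDK could replace the raw HTTP call; the point is that serving a fine-tuned adapter requires no request shape beyond the standard Chat Completions payload.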
  • Version-controlled serving: the artifact URI explicitly includes the project, run, and version, providing traceability back to the training run and hyperparameters used to create the LoRA weights.
  • Zero infra to manage: you avoid the complexity of setting up and scaling serving infrastructure for every LoRA iteration. The system handles dynamic loading and hot-swapping of your weights in the background.
  • Faster iteration: because only the small LoRA weights are updated and managed via artifacts, you can cycle from training to production validation instantly.

Saved prompts in W&B Weave Playground. You can now edit, save, and version prompts directly within the W&B Weave Playground. The Playground is great for interactively comparing production traces across different LLMs, and now you can track changes as you refine prompts and pull them into your code. To try it out, head to https://wandb.ai/inference, pick a model, and click "Try in Playground." This feature is available in SaaS cloud deployments and will roll out to Dedicated shortly.

Introducing LLM Evaluation Jobs in W&B Models. With LLM Evaluation Jobs, you can evaluate model checkpoints during training on popular public benchmarks without building evaluation harnesses or managing infrastructure. This gives you early reads on your model's downstream performance so you can course-correct or end unpromising runs, saving GPU hours and wall-clock time.

When training or fine-tuning models, you often want an early read on benchmark performance while still in the loop. If results trend poorly, you can adjust hyperparameters or stop the run to avoid waste. Doing this on your own means finding compute, building the evaluation harness, and wiring it into your training pipeline; that work adds little differentiation and slows you down. With LLM Evaluation Jobs, you point to your model checkpoint artifacts or hosted API endpoints, then pick a benchmark. Weights & Biases provides prebuilt evaluation harnesses for popular public benchmarks using Inspect Evals, provisions and manages the GPU infrastructure so there is nothing for you to set up or maintain, runs the evaluation, stores results in your Weights & Biases project, and generates a leaderboard so you can easily compare models in your W&B Models workspace. LLM Evaluation Jobs is currently available in public preview for multi-tenant cloud customers; see the docs to learn more.

Updated run menu shortcuts in W&B Models. Weights & Biases has added new options to the run menu to make your life a little easier:

  • Actions to copy the run name or the run path to your clipboard
  • A link straight to the Logs tab for a run
  • Clarification on updating the display name for a run in this workspace vs. the name of the run for the whole project

These also appear in the menu in the single-run views and are available on both Dedicated and SaaS deployments, and should shave a few seconds off these common actions.

That's it for November. If you missed any recent product newsletters, here is a quick recap:

  • October: release of Serverless RL; access and usability improvements to W&B Registry; W&B Weave added the ability to generate images in Playground and new filter evaluations; W&B Models added numerous quality-of-life improvements to the UI and common workstreams.
  • September: W&B Inference support for Z.AI's GLM 4.5; W&B Weave introduced no-code evaluations; W&B Models enhanced support for post-training AI agents with reinforcement learning, featuring a traces panel right in the workspace.
  • August: W&B Inference support for DeepSeek V3.1; W&B Weave introduced a new Content API for logging any media type, along with UI-based prompt management, a trace graph view, better markdown rendering, and new latency metrics; W&B Models added practical improvements such as pinned workspace columns and a full-screen media viewer for easier comparison.
  • July: tool-calling support and new models in W&B Inference, including GPT OSS, Qwen 3, and Kimi K2; for W&B Weave, advanced filtering for traces, live dashboards for monitoring trace plots, and the ability to group threads into traces; for W&B Models, usability improvements for media and updates to full-fidelity plots.
  • June: launch of a new product, W&B Inference, plus updates to existing products; for Weave, online evaluations, pivot tables, and new data types; for Models, CoreWeave Observability and Object Storage integration, and integrated logging with W&B Tables.
  • May: for Weave, features to trace streaming responses, video support, saved views, and filtering by status; for Models, workspace templates for line-plot settings, enhanced color controls, and bulk control of media panels.
  • April: for Weave, new trace views, the EvaluationLogger API, OpenTelemetry support, custom model support in Weave Playground, and more; for Models, run metrics notifications, protected aliases in W&B Registry, custom display names, complete console logs for distributed runs, and easier ways to view media panels.
  • March: for Weave, the MCP server and integrations with the OpenAI Agents SDK and CrewAI; for Models, interactive sliders, segment masks, and options to format custom metrics.
  • February: for Weave, Guardrails reached general availability, along with improved data privacy and new ways to work with datasets; for Models, workspace usability updates, faster and easier sharing of AI artifacts through Registry, line-plot updates, and new Reports customization options.
  • January: for Weave, a state-of-the-art agent release, run trials in the Weave Playground, human annotation scorers, and datasets made easy; for Models, new panel management and an improved workspace experience with unified settings, live preview, and a new UX.

VentureBeat
Jun 16th, 2025
Minimax-M1 Is A New Open Source Model With 1 Million Token Context And New, Hyper Efficient Reinforcement Learning

Chinese AI startup MiniMax, perhaps best known in the West for its hit realistic AI video model Hailuo, has released its latest large language model, MiniMax-M1 — and in great news for enterprises and developers, it's completely open source under an Apache 2.0 license, meaning businesses can take it, use it for commercial applications, and modify it to their liking without restriction or payment. M1 is an open-weight offering that sets new standards in long-context reasoning, agentic tool use, and efficient compute performance. It's available today on the AI code-sharing community Hugging Face and Microsoft's rival code-sharing community GitHub, as the first release of what the company dubbed "MiniMaxWeek" on its X account, with further product announcements expected. MiniMax-M1 distinguishes itself with a context window of 1 million input tokens and up to 80,000 tokens of output, positioning it as one of the most expansive models available for long-context reasoning tasks. The "context window" in large language models (LLMs) refers to the maximum number of tokens the model can process at one time, including both input and output.

VentureBeat
Jun 12th, 2025
Cloud Collapse: Replit And Llamaindex Knocked Offline By Google Cloud Identity Outage

Days after OpenAI and Google Cloud announced a partnership to support the growing use of generative AI platforms, much of the AI-powered web and its tooling went down due to an outage at leading cloud providers. Google Cloud Platform (GCP) and some Cloudflare services began experiencing issues around 10:00 a.m. PT today, affecting several AI development tools and data storage services, including ChatGPT and Claude, as well as a variety of other AI platforms.

"We are aware of a service disruption to some Google Cloud services and we are working hard to get you back up and running ASAP. Please view our status dashboard for the latest updates: https://t.co/sT6UxoRK4R" — Google Cloud (@googlecloud) June 12, 2025

A GCP spokesperson confirmed the outage to VentureBeat, urging users to check its public status dashboard. GCP said affected services include API Gateway, Agent Assist, Cloud Data Fusion, Contact Center AI Platform, Google App Engine, Google BigQuery, Google Cloud Storage, Identity Platform, Speech-to-Text, Text-to-Speech, and Vertex AI Search, among other tools. Google's mobile development platform, Firebase, also went down. VentureBeat staffers had trouble accessing Google Meet, but other Google services on Workspace remained online. A Cloudflare spokesperson told VentureBeat only "a limited number of services at Cloudflare use Google Cloud and were impacted. We expect them to come back shortly."

AiThority
May 5th, 2025
CoreWeave Acquires Weights & Biases

CoreWeave has completed its acquisition of Weights & Biases, enhancing its AI Cloud Platform capabilities. This strategic move aims to accelerate AI innovation and expand growth opportunities following CoreWeave's recent IPO. CEO Michael Intrator praised Weights & Biases for their innovation and engineering excellence, which align with CoreWeave's priorities. Together, they plan to deliver a leading AI Cloud Platform to develop, deploy, and iterate AI more efficiently.

MarketScreener
May 5th, 2025
CoreWeave Acquires Weights & Biases

CoreWeave, Inc. (Nasdaq: CRWV) has completed its acquisition of Weights & Biases, enhancing its AI Cloud Platform capabilities. This strategic move aims to accelerate AI innovation by combining CoreWeave's infrastructure with Weights & Biases' developer platform. The acquisition supports interoperability and aims to empower AI developers. Financial advisors included Evercore and Morgan Stanley for CoreWeave, and Qatalyst Partners for Weights & Biases.