Full-Time

Member of Technical Staff

Cloud Infrastructure

Fireworks AI

51-200 employees

AI inference platform for model deployment

Compensation Overview

$175k - $220k/yr

+ Equity

Company Does Not Provide H1B Sponsorship

San Mateo, CA, USA + 1 more

More locations: New York, NY, USA

In Person

Category
DevOps & Infrastructure
Required Skills
Kubernetes
Microsoft Azure
Python
TensorFlow
PyTorch
Computer Networking
MLflow
AWS
C/C++
Google Cloud Platform
Requirements
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience).
  • 5+ years of experience designing and building backend infrastructure in cloud environments (e.g., AWS, GCP, Azure).
  • Proven experience in ML infrastructure and tooling (e.g., PyTorch, TensorFlow, Vertex AI, SageMaker, Kubernetes).
  • Strong software development skills in languages such as Python or C++.
  • Deep understanding of distributed systems fundamentals: scheduling, orchestration, storage, networking, and compute optimization.
Responsibilities
  • Architect and build scalable, resilient, and high-performance backend infrastructure to support distributed training, inference, and data processing pipelines.
  • Lead technical design discussions, mentor other engineers, and establish best practices for building and operating large-scale ML infrastructure.
  • Design and implement core backend services (e.g., job schedulers, resource managers, autoscalers, model serving layers) with a focus on efficiency and low latency.
  • Drive infrastructure optimization initiatives, including compute cost reduction, storage lifecycle management, and network performance tuning.
  • Collaborate cross-functionally with ML, DevOps, and product teams to translate research and product needs into robust infrastructure solutions.
  • Continuously evaluate and integrate cloud-native and open-source technologies (e.g., Kubernetes, Kubeflow, MLflow) to enhance our platform’s capabilities and reliability.
  • Own end-to-end systems from design to deployment and observability, with a strong emphasis on reliability, fault tolerance, and operational excellence.
Desired Qualifications
  • Master’s or PhD in Computer Science or related field.
  • Experience leading infrastructure projects supporting large-scale ML/AI workloads or high-throughput systems.
  • Familiarity with infrastructure-as-code and CI/CD tooling (e.g., Terraform, ArgoCD, GitOps).
  • Track record of driving system performance, reliability, and cost-efficiency improvements.
  • Contributions to open-source cloud or ML infrastructure projects are a plus.

Fireworks AI provides an AI inference platform that helps organizations run, customize, and deploy machine learning models. It supports deployment, fine-tuning, and inference workflows, with access to open-source models and on-demand deployment options through a subscription model. The platform enables users to configure and optimize models for production environments, manage variants, and scale AI workloads while controlling costs. Unlike some competitors, Fireworks AI emphasizes end-to-end, production-grade inference and fine-tuning for a range of clients, from tech firms to research institutions and enterprises, with tools tailored for rapid deployment and cost efficiency. Its long-term goal is to expand the use of AI in production by building compound AI systems, growing the team, and delivering broader platform capabilities that accelerate AI adoption in real-world settings.

Company Size

51-200

Company Stage

Private

Total Funding

$327M

Headquarters

Redwood City, California

Founded

2022

Simplify Jobs

Simplify's Take

What believers are saying

  • $254M Series C at $4B valuation funds enterprise AI infrastructure expansion.
  • Processes 10 trillion tokens daily for 10,000 customers like Uber and Shopify.
  • $315M annualized revenue grows 416% year-over-year with 50% gross margins.

What critics are saying

  • Groq LPUs deliver 10x faster speeds, eroding Fireworks' latency edge in 6-12 months.
  • Together AI undercuts costs with better GPU utilization, capturing customers in 12-18 months.
  • AWS Inferentia2 offers 40% lower inference costs, migrating Shopify clients in 12-24 months.

What makes Fireworks AI unique

  • Fireworks AI delivers 12x faster inference than vLLM via custom CUDA kernels.
  • Hathora acquisition enables global low-latency orchestration across 14 regions.
  • Compound AI systems automate customization for millions of specialized models.

Benefits

Professional Development Budget

Growth & Insights and Company News

Headcount

6 month growth

1%

1 year growth

4%

2 year growth

0%
Fireworks AI
Mar 10th, 2026
Fireworks Acquires Hathora to Accelerate Global Computer Orchestration

SiliconANGLE Media
Mar 9th, 2026
Fireworks AI acquires Hathora to build global real-time compute infrastructure for agentic AI

Fireworks AI has acquired Hathora, a real-time compute and server orchestration platform, to strengthen its global compute infrastructure for AI inference and training. Chief executive Lin Qiao described the deal as a talent-and-infrastructure acquisition rather than a customer acquisition. Hathora, launched in 2023, built a container orchestration platform across 14 regions serving multiplayer games and real-time AI workloads. Qiao drew parallels between gaming infrastructure's latency demands and AI inference requirements, noting gamers tolerate reduced graphics but not lag. The acquisition supports Fireworks' vision of "millions of models" continuously customised for specific use cases, rather than relying on single generalised models. Qiao emphasised that Fireworks focuses on automated customisation beyond just inference, positioning the company to handle the increasing velocity of agentic AI interactions.

Hathora
Mar 4th, 2026
Hathora Is Joining Fireworks AI

Hathora Inc. has announced that it has been acquired by Fireworks AI. Its team will be joining Fireworks to work on compute orchestration for AI inference at scale. Over the past four years, Hathora built a global container orchestration platform spanning 14 regions, two bare metal providers and four clouds. It powered server infrastructure for live titles like Splitgate 2, Stormgate, and Predecessor, and more recently expanded into real-time AI workloads with its voice model marketplace. The throughline was always the same: low-latency compute orchestration across heterogeneous infrastructure, without compromising on performance. Fireworks AI is where that work can have the most impact. The Hathora team is obsessed with infrastructure, and at Fireworks, they can continue to do what they do best. Founded by the team behind PyTorch at Meta, Fireworks processes more than 10 trillion tokens a day for over 10,000 customers and has built one of the fastest-growing AI inference platforms in the world. The challenge of orchestrating GPU compute across providers at the latency, reliability, and performance its customers demand is exactly the problem Hathora has spent four years solving. For its gaming customers, support will continue through May 5, 2026, and Hathora has partnered with Nitrado's GameFabric to provide a clear migration path and hands-on support through the transition. Hathora has already been in direct contact with its active customers, and details on timing and migration support have been shared with them directly. Thank you to the team, who took a bet on two first-time founders; to Upfront Ventures, Founders Fund and Lunar Ventures for backing Hathora early; and to the customers, from the game studios who shaped the platform to the AI teams who pushed Hathora forward. Onwards, Harsh & Sid

The Wall Street Journal
Oct 28th, 2025
Fireworks AI Raises $254M, Valued $4B

Fireworks AI, a startup focused on providing developers with access to advanced AI chips and models, announced it has raised $254 million in a recent funding round. This investment values the company at $4 billion, according to the Wall Street Journal.

Tech Funding News
Jul 29th, 2025
Fireworks AI nears $4B valuation.

Fireworks AI, a California-based startup, is nearing a $4 billion valuation in talks with Lightspeed and Index Ventures, up from $552 million after its Series B funding led by Sequoia, NVIDIA, AMD, and MongoDB. This rapid growth highlights the demand for scalable AI infrastructure. Co-founded by Lin Qiao, a former Meta engineering leader, Fireworks AI aims to democratize AI by simplifying the deployment of advanced models, addressing enterprise needs for computing resources and expertise.