Full-Time

Applied Machine Learning Engineer

Fireworks AI

51-200 employees

AI inference platform for model deployment

Compensation Overview

$170k - $240k/yr

+ Equity

Company Does Not Provide H1B Sponsorship

San Mateo, CA, USA; New York, NY, USA

In Person

Category
AI & Machine Learning
Required Skills
Python
Machine Learning
Requirements
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field
  • 5+ years of experience in a software engineering role, with a strong preference for customer-facing roles
  • Robust coding skills, preferably in Python
  • Demonstrated ability to lead and execute complex technical projects with a focus on customer success
  • Strong interpersonal and communication skills; ability to thrive in dynamic, cross-functional teams
Responsibilities
  • Collaborate directly with the GTM team (Account Executives and Solutions Architects) to ensure smooth integration and successful deployment of ML solutions
  • Build and present compelling PoCs that demonstrate the capabilities of our AI technology
  • Design, develop, and deploy end-to-end AI-powered applications tailored to customer needs
  • Contribute to the internal ML platform, including adding features and resolving issues
  • Integrate new machine learning models into the existing platform or client environments and enable their use there
  • Improve system performance, efficiency, and scalability of deployed models and applications
  • Work closely with partners to enable joint AI solutions and ensure seamless collaboration
Desired Qualifications
  • Master’s degree in Computer Science, Engineering, or a related technical field
  • Experience working in a startup or fast-paced environment
  • Hands-on experience fine-tuning machine learning models, including supervised fine-tuning (SFT) and reinforcement-learning-based methods such as reinforcement learning from human feedback (RLHF) or reinforcement fine-tuning (RFT)
  • Solid understanding of generative AI, machine learning principles, and enterprise infrastructure

Fireworks AI provides an AI inference platform that helps organizations run, customize, and deploy machine learning models. It supports deployment, fine-tuning, and inference workflows, with access to open-source models and on-demand deployment options through a subscription model. The platform lets users configure and optimize models for production environments, manage variants, and scale AI workloads while controlling costs. Unlike some competitors, Fireworks AI emphasizes end-to-end, production-grade inference and fine-tuning for a range of clients, from tech firms to research institutions and enterprises, with tools tailored for rapid deployment and cost efficiency. Its long-term goal is to expand the use of AI in production by building compound AI systems, growing the team, and delivering broader platform capabilities that accelerate AI adoption in real-world settings.

Company Size

51-200

Company Stage

Private

Total Funding

$327M

Headquarters

Redwood City, California

Founded

2022

Simplify Jobs

Simplify's Take

What believers are saying

  • $254M Series C at $4B valuation funds enterprise AI infrastructure expansion.
  • Processes 10 trillion tokens daily for 10,000 customers like Uber and Shopify.
  • $315M annualized revenue grows 416% year-over-year with 50% gross margins.

What critics are saying

  • Groq LPUs deliver 10x faster speeds, eroding Fireworks' latency edge in 6-12 months.
  • Together AI undercuts costs with better GPU utilization, capturing customers in 12-18 months.
  • AWS Inferentia2 offers 40% lower inference costs, migrating Shopify clients in 12-24 months.

What makes Fireworks AI unique

  • Fireworks AI delivers 12x faster inference than vLLM via custom CUDA kernels.
  • Hathora acquisition enables global low-latency orchestration across 14 regions.
  • Compound AI systems automate customization for millions of specialized models.

Benefits

Professional Development Budget

Growth & Insights and Company News

Headcount

6 month growth

1%

1 year growth

4%

2 year growth

0%
Fireworks AI
Mar 10th, 2026
Fireworks Acquires Hathora to Accelerate Global Compute Orchestration

SiliconANGLE Media
Mar 9th, 2026
Fireworks AI acquires Hathora to build global real-time compute infrastructure for agentic AI

Fireworks AI has acquired Hathora, a real-time compute and server orchestration platform, to strengthen its global compute infrastructure for AI inference and training. Chief executive Lin Qiao described the deal as a talent-and-infrastructure acquisition rather than a customer acquisition. Hathora, launched in 2023, built a container orchestration platform across 14 regions serving multiplayer games and real-time AI workloads. Qiao drew parallels between gaming infrastructure's latency demands and AI inference requirements, noting gamers tolerate reduced graphics but not lag. The acquisition supports Fireworks' vision of "millions of models" continuously customised for specific use cases, rather than relying on single generalised models. Qiao emphasised that Fireworks focuses on automated customisation beyond just inference, positioning the company to handle the increasing velocity of agentic AI interactions.

Hathora
Mar 4th, 2026
Hathora Is Joining Fireworks AI

Today, Hathora Inc. would like to share that it has been acquired by Fireworks AI. Its team will be joining Fireworks to work on compute orchestration for AI inference at scale. Over the past four years, Hathora Inc. built a global container orchestration platform spanning 14 regions, two bare metal providers, and four clouds. Hathora Inc. powered server infrastructure for live titles like Splitgate 2, Stormgate, and Predecessor, and more recently expanded into real-time AI workloads with its voice model marketplace. The throughline was always the same: low-latency compute orchestration across heterogeneous infrastructure, without compromising on performance. Fireworks AI is where that work can have the most impact. The team built at Hathora is obsessed with infrastructure, and at Fireworks, they can continue to do what they do best. Founded by the team behind PyTorch at Meta, Fireworks processes more than 10 trillion tokens a day for over 10,000 customers and has built one of the fastest-growing AI inference platforms in the world. The challenge of orchestrating GPU compute across providers at the latency, reliability, and performance their customers demand is exactly the problem Hathora Inc. has spent four years solving. For its gaming customers, support will continue through May 5, 2026, and Hathora Inc. has partnered with Nitrado's GameFabric to provide a clear migration path and hands-on support through the transition. Hathora Inc. has already been in direct contact with its active customers, and details on timing and migration support have been shared with them directly. Thank you to its team, who took a bet on two first-time founders. To Upfront Ventures, Founders Fund, and Lunar Ventures for backing Hathora Inc. early. And to its customers, from the game studios who shaped its platform to the AI teams who pushed Hathora Inc. forward. Onwards, Harsh & Sid

The Wall Street Journal
Oct 28th, 2025
Fireworks AI Raises $254M, Valued $4B

Fireworks AI, a startup focused on providing developers with access to advanced AI chips and models, announced it has raised $254 million in a recent funding round. This investment values the company at $4 billion, according to the Wall Street Journal.

Tech Funding News
Jul 29th, 2025
Fireworks AI nears $4B valuation.

Fireworks AI, a California-based startup, is nearing a $4 billion valuation in talks with Lightspeed and Index Ventures, up from $552 million after its Series B funding led by Sequoia, NVIDIA, AMD, and MongoDB. This rapid growth highlights the demand for scalable AI infrastructure. Co-founded by Lin Qiao, a former Meta engineering leader, Fireworks AI aims to democratize AI by simplifying the deployment of advanced models, addressing enterprise needs for computing resources and expertise.