Full-Time

Member of Technical Staff: ML Infrastructure

Platform Engineer

Posted on 8/4/2025

essential AI

11-50 employees

Offers practical AI models and services

No salary listed

San Francisco, CA, USA

In Person

Category
Software Engineering
Requirements
  • Strong understanding of the architectures of AI accelerators such as GPUs, TPUs, IPUs, and HPUs, and of their tradeoffs. Knowledge of parallel computing concepts and distributed systems.
  • Experience with kernels, low-precision training, and mixture-of-experts (MoE) models.
  • Prior experience tuning the performance of large language model training and/or inference workloads. Experience with MLPerf or internal production workloads is valued.
  • 6+ years of relevant industry experience leading the design of large-scale, production ML infrastructure systems. Experience with communication libraries.
  • Experience training and building large language models with frameworks such as Megatron and DeepSpeed, and with deployment frameworks such as vLLM, TGI, and TensorRT-LLM.
  • Comfortable working under the hood with kernel languages such as OpenAI Triton and Pallas, and with compilers such as XLA.
  • Experience with INT8/FP8 training and inference, quantization, and/or distillation.
  • Knowledge of container technologies like Docker and Kubernetes and cloud platforms like AWS, GCP, etc.
  • Intermediate fluency with networking fundamentals such as VPCs, subnets, routing tables, and firewalls.
Responsibilities
  • Oversee and drive the vision for how models are built, tested, and deployed, taking ownership of the research development experience and transforming it into a state-of-the-art one
  • Design, build, and maintain scalable machine learning infrastructure to support our model training, inference and applications
  • Design and implement scalable machine learning and distributed systems that enable training and scaling of large language models. Work on parallelism methods that make training faster and more reliable
  • Work on lower levels of the stack to build high-performing and optimal training and serving infrastructure, including researching new techniques and writing custom kernels as needed to achieve improvements
  • Develop tools and frameworks to automate and streamline ML experimentation and management
  • Collaborate with researchers and product engineers to deliver magical product experiences built on large language models
  • Be willing to optimize performance and efficiency across different accelerators

EssentialAI builds simple, practical machine learning tools for everyday tasks, including a Dog Breed Predictor that identifies dog breeds from photos. The predictor analyzes an image with ML models and returns the breed result through an online service, while other tools handle calculations and image processing on the same platform. It differentiates itself by targeting a broad audience with easy-to-use interfaces, multilingual support, and options for custom AI solutions in addition to ready-made models. Its goal is to make AI useful in daily life and expand access to customized AI capabilities for individuals, tech enthusiasts, and businesses.

Company Size

11-50

Company Stage

Series A

Total Funding

$64.8M

Headquarters

San Francisco, California

Founded

2023

Simplify Jobs

Simplify's Take

What believers are saying

  • Raised $56.5M Series A from March Capital, NVIDIA, Google, AMD in December 2023.
  • An $8.3M seed round led by Thrive Capital brings total funding to nearly $65M since the company's founding in 2023.
  • San Francisco team of 80+ expands globally with elite Google veterans driving innovation.

What critics are saying

  • OpenAI agents capture Essential AI's enterprise workflow market within 12-24 months.
  • Anthropic Claude displaces Enterprise Brain users in 6-12 months with superior reasoning.
  • NVIDIA, AMD prioritize own platforms, deny Essential AI GPU access in 12-18 months.

What makes essential AI unique

  • Essential AI builds Enterprise Brain, which automates monotonous workflows 10x faster for data analysts.
  • Co-founders Ashish Vaswani and Niki Parmar co-invented the Transformer architecture that powers modern LLMs.
  • Open platform accelerates deep learning with Rnj-1 models focused on STEM and code.

Benefits

Relocation Assistance

401(k) Company Match

Company Equity

Health Insurance

Growth & Insights and Company News

Headcount

6 month growth

2%

1 year growth

5%

2 year growth

12%
Silicon Valley Daily
Dec 29th, 2023
Essential AI Lands $56.5 Million Series A

SAN FRANCISCO - Essential AI has raised a $56.5 million Series A funding round led by March Capital, with participation from AMD, Franklin Venture Partners, Google, KB Investment, NVIDIA, and Thrive Capital.

Crunchbase
Dec 14th, 2023
Eye On AI: Big Tech Continues To Invest In AI Startups, But For How Long?

Essential AI announced this week it had raised $56.5 million in new funding led by March Capital.

VentureBeat
Dec 12th, 2023
Essential AI emerges from stealth with backing from Google, Nvidia and AMD

Essential AI says the products would make data analysts 10x faster and give business users the ability to become data-driven decision-makers.

Bloomberg L.P.
Dec 12th, 2023
Essential AI Comes Out of Stealth With $57 Million in Funding

The startup was founded by two co-authors of a famous Google AI paper

PYMNTS
Sep 13th, 2023
Essential AI Raising $40 Million to Build LLM Software

Artificial intelligence (AI) company Essential AI has reportedly raised $40 million in new funding. The funding round for the startup, described in a Bloomberg News report Wednesday (Sept. 13) as a “secretive” company, comes amid a wave of financing for the AI sector. The Bloomberg report, citing a source familiar with the deal, noted that Essential had raised $8 million a few months ago in a round led by Thrive Capital, which also invested in OpenAI.

INACTIVE