Full-Time

Senior Staff AI Platform Engineer

Posted on 6/6/2026

NVIDIA

10,001+ employees

Designs GPUs and AI HPC platforms

Compensation Overview

$168k - $322k/yr

+ Equity

Company Historically Provides H1B Sponsorship

Santa Clara, CA, USA

In Person

Category
DevOps & Infrastructure
Required Skills
Kubernetes
MLOps
Rust
Python
Data Structures & Algorithms
Vulnerability Analysis
Go
Observability
C/C++
Requirements
  • 10+ years of experience in cloud, platform, or SRE roles.
  • Bachelor's degree or equivalent experience.
  • Strong Python and at least one systems language (C++, Go, or Rust), with proven distributed systems debugging expertise.
  • Deep experience building and scaling distributed systems, including Kubernetes and bare-metal infrastructure.
  • Strong observability design across infrastructure and AI workloads (metrics, logging, tracing, AI quality signals).
  • Hands-on experience operating AI/ML platforms, including MLOps, model serving, and GPU-accelerated environments.
  • Experience with infrastructure and application security practices, such as identity/auth, network segmentation, supply chain security, and vulnerability management in cloud-native environments.
  • Practical use of AI-assisted development tools and coding agents in daily workflows.
  • Solid foundation in data structures, algorithms, and complexity analysis.
  • Excellent problem-solving, communication, and collaboration across multiple functions.
Responsibilities
  • Define and lead AI-native infrastructure roadmaps and cross-organizational initiatives.
  • Architect and scale LLM/ML infrastructure across cloud-native clusters and on-premises hardware.
  • Design and implement observability for infrastructure health and AI model performance.
  • Build LLM-aware monitoring and leverage AI to improve incident response and reduce toil.
  • Develop automation and tooling to ensure reliability, scalability, and developer self-service.
  • Troubleshoot complex distributed systems, including deep Kubernetes and AI/ML scaling challenges.
  • Drive AI-assisted engineering practices and mentor engineers to foster an AI-first culture.
  • Partner with product engineering and internal business units to translate AI platform capabilities into reliable, scalable solutions that accelerate product development.
Desired Qualifications
  • Deep experience with AI/ML platforms (e.g., Hugging Face, Weights & Biases, NVIDIA NIM).
  • Proven use of AI agents and LLM tooling to enhance observability, incident response, or developer productivity.
  • Experience with artifact management, AI supply chain security, or trusted model distribution systems.
  • Experience with AI-specific threat models (OWASP Top 10 for LLMs, model poisoning, adversarial inputs); with FedRAMP, SOC 2, or other compliance frameworks relevant to your environment; and with red-teaming or security evaluation of LLM systems.
  • Strong sense of ownership with a structured, automation-first approach.
  • Demonstrated impact driving AI-first engineering practices across teams.

NVIDIA designs and manufactures graphics processing units (GPUs) and computing platforms used for gaming, data centers, and artificial intelligence. These products work by using parallel processing to handle complex mathematical calculations much faster than standard computer processors, supported by a software ecosystem that allows developers to build and run AI models. Unlike competitors that may focus solely on hardware, NVIDIA integrates its chips with specialized software and cloud services to create a complete environment for high-performance tasks. The company’s goal is to provide the underlying technology necessary to power advanced computing, from realistic video game graphics to autonomous vehicles and large-scale data analysis.

Company Size

10,001+

Company Stage

IPO

Headquarters

Santa Clara, California

Founded

1993

Simplify's Take

What believers are saying

  • Over 80% AI-training GPU share supports pricing power and scale.
  • Microsoft partnership deepens Windows and Azure distribution for agentic AI.
  • Helix and Abridge expand NVIDIA into infrastructure and regulated healthcare.

What critics are saying

  • U.S. export controls restrict H200 sales to China case-by-case.
  • Blackwell shortages leave demand unfilled and invite accelerator competitors.
  • Helix exposes NVIDIA to capital-intensive projects and partner execution risk.

What makes NVIDIA unique

  • CUDA and full-stack software lock developers into NVIDIA's ecosystem.
  • Blackwell GPUs target AI factories, not just gaming or graphics.
  • Its platforms span gaming, data centers, robotics, automotive, and healthcare.

Benefits

Company Equity

401(k) Company Match

Growth & Insights and Company News

Headcount

6 month growth

0%

1 year growth

-3%

2 year growth

-3%

Mistral AI
May 28th, 2026
Mistral AI raises $1.9B at $13.2B valuation, led by ASML to advance frontier AI research

Mistral AI has raised €1.7 billion in a Series C funding round at an €11.7 billion post-money valuation. The round was led by semiconductor equipment manufacturer ASML Holding, with participation from existing investors including DST Global, Andreessen Horowitz, Bpifrance, General Catalyst, Index Ventures, Lightspeed and NVIDIA. The Paris-based AI company will use the funding to advance its scientific research and develop custom decentralised frontier AI solutions for complex engineering and industrial problems. ASML CEO Christophe Fouquet said the partnership aims to generate benefits for ASML customers through AI-enabled products and solutions. Mistral AI CEO Arthur Mensch stated the investment will help address engineering challenges in the semiconductor and AI value chain whilst maintaining the company's independence.

Decart
May 18th, 2026
Decart Raises $300M: Tech Leaders Back the Company as Both Customers and Investors

With funding led by Radical Ventures, Decart is building the infrastructure layer for the next generation of low-latency AI systems through three product lines: DOS, an ultra-optimized inference and training stack that enables agents and reasoning models to run smarter and faster; Lucy, its World Model for Immersive Experiences; and Oasis, its World Model for Physical AI. Both models are powered by DOS. Today, we’re also announcing DOS 2.0, with new versions of Lucy and Oasis launching in the coming weeks.

The Associated Press
Apr 15th, 2026
Matlantis integrates NVIDIA ALCHEMI Toolkit for 10x faster materials simulation

Matlantis has integrated NVIDIA's ALCHEMI Toolkit into its materials simulation platform to accelerate industrial materials discovery. The company previously incorporated NVIDIA Warp-optimised kernels, achieving up to 10x speed improvements in atomistic calculations. The integration includes LightPFP, Matlantis' lightweight potential for large-scale simulations, which uses a server-based architecture with NVIDIA ALCHEMI Toolkit-Ops to reduce communication bottlenecks. Matlantis plans to integrate its flagship Universal Machine-Learning Interatomic Potential with the toolkit to further enhance GPU efficiency. Launched in 2021, Matlantis is a cloud-based atomistic simulator jointly developed by PFN and ENEOS. The platform uses deep learning to increase simulation speeds by tens of thousands of times and serves over 150 companies discovering materials including catalysts, batteries and semiconductors.

CNBC
Apr 14th, 2026
Nvidia stock surges 18% on 10-day winning streak fuelled by $1T GPU orders through 2027

Nvidia shares have climbed 18% over a ten-day winning streak, the longest since 2023. The stock is trading about 8% below its October all-time high of $212.19. CEO Jensen Huang revealed at last month's GTC conference that Nvidia has over $1 trillion in GPU orders through 2027, including Blackwell and next-generation Vera Rubin chips. Data centre revenue surged 75% year-over-year and now comprises 88% of the business, a dramatic shift from five years ago when gaming dominated. The rally follows major deals including Meta's February commitment to deploy millions of Nvidia chips across its global data centres. On Monday, Nvidia denied rumours it was pursuing acquisitions of PC makers Dell or HP. The company also unveiled Ising, a new family of open-source models for quantum computing.

Yahoo Finance
Apr 14th, 2026
D-Wave CEO claims quantum computers could challenge Nvidia's AI dominance with superior power efficiency

D-Wave Quantum CEO Alan Baratz claims quantum computing poses a threat to Nvidia, citing superior energy efficiency. Speaking at the Semafor World Economy Summit, Baratz said D-Wave's quantum computer uses just 10 kilowatts of power—equivalent to five or 10 GPUs—whilst solving problems that would take GPU systems nearly a million years. D-Wave shares rose nearly 16% on Tuesday, part of a 140% gain over the past year. The company reported $2.75 million in Q4 revenue, missing analyst estimates, but bookings surged 471% to $13.4 million. The $5.3 billion company recently secured a $20 million agreement with Florida Atlantic University and acquired Quantum Circuits for $550 million. However, quantum machines remain specialised tools, unable to run large language models that drive Nvidia's dominance.