Full-Time

Staff Infrastructure Engineer

AI Scientist Team

Anthropic

1,001-5,000 employees

Develops reliable and interpretable AI systems

Compensation Overview

$315k - $560k/yr

Senior, Expert

H1B Sponsorship Available

San Francisco, CA, USA

Category
DevOps & Infrastructure
Platform Engineering
Cloud Engineering
Required Skills
Kubernetes
Agile
Docker
Reinforcement Learning
Requirements
  • Familiarity with performance optimization
  • Familiarity with distributed systems
  • Familiarity with VM/sandboxing/container deployment
  • Familiarity with large-scale data pipelines
  • 3+ years of highly relevant experience in infrastructure engineering
  • Demonstrated expertise in large-scale distributed systems
  • Deep knowledge of performance optimization techniques and system architectures for high-throughput ML workloads
  • Experience with containerization technologies (Docker, Kubernetes) and orchestration at scale
  • Proven track record of building large-scale data pipelines and distributed storage systems
  • Ability to diagnose and resolve complex infrastructure challenges in production environments
  • Ability to work effectively across the full ML stack from data pipelines to performance optimization
  • Experience collaborating with other researchers to scale experimental ideas
Responsibilities
  • Design and implement large-scale infrastructure systems to support AI scientist training, evaluation, and deployment across distributed environments
  • Identify and resolve infrastructure bottlenecks impeding progress toward scientific capabilities
  • Develop robust and reliable evaluation frameworks for measuring progress towards scientific AGI
  • Build scalable and performant VM/sandboxing/container architectures to safely execute long-horizon AI tasks and scientific workflows
  • Collaborate to translate experimental requirements into production-ready infrastructure
  • Develop large-scale data pipelines to handle advanced language model training requirements
  • Optimize large-scale training and inference pipelines for stable and efficient reinforcement learning
Desired Qualifications
  • Experience with language model training infrastructure and distributed ML frameworks (PyTorch, JAX, etc.)
  • Background in building infrastructure for AI research labs or large-scale ML organizations
  • Knowledge of GPU/TPU architectures and language model inference optimization
  • Experience with cloud platforms (AWS, GCP) at enterprise scale
  • Familiarity with VM and container orchestration
  • Experience with workflow orchestration tools and experiment management systems
  • History of working with large-scale reinforcement learning
  • Comfort with large-scale data pipelines (Beam, Spark, Dask, …)

Anthropic focuses on creating reliable and interpretable AI systems. Its main product, Claude, is an AI assistant that can perform a variety of tasks for clients across different industries. Claude uses advanced techniques in natural language processing, reinforcement learning, and human feedback to understand and respond to user needs effectively. What sets Anthropic apart from its competitors is its emphasis on making AI systems that are not only powerful but also easy to understand and control. The company's goal is to enhance operational efficiency and decision-making for its clients by providing AI-driven solutions that can be tailored to specific requirements.

Company Size

1,001-5,000

Company Stage

Debt Financing

Total Funding

$19.3B

Headquarters

San Francisco, California

Founded

2021

Simplify's Take

What believers are saying

  • Growing demand for AI safety tools aligns with Anthropic's mission.
  • Claude Gov targets the expanding AI-driven national security market.
  • Opportunities exist to enhance AI infrastructure through cloud partnerships.

What critics are saying

  • Claude faces competition from ChatGPT's market dominance.
  • Reliance on third-party cloud services poses potential vulnerabilities.
  • Reddit's lawsuit could lead to legal and financial challenges.

What makes Anthropic unique

  • Anthropic focuses on AI safety, transparency, and alignment with human values.
  • Claude is designed to handle tasks at any scale across various industries.
  • Anthropic emphasizes reliable, interpretable, and steerable AI systems.

Benefits

Flexible Work Hours

Paid Vacation

Parental Leave

Hybrid Work Options

Company Equity

Growth & Insights and Company News

Headcount

6 month growth

-5%

1 year growth

5%

2 year growth

0%

VentureBeat
Jun 12th, 2025
Cloud Collapse: Replit and LlamaIndex Knocked Offline by Google Cloud Identity Outage

Days after OpenAI and Google Cloud announced a partnership to support the growing use of generative AI platforms, much of the AI-powered web and tools went down due to an outage of the leading cloud providers. Google Cloud Service Platform (GCP) and some Cloudflare services began experiencing issues around 10:00 a.m. PT today, affecting several AI development tools and data storage services, including ChatGPT and Claude, as well as a variety of other AI platforms. Google Cloud (@googlecloud) posted on June 12, 2025: “We are aware of a service disruption to some Google Cloud services and we are working hard to get you back up and running ASAP. Please view our status dashboard for the latest updates: https://t.co/sT6UxoRK4R” A GCP spokesperson confirmed the outage to VentureBeat, urging users to check its public status dashboard. GCP said affected services include API Gateway, Agent Assist, Cloud Data Fusion, Contact Center AI Platform, Google App Engine, Google BigQuery, Google Cloud Storage, Identity Platform, Speech-to-Text, Text-to-Speech and Vertex AI Search, among other tools. Google’s mobile development platform, Firebase, also went down. VentureBeat staffers had trouble accessing Google Meet, but other Google services on Workspace remained online. A Cloudflare spokesperson told VentureBeat only “a limited number of services at Cloudflare use Google Cloud and were impacted. We expect them to come back shortly

Decrypt
Jun 11th, 2025
ChatGPT Is Eating the Internet: OpenAI Commands 80% of AI Market

In brief:
  • ChatGPT attracts more traffic than the next nine AI tools combined, with 5.5 billion visits crushing Gemini and Claude.
  • Chinese startup DeepSeek exploded from 33.7 million to 436 million monthly visits in just four months, operating at 1/30th the cost of Western models.
  • Traditional sectors are hemorrhaging users: freelance platforms down 14%, educational sites down 19%, and Chegg collapsed 64% as students turn to AI.

The latest traffic data from Similarweb reveals an uncomfortable truth about the generative AI market: While the world debates which AI model is technically superior, ChatGPT has already won the user adoption war by a margin so vast it defies conventional competition metrics. With 5.5 billion visits in May 2025, ChatGPT commands roughly 80% of all generative AI traffic. To grasp this scale: ChatGPT gets more visits than Google's Gemini, DeepSeek, Grok, Perplexity, and Claude combined. Then doubled. Then add another few millions for good measure. ChatGPT surged past 500 million weekly active users in late March, and ChatGPT's mobile app already averaged more than 250 million monthly active users between September and November 2024. OpenAI's partnership with Microsoft has certainly helped, but the scale suggests something more fundamental: ChatGPT has become the default AI assistant for hundreds of millions of users worldwide. What's particularly interesting is ChatGPT's resilience

PYMNTS
Jun 10th, 2025
Report: OpenAI Taps Google’s Cloud Service for Extra Computing Capacity

The companies had been discussing a deal for months but had been unable to sign it until an agreement that made Microsoft OpenAI’s exclusive data center infrastructure provider ended in January, Reuters reported Tuesday (June 10), citing unnamed sources. OpenAI and Google finalized their deal in May, according to the report. Neither OpenAI nor Google immediately replied to PYMNTS’ request for comment.

UBOS
Jun 6th, 2025
Anthropic Strengthens Its Governing Trust with National Security Expertise

In a significant development within the AI industry, Anthropic has appointed Richard Fontaine, a renowned national security expert, to its Long-Term Benefit Trust.

HotHardware
Jun 6th, 2025
Reddit Slams AI Firm Anthropic In Privacy Lawsuit Alleging Mass Data Scraping

The lawsuit alleges that Anthropic's actions amounted to a breach of Reddit's rules and regulations.