Full-Time

Site Reliability Engineer

Infrastructure, Analytics Platform

OpenAI

5,001-10,000 employees

Develops safe AI models and tools

No salary listed

San Francisco, CA, USA

In Person

Category
DevOps & Infrastructure
Required Skills
Kubernetes
Apache Kafka
ClickHouse
Terraform
DevOps
Snowflake
Requirements
  • A track record of owning production infrastructure for data-heavy, low-latency systems end to end.
  • Strong hands-on experience operating ClickHouse, Kafka, and adjacent large-scale data systems.
  • Practical experience with Snowflake workflows and cross-system data architecture.
  • The ability to independently define operational standards (runbooks, incident process, rollout safety) and make them stick.
  • Strong operational experience with Kubernetes, Terraform, and cloud infrastructure.
  • Excellent communication and collaboration skills; you work effectively across engineering and research teams.
  • High personal rigor and organization in high-pressure production environments.
  • A deeply hands-on mindset: willing to debug incidents, tune systems, and implement fixes directly.
Responsibilities
  • Own infrastructure lifecycle management across provisioning, upgrades, scaling, and decommissioning (IaC-first).
  • Operate and scale ClickHouse clusters, including sharding, replication, capacity planning, performance tuning, and maintenance.
  • Operate Kafka as the ingestion backbone, improving throughput, lag, backpressure handling, and failure recovery.
  • Improve end-to-end latency and reliability for data-heavy serving and query workloads.
  • Build and maintain strong monitoring and alerting: SLIs/SLOs, dashboards, alert policies, and actionable runbooks.
  • Define, implement, and continuously improve incident response standards, on-call practices, and postmortem quality.
  • Own backup/restore and disaster recovery strategy, including regular recovery drills.
  • Plan and execute safe rollouts across multiple environments (dev/stage/prod), including canary and rollback strategies.
  • Partner day to day with software engineers, embedding reliability into design, implementation, and release processes.
  • Set the quality bar for operational readiness and runbook standards, and drive adoption across teams.
  • Improve CI/CD pipelines and DevEx for faster, safer, and more predictable releases.
  • Strengthen security posture across infrastructure and delivery systems (least privilege, secrets management, patching, supply-chain controls).

OpenAI conducts AI research and deployment to build advanced AI models and tools that help people automate tasks, be more creative, and make better decisions. Its products include ChatGPT, a conversational AI that can write, code, tutor, and assist in interactive tasks, and Sora, which can generate videos from text prompts. OpenAI’s models typically run through cloud-based services and subscriptions, with licensing and partnerships for broader use. The company operates a capped-profit model to balance generating revenue with ensuring safety, ethics, and long-term societal benefits. Its approach emphasizes safety, responsible deployment, and collaboration with researchers, governments, and institutions. The goal is to ensure artificial general intelligence, when it arrives, benefits all of humanity and minimizes risks.

Company Size

5,001-10,000

Company Stage

Late Stage VC

Total Funding

$196B

Headquarters

San Francisco, California

Founded

2015

Simplify's Take

What believers are saying

  • SoftBank pledges $30B more in 2026, valuing OpenAI at $850B with $25B annualized revenue.
  • University of Michigan's $20M investment yields $2B, signaling strong early investor returns.
  • TPG joint venture with billions in PE funding boosts corporate AI adoption rates.

What critics are saying

  • Microsoft divests after $100B spend by June 2026, funding Anthropic and xAI instead.
  • Doubled GPT-5.5 API prices to $5/$30 per million tokens slash 5% subscriber conversions.
  • Europe bans GPT-5.5-Cyber for non-partners post-Mythos, blocking Trusted Access expansion.

What makes OpenAI unique

  • Trusted Access for Cyber grants vetted users GPT-5.4-Cyber for binary reverse engineering.
  • Daybreak platform automates threat prioritization, patching, and fix verification in software development.
  • Deployment Company with 150 Tomoro engineers accelerates enterprise AI integration via partnerships.

Benefits

Health insurance

Dental and vision insurance

Flexible spending account for healthcare and dependent care

Mental healthcare service

Fertility treatment coverage

401(k) with generous matching

20-week paid parental leave

Life insurance (complimentary)

AD&D insurance (complimentary)

Short-term/long-term disability insurance (complimentary)

Optional buy-up life insurance

Flexible work hours and unlimited paid time off (we encourage 4+ weeks per year)

Annual learning & development stipend

Regular team happy hours and outings

Daily catered lunch and dinner

Travel to domestic conferences

Growth & Insights and Company News

Headcount

6 month growth

-2%

1 year growth

3%

2 year growth

2%

Daring Fireball
May 8th, 2026
Y Combinator’s Stake in OpenAI

The fact that Paul Graham personally has billions of dollars at stake with OpenAI doesn’t mean that his public opinion on Sam Altman’s trustworthiness and leadership is invalid. But it certainly seems like the sort of thing that ought to be disclosed when quoting Graham as an Altman character reference.

Bloomberg L.P.
Apr 21st, 2026
OpenAI launches ChatGPT Images 2.0 with improved chart and diagram creation

OpenAI is releasing ChatGPT Images 2.0, updated AI image-generation software designed to create accurate charts and scientific diagrams. The company aims to make its technology more appealing to professionals. Rolling out Tuesday through ChatGPT and the Codex AI coding assistant, the new model improves instruction-following and detail incorporation when generating images. It can produce visuals across multiple styles and render text in various languages. The update represents OpenAI's effort to expand its AI capabilities beyond general use cases into professional applications requiring technical precision and accuracy.

Bloomberg L.P.
Apr 17th, 2026
OpenAI loses head of science initiatives and Sora AI video team leader

OpenAI's head of science initiatives and the leader of its Sora AI video team are leaving the company, adding to recent executive departures as the firm reorganises its product portfolio. The exits continue a pattern of senior leadership changes at the artificial intelligence company.

Bloomberg L.P.
Apr 16th, 2026
OpenAI unveils GPT-5.4 to tackle enterprise trust and governance concerns

OpenAI is addressing enterprise adoption challenges with GPT-5.4 "Cyber", focusing on security, trust and governance issues. Erica Brescia, managing director at Redpoint Ventures and OpenAI backer, discussed the development, emphasising that the AI cyber race centres on governance rather than purely technological advancement. The move represents OpenAI's effort to overcome barriers preventing widespread enterprise adoption of its AI systems by prioritising security features in its latest model release.

Bloomberg L.P.
Apr 16th, 2026
OpenAI launches GPT-Rosalind AI model for drug discovery to rival Google

OpenAI has launched GPT-Rosalind, an AI model designed to accelerate drug discovery and life sciences research. The model aims to extract insights from large datasets and help translate scientific studies into healthcare applications. Initially available as a research preview to select business customers, GPT-Rosalind's early users include pharmaceutical company Amgen, vaccine maker Moderna and bioscience research nonprofit the Allen Institute. The launch positions OpenAI alongside other technology companies entering the drug discovery field, as the industry seeks to demonstrate AI's potential for scientific breakthroughs. The ChatGPT maker announced the model's release on Thursday.