Full-Time

Operations Engineering Manager

Fleet Reliability

CoreWeave

501-1,000 employees

Cloud service for GPU-accelerated workloads

Compensation Overview

$210k - $230k/yr

Senior

Livingston, NJ, USA

More locations: Plano, TX, USA | New York, NY, USA | Bellevue, WA, USA | Sunnyvale, CA, USA

Candidates not living within 30 miles of an office may be considered for remote work, but onboarding will require attendance at one of the hubs within the first month of employment.

Category
DevOps Engineering Management
Engineering Management
Requirements
  • Seven or more years of experience in software or infrastructure engineering
  • At least two years in a leadership capacity
  • Knowledge and practice of SRE fundamentals: incident management, blameless culture, observability, and change management
Responsibilities
  • Build and lead a 24/7 team of process-oriented, reliability- and observability-focused engineers
  • Lead the socialization and documentation of clear and consistent processes for provisioning, validating and troubleshooting nodes in our server fleet
  • Think critically about and advocate for process and automation improvements, prioritizing event-driven automated remediation as the end goal
  • Provide a 24/7 engineering support function for high-criticality, time-sensitive node delivery and maintenance
  • Drive and improve our program of onboarding, documentation, enablement, and performance management to help your team members achieve new heights of personal growth and capability
  • Drive the culture and tone for how your team keeps score both in how they communicate with and support each other and how they enable the rest of CoreWeave

CoreWeave provides cloud computing services that focus on GPU-accelerated workloads, which are essential for tasks requiring high computational power like Generative AI, Machine Learning, and Visual Effects rendering. Their services allow clients to access powerful computing resources without needing to invest in expensive hardware, operating on a pay-as-you-go basis. CoreWeave's infrastructure is built on a bare metal serverless Kubernetes platform, which enhances performance while minimizing operational complexity for clients. This setup is particularly beneficial for tech companies, film studios, and enterprises that need efficient data processing solutions. Unlike many competitors, CoreWeave offers a wide selection of NVIDIA GPUs, enabling clients to optimize performance and costs based on their specific needs. The company's goal is to provide scalable and efficient computing resources that adapt to the growing demands of various industries.

Company Size

501-1,000

Company Stage

IPO

Headquarters

New York City, New York

Founded

2017

Simplify's Take

What believers are saying

  • CoreWeave's rapid deployment of NVIDIA systems positions them as leaders in AI cloud solutions.
  • Collaboration with Cohere, IBM, and Mistral AI advances AI model development and deployment.
  • CoreWeave's record-breaking AI inferencing benchmarks demonstrate their high-performance infrastructure capabilities.

What critics are saying

  • Emerging competition from Nscale could challenge CoreWeave's market position.
  • Nscale's $2.7 billion investment may lead to increased pricing pressure on CoreWeave.
  • Rapid expansion by competitors could reduce CoreWeave's market share.

What makes CoreWeave unique

  • CoreWeave specializes in GPU-accelerated workloads for AI, ML, and VFX rendering.
  • Their infrastructure uses NVIDIA GB200 NVL72 systems, setting industry benchmarks in AI inference.
  • CoreWeave's bare metal serverless Kubernetes platform offers high performance with reduced operational burden.

Benefits

Health Insurance

Dental Insurance

Vision Insurance

Life Insurance

Disability Insurance

Health Savings Account/Flexible Spending Account

Tuition Reimbursement

Mental Health Support

Family Planning Benefits

Paid Parental Leave

Hybrid Work Options

401(k) Company Match

Unlimited Paid Time Off

Catered lunch each day in our office and data center locations

A casual work environment

Growth & Insights and Company News

Headcount

6 month growth

-1%

1 year growth

4%

2 year growth

5%

PYMNTS
Apr 28th, 2025
Report: Nscale Aims To Raise $2.7 Billion To Build AI Infrastructure

London-based artificial intelligence infrastructure startup Nscale reportedly aims to raise $2.7 billion to build data centers around the globe that use Nvidia chips and rent the centers to companies that are training and operating AI models. Nscale is working to raise a $1.8 billion private credit deal and $900 million in preferred equity and convertible shares, Bloomberg reported Monday (April 28), citing an offering document. An Nscale spokesperson said in the report: “We recognize that increasing demand for AI and keen interest in our rapidly evolving industry is generating a lot of attention for Nscale, but we do not comment on speculation.”

PR Newswire
Apr 25th, 2025
CoreWeave Announces Date Of First Quarter 2025 Financial Results

LIVINGSTON, N.J., April 25, 2025 /PRNewswire/ -- CoreWeave, Inc. (Nasdaq: CRWV), the AI Hyperscaler™, announced that it will release first quarter 2025 financial results after the market closes on Wednesday, May 14, 2025. CoreWeave will also host a conference call to discuss its results at 2:00 pm Pacific Time / 5:00 pm Eastern Time. The live webcast of the earnings conference call can be accessed at the CoreWeave Investor Relations website at investors.coreweave.com. A replay of the webcast will be available at the same website. About CoreWeave, Inc.: CoreWeave, the AI Hyperscaler™, delivers a cloud platform of cutting-edge software powering the next wave of AI. The company's technology provides enterprises and leading AI labs with cloud solutions for accelerated computing. Since 2017, CoreWeave has operated a growing footprint of data centers across the US and Europe

Hipther
Apr 23rd, 2025
Galaxy Announces Commitment with CoreWeave to Host Additional Artificial Intelligence and High-Performance Computing Infrastructure at Helios Data Center Campus

Galaxy expects Phase I of its agreement with CoreWeave to be ready for service in the first half of 2026.

ROI-NJ
Apr 18th, 2025
CoreWeave Announces First Customers For Nvidia Rack-Scale Systems

CoreWeave, which provides cloud software to power AI, said Tuesday that Cohere, IBM and Mistral AI were the first customers to gain access to NVIDIA GB200 NVL72 rack-scale systems and CoreWeave’s portfolio of cloud services. The combination of these services is intended to advance AI model development and deployment.

Market News 24
Apr 15th, 2025
Thousands of NVIDIA Grace Blackwell GPUs Launch at CoreWeave, Boosting AI Development for Innovators and Tech Pioneers

CoreWeave has launched a cutting-edge cloud service featuring NVIDIA's GB200 NVL72 systems, making it one of the first providers to offer these advanced technologies at scale.