Full-Time

HPC Network Engineer

Confirmed live in the last 24 hours

CoreWeave

201-500 employees

Cloud service for GPU-accelerated workloads

Data & Analytics
Hardware
Enterprise Software
AI & Machine Learning

Compensation Overview

$160k - $210k Annually

Mid, Senior

Livingston, NJ, USA + 2 more

Category
DevOps & Infrastructure
Network Engineering
Required Skills
Linux/Unix
Requirements
  • Proficient in InfiniBand configuration and management.
  • Solid understanding of network architectures, topologies, best practices, and techniques for high performance and availability.
  • Familiarity with optical networking hardware.
  • Experience in Linux system administration.
  • Proficiency in at least one scripting language.
  • Team player with effective collaboration skills.
  • Ability to manage multiple tasks and projects concurrently.
Responsibilities
  • Monitoring: Monitor the performance and overall health of InfiniBand fabrics, including network switches, host adapters, and nodes. This entails using existing monitoring tools and potentially developing new ones to ensure comprehensive visibility and timely detection of issues or abnormalities.
  • Troubleshooting: Investigate and resolve issues that arise within InfiniBand fabrics. This involves diagnosing network connectivity problems, identifying and resolving performance bottlenecks, and addressing errors or failures in fabric components.
  • Support: Assist and collaborate with other teams that manage and operate HPC clusters using InfiniBand technology. This includes offering expertise, guidance, and troubleshooting support to keep the clusters running smoothly and performing optimally.
  • Deployment and Bring-up: Help install large fabrics, coordinating with teams, onsite personnel, and customers to take fabrics from day 0 to fully operational.
  • Operations/Configuration: Work with configuration tooling and operations teams to carry out maintenance and upgrades of the switches and the fabric control plane.

CoreWeave offers cloud computing services that specialize in GPU-accelerated workloads, which are crucial for tasks like artificial intelligence, machine learning, and visual effects rendering. Their infrastructure is built on a bare metal serverless Kubernetes platform, allowing clients to access powerful computing resources without the need for expensive hardware. CoreWeave's pay-as-you-go pricing model provides flexibility and scalability, making it appealing to tech companies, film studios, and enterprises. The company's goal is to deliver high-performance computing resources that meet the increasing demands of various industries.

Company Stage

Series B

Total Funding

$12B

Headquarters

New York City, New York

Founded

2017

Growth & Insights
Headcount

6 month growth

62%

1 year growth

134%

2 year growth

719%

Simplify's Take

What believers are saying

  • Securing $1.1 billion in funding positions CoreWeave for aggressive growth and innovation in the AI and HPC sectors.
  • The appointment of former AWS executive Chetan Kapoor as Chief Product Officer brings valuable expertise and leadership to drive product strategy during a hypergrowth phase.
  • CoreWeave's $2.2 billion investment in European data centers demonstrates their commitment to expanding global reach and meeting surging demand for AI infrastructure.

What critics are saying

  • The competitive landscape with giants like AWS launching high-core instances could pressure CoreWeave to continuously innovate to maintain its edge.
  • Rapid expansion, including significant investments in new data centers, could strain resources and operational capabilities.

What makes CoreWeave unique

  • CoreWeave specializes in GPU-accelerated workloads, setting it apart from general cloud service providers like AWS and Azure.
  • Their fully managed, bare metal serverless Kubernetes platform offers high performance with reduced operational burden, a unique selling point in the cloud computing market.
  • CoreWeave's strategic partnerships, such as with Bloom Energy for on-site power generation, enhance their infrastructure's reliability and sustainability.