Full-Time

Cluster Deployment Operations Engineer

Cerebras

201-500 employees

Develops AI acceleration hardware and software

Hardware
AI & Machine Learning

Senior

Toronto, ON, Canada; Sunnyvale, CA, USA

Category
Robotics & Autonomous Systems
AI & Machine Learning
Required Skills
Python
Go
Linux/Unix
Requirements
  • Proficiency in scripting and practical coding, particularly in Shell and Python (Go is a plus).
  • Strong experience troubleshooting, analyzing, and administering large-scale, distributed systems.
  • 5+ years of experience in data center operations and Linux system administration.
  • Knowledge and hands-on experience with network configuration and operations.
  • Expertise in hardware operations including networking components (e.g., cabling, switches, routers).
Responsibilities
  • Plan and execute cluster deployments, from small-scale to massive distributed systems.
  • Manage hands-on aspects of the deployments, coordinating with data center staff for hardware configurations and necessary maintenance.
  • Troubleshoot networking issues (e.g., BGP, cluster creation failures, cabling errors) and hardware issues (e.g., components that are dead on arrival).
  • Monitor and maintain systems to ensure uptime, performance, and reliability.
  • Collaborate with cross-functional teams including hardware vendors, data center operations, and network engineers to manage the entire lifecycle of deployment.
  • Ensure comprehensive documentation is created and maintained for deployments, configurations, and operational processes.
  • Develop tools, scripts, or playbooks to automate routine tasks and deployment processes.

Cerebras Systems specializes in accelerating artificial intelligence (AI) processes with its CS-2 system, which is designed to replace traditional clusters of graphics processing units (GPUs) used in AI computations. The CS-2 system simplifies AI tasks by eliminating the need for complex parallel programming and cluster management, making the process more efficient. Cerebras serves a variety of clients, including major pharmaceutical companies and government research labs, providing them with faster results for critical applications like drug response predictions. The company operates in the high-performance computing and AI markets, generating revenue through the sale of its proprietary hardware and software solutions, including the CS-2 system and associated cloud services. Cerebras aims to reduce the overall cost of AI research and development while enabling clients to achieve quicker results and lower latency in AI inference.

Company Stage

N/A

Total Funding

$700.4M

Headquarters

Sunnyvale, California

Founded

2016

Growth & Insights
Headcount growth
6 months: 8%
1 year: 16%
2 years: -3%

Simplify's Take

What believers are saying

  • Cerebras' IPO and significant funding, including $720 million raised, position it for substantial growth and market penetration.
  • Collaborations with industry giants and government labs, such as GlaxoSmithKline, AstraZeneca, and Argonne National Lab, validate the effectiveness and demand for Cerebras' technology.
  • The CS-2 system's ability to produce faster results in critical applications like cancer drug response prediction models highlights its transformative potential in healthcare and scientific research.

What critics are saying

  • Competing against established giants like Nvidia poses significant market challenges and could impact Cerebras' market share.
  • The high cost and complexity of developing and maintaining cutting-edge hardware like the WSE-3 chip could strain resources and affect profitability.

What makes Cerebras unique

  • Cerebras' CS-2 system replaces traditional GPU clusters, eliminating complexities in parallel programming and distributed training.
  • The WSE-3 chip, with 4 trillion transistors, is designed to train AI models up to 10 times larger than current top models like GPT-4.
  • Strategic partnerships with major entities like Dell and Aleph Alpha enhance Cerebras' reach and influence in the AI and high-performance computing markets.
