Full-Time

Network Observability Engineer

Cerebras

201-500 employees

Develops AI acceleration hardware and software

No salary listed

Expert

Sunnyvale, CA, USA

Category
Network Administration
IT & Security
Required Skills
Kubernetes
Python
Grafana
Docker
Prometheus
Requirements
  • Bachelor’s degree or higher in Electrical Engineering, Computer Engineering or Computer Science.
  • 10+ years of experience as a network engineer, technical support engineer, or system test engineer at a major switch vendor.
  • Must have a deep understanding of networking protocols and technologies, including TCP/IP, BGP, PFC, ECN, QoS, MLAG, ECMP, and VRF.
  • Proficient in Python automation and tool building.
  • Experience with network monitoring and analytics tools such as Prometheus and Grafana, and familiarity with gNMI, OpenConfig, OpenTelemetry, or New Relic.
  • Experience with containerization (e.g., Docker) and orchestration (e.g., Kubernetes).
  • Excellent problem-solving and analytical skills.
  • Strong communication and collaboration skills.
Responsibilities
  • Drive the network observability effort at Cerebras.
  • Design and implement a network observability platform that provides real-time visibility into network performance, anomalies, and configuration.
  • Develop network monitoring, analytics, and automation tools to ensure network reliability and performance.
  • Maintain network monitoring and alerting systems to ensure timely detection and resolution of network issues.
  • Analyze network data to identify trends, patterns, and anomalies, and provide recommendations for improvement.
  • Develop reporting capabilities and methods of procedure to support network maintenance and upgrades.
  • Collaborate with Network Operations teams to troubleshoot and resolve complex network issues.
  • Develop and implement network automation scripts and tools to streamline network configuration, provisioning, and troubleshooting.
  • Develop and maintain documentation on network observability platforms, tools, and processes.
  • Share knowledge and best practices with other teams and stakeholders to improve overall network reliability and performance.
  • Participate in on-call rotations to provide support for critical network issues.

Cerebras Systems specializes in accelerating artificial intelligence (AI) processes with its CS-2 system, which is recognized as the fastest AI accelerator available. This system replaces traditional clusters of graphics processing units (GPUs) used in AI computations, simplifying the complexities of parallel programming and cluster management. Cerebras serves a variety of clients, including major pharmaceutical companies and government research labs, providing them with faster results for critical applications like cancer drug response predictions. The company operates in the high-performance computing and AI markets, generating revenue through the sale of its CS-2 systems and associated software and cloud services. By offering a comprehensive solution that includes the largest processor in the industry, Cerebras aims to enhance the efficiency of AI training and inference, ultimately reducing the costs associated with AI research and development.

Company Size

201-500

Company Stage

Series F

Total Funding

$720M

Headquarters

Sunnyvale, California

Founded

2016

Simplify's Take

What believers are saying

  • Growing AI model efficiency demand aligns with Cerebras' energy-efficient accelerators.
  • AI democratization increases need for user-friendly systems like Cerebras' CS-2.
  • Pharmaceutical industry's push for faster drug discovery boosts demand for Cerebras' technology.

What critics are saying

  • Competition from NVIDIA and Graphcore could impact Cerebras' market share.
  • Rapid AI model evolution may necessitate frequent hardware updates, increasing R&D costs.
  • Supply chain vulnerabilities could delay production of Cerebras' hardware.

What makes Cerebras unique

  • Cerebras' Wafer-Scale Engine is the largest chip ever built for AI.
  • The CS-2 system replaces traditional GPU clusters, simplifying AI computations.
  • Cerebras serves diverse industries, including pharmaceuticals and government research labs.

Benefits

Professional Development Budget

Flexible Work Hours

Remote Work Options

401(k) Company Match

401(k) Retirement Plan

Mental Health Support

Wellness Program

Paid Sick Leave

Paid Holidays

Paid Vacation

Parental Leave

Family Planning Benefits

Fertility Treatment Support

Adoption Assistance

Childcare Support

Elder Care Support

Pet Insurance

Bereavement Leave

Employee Discounts

Company Social Events