Full-Time

AI Infrastructure Test Engineer

Posted on 7/29/2024

Cerebras

201-500 employees

Develops AI acceleration hardware and software

Data & Analytics
Enterprise Software
AI & Machine Learning

Senior

Sunnyvale, CA, USA

Category
QA & Testing
Quality Assurance
Required Skills
Python
Linux/Unix
Requirements
  • 6+ years of experience in Software Development, Quality Assurance, or System Test of switches and routers at a networking equipment vendor.
  • Bachelor’s degree or higher in Electrical Engineering, Computer Engineering, Computer Science, or related majors.
  • Understanding of RDMA congestion control mechanisms on InfiniBand and RoCE Networks.
  • Must have a deep understanding of networking protocols: BGP, PFC, ECN, QoS, MLAG, ECMP, and VRF.
  • Experience with computer system architecture, especially on CPU SoC or Platform Architecture, Interconnect Fabric, and Memory sub-system.
  • Experience designing and implementing large switching and routing networks.
  • Strong technical abilities, problem-solving, design, coding, and debugging skills.
  • Expertise in Linux tools such as lspci, ping, traceroute, tcpdump, ifconfig, ip link, ip route, arp, /proc/net, /proc/sys/net, vmstat, netstat, ttcp, iperf, strace, memtest, fio, ozone, and iometer.
  • Must be proficient in Python.
  • Proficient in networking test tools such as IXIA and SmartBits.
Responsibilities
  • Identify experiments, tools, and methodology to test complex AI infrastructure equipment, including switches, routers, servers, NICs, and transceivers that push the frontier in hardware design and system integration.
  • Work with equipment vendors to evaluate the performance of newly introduced hardware and to resolve defects.
  • Design and set up test labs and test beds to exercise and evaluate vendor equipment from Arista, Juniper, Cisco, Dell, and HPE.
  • Work with architects and software engineers to create test cases, write test scripts, execute tests, and document the results of evaluating solutions from different vendors.
  • Troubleshoot, isolate, and drive issues to resolution through partnerships with other teams and vendors.
  • Provide solutions for efficient networking design for AI infrastructure.
  • Design, install, configure, and maintain complex networks for AI infrastructure.
  • Build and optimize server system benchmarks based on a deep understanding of server system architecture and workload characterization.

Cerebras Systems specializes in accelerating artificial intelligence (AI) processes with its CS-2 system, which is designed to replace traditional clusters of graphics processing units (GPUs). The CS-2 system simplifies AI computations by eliminating the need for complex parallel programming and cluster management, making the process more efficient. Cerebras serves a variety of clients, including major pharmaceutical companies and government research labs, providing them with faster results for critical applications like cancer drug response predictions. The company generates revenue by selling its proprietary hardware and software solutions, including the CS-2 system and related cloud services. Cerebras aims to enhance the speed and efficiency of AI training and inference, ultimately reducing the costs associated with AI research and development.

Company Stage

Series F

Total Funding

$700.4M

Headquarters

Sunnyvale, California

Founded

2016

Growth & Insights
Headcount

6 month growth

0%

1 year growth

-5%

2 year growth

-10%

Simplify's Take

What believers are saying

  • Growing AI model efficiency demand aligns with Cerebras' energy-efficient accelerators.
  • AI democratization increases need for user-friendly systems like Cerebras' CS-2.
  • Pharmaceutical industry's push for faster drug discovery boosts demand for Cerebras' technology.

What critics are saying

  • Competition from NVIDIA and Graphcore could impact Cerebras' market share.
  • Rapid AI model evolution may necessitate frequent hardware updates, increasing R&D costs.
  • Supply chain vulnerabilities could delay production of Cerebras' hardware.

What makes Cerebras unique

  • Cerebras' Wafer-Scale Engine is the largest chip ever built for AI.
  • The CS-2 system replaces traditional GPU clusters, simplifying AI computations.
  • Cerebras serves diverse industries, including pharmaceuticals and government research labs.

Benefits

Professional Development Budget

Flexible Work Hours

Remote Work Options

401(k) Company Match

401(k) Retirement Plan

Mental Health Support

Wellness Program

Paid Sick Leave

Paid Holidays

Paid Vacation

Parental Leave

Family Planning Benefits

Fertility Treatment Support

Adoption Assistance

Childcare Support

Elder Care Support

Pet Insurance

Bereavement Leave

Employee Discounts

Company Social Events

INACTIVE