
Full-Time

AI Infrastructure Test Engineer

Confirmed live in the last 24 hours

Cerebras

201-500 employees

Develops AI acceleration hardware and software

AI & Machine Learning
Financial Services
Healthcare

Senior

Sunnyvale, CA, USA

Category
QA & Testing
Quality Assurance
Required Skills
Python
Linux/Unix
Requirements
  • 6+ years of experience in software development, quality assurance, or system test of switches and routers at a networking equipment vendor.
  • Bachelor’s degree or higher in Electrical Engineering, Computer Engineering, Computer Science, or a related major.
  • Understanding of RDMA congestion control mechanisms on InfiniBand and RoCE networks.
  • Deep understanding of networking protocols such as BGP, PFC, ECN, QoS, MLAG, ECMP, and VRF.
  • Experience with computer system architecture, especially CPU SoC or platform architecture, interconnect fabrics, and memory subsystems.
  • Experience designing and implementing large switching and routing networks.
  • Strong technical abilities, problem-solving, design, coding, and debugging skills.
  • Expertise in Linux tools such as lspci, ping, traceroute, tcpdump, ifconfig, ip link, ip route, arp, /proc/net, /proc/sys/net, vmstat, netstat, ttcp, iperf, strace, memtest, fio, iozone, and iometer.
  • Must be proficient in Python (see the scripting sketch after this list).
  • Proficient with networking test tools such as Ixia and SmartBits.
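
The Linux-tool and Python requirements above typically go together. As a loose illustration only (not taken from the posting), here is a minimal sketch of the kind of Python wrapper a test engineer might write around a standard tool such as ping; the target addresses are placeholders.

```python
# Minimal sketch (hypothetical): driving a standard Linux tool from Python
# as part of an automated test. The target addresses are placeholders.
import re
import subprocess


def ping_loss(host: str, count: int = 4) -> float:
    """Run ping against a host and return the reported packet-loss percentage."""
    proc = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True, text=True, check=False,
    )
    match = re.search(r"(\d+(?:\.\d+)?)% packet loss", proc.stdout)
    # Treat missing output (host unreachable, tool absent) as total loss.
    return float(match.group(1)) if match else 100.0


if __name__ == "__main__":
    for target in ["10.0.0.1", "10.0.0.2"]:  # placeholder test-bed addresses
        loss = ping_loss(target)
        print(f"{target}: {loss:.1f}% packet loss "
              f"[{'OK' if loss == 0.0 else 'DEGRADED'}]")
```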
Responsibilities
  • Identify experiments, tools, and methodology to test complex AI infrastructure equipment, including switches, routers, servers, NICs, and transceivers, that pushes the frontier of hardware design and system integration.
  • Work with equipment vendors to evaluate the performance of newly introduced hardware and to resolve defects.
  • Design and set up test labs and test beds to exercise and evaluate vendor equipment from Arista, Juniper, Cisco, Dell, and HPE.
  • Work with architects and software engineers to create test cases, write test scripts, execute tests, and document the results of evaluating solutions from different vendors.
  • Troubleshoot, isolate, and drive issues to resolution through partnerships with other teams and vendors.
  • Provide efficient network design solutions for AI infrastructure.
  • Design, install, configure, and maintain complex networks for AI infrastructure.
  • Build and optimize server system benchmarks based on a deep understanding of server system architecture and workload characterization (a minimal benchmark-automation sketch follows this list).
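
For the benchmark-automation responsibility above, a minimal sketch of collecting a throughput number programmatically. Assumptions, not from the posting: iperf3 is installed on the client, an iperf3 server is already running on the target (`iperf3 -s`), and the address and duration are placeholders.

```python
# Minimal sketch (hypothetical): automating a throughput measurement using
# iperf3's JSON output. Assumes `iperf3 -s` is already running on the target.
import json
import subprocess


def iperf3_gbps(server: str, seconds: int = 10) -> float:
    """Run an iperf3 TCP test and return received throughput in Gbit/s."""
    proc = subprocess.run(
        ["iperf3", "-c", server, "-t", str(seconds), "-J"],
        capture_output=True, text=True, check=True,
    )
    report = json.loads(proc.stdout)
    return report["end"]["sum_received"]["bits_per_second"] / 1e9


if __name__ == "__main__":
    gbps = iperf3_gbps("10.0.0.1")  # placeholder test-bed server address
    print(f"Measured throughput: {gbps:.2f} Gbit/s")
```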

Cerebras Systems focuses on accelerating artificial intelligence (AI) with its CS-2 system, the fastest AI accelerator on the market. This system replaces traditional GPU clusters, simplifying the process of AI computations for clients in various industries, including pharmaceuticals and government research. By providing proprietary hardware and software solutions, Cerebras enables faster AI training and lower latency, which helps reduce costs in AI research and development. The company's goal is to make AI tasks more efficient and accessible across different sectors.

Company Stage

Series F

Total Funding

$720M

Headquarters

Sunnyvale, California

Founded

2016

Growth & Insights

Headcount
  • 6 month growth: 10%
  • 1 year growth: 19%
  • 2 year growth: -3%

Simplify's Take

What believers are saying

  • Cerebras' IPO and significant funding, including $720 million raised, position it for substantial growth and market penetration.
  • Collaborations with industry giants and government labs, such as GlaxoSmithKline, AstraZeneca, and Argonne National Lab, validate the effectiveness and demand for Cerebras' technology.
  • The CS-2 system's ability to produce faster results in critical applications like cancer drug response prediction models highlights its transformative potential in healthcare and scientific research.

What critics are saying

  • Competing against established giants like Nvidia poses significant market challenges and could impact Cerebras' market share.
  • The high cost and complexity of developing and maintaining cutting-edge hardware like the WSE-3 chip could strain resources and affect profitability.

What makes Cerebras unique

  • Cerebras' CS-2 system replaces traditional GPU clusters, eliminating complexities in parallel programming and distributed training.
  • The WSE-3 chip, with 4 trillion transistors, is designed to train AI models 10 times larger than current top models like GPT-4, setting a new industry standard.
  • Strategic partnerships with major entities like Dell and Aleph Alpha enhance Cerebras' reach and influence in the AI and high-performance computing markets.
