Full-Time

Software Development Engineer in Test

Cloud

Posted on 5/11/2026

Cerebras

501-1,000 employees

AI accelerator hardware replacing GPUs

No salary listed

Bengaluru, Karnataka, India

In Person

Category
QA & Testing
Required Skills
LLM
Kubernetes
Python
Computer Networking
Docker
AWS
Go
C/C++
Requirements
  • Five+ years of experience in quality engineering, test engineering, or a closely related role, with substantial individual contributor experience on large-scale distributed systems or cloud infrastructure.
  • Deep cloud platform experience, preferably AWS, including networking, compute orchestration, container platforms, and multi-region production services.
  • Track record of building scalable test infrastructure, including frameworks, harnesses, environments, and automation that scale with the system under test rather than fighting it.
  • Strong systems debugging and reasoning, with ability to take an unfamiliar failure and follow it through layers of the stack to a root cause.
  • Strong proficiency in at least one backend language, such as Python, Go, or C++, sufficient to read production code, write production-grade tests, and contribute infrastructure code directly.
  • Excellent written and asynchronous communication, with ability to operate effectively across time zones and in environments where most decisions are made in writing.
  • Self-direction under ambiguity, able to frame problems, make trade-off decisions, and push back when quality is at risk without waiting to be asked.
Responsibilities
  • Release Quality Ownership: Drive weekly cloud release qualification end to end, reading every PR in the release branch first-hand, understanding what changed, deciding where the risk lies, designing qualification that exercises the actual risk, and acting as the final voice before a release ships.
  • Test Infrastructure at Scale: Build and evolve the test infrastructure for the Inference Cloud platform, spanning functional, integration, performance, and fault testing, and plan for 20x growth in coverage, environments, and traffic so the setup can handle tomorrow’s load.
  • End-to-End System Understanding: Reason through the full stack including client SDK, API, gateway, inference software, driver, and hardware, and know enough to debug from any layer and test the right thing.
  • Code Review with Intent: Read and review developer PRs with genuine understanding of changes and blast radius, and test the change’s actual impact, not just surface area.
  • Automation Expansion: Increase automation coverage continuously, fix flaky tests, and use AI tooling to accelerate test creation, debugging, and analysis.
  • Quality Discipline: Choose high-value tests over volume metrics and drive the team’s standards for what is tested and what is ready to ship.
  • Cross-Team Operation: Collaborate with platform, ML, infrastructure, and product teams across time zones, and influence quality outcomes without owning every team’s roadmap.
Desired Qualifications
  • Experience with cloud infrastructure, model serving systems, or GPU-accelerated workloads is a strong plus.
  • Experience using AI tooling such as LLMs, coding assistants, or agents to accelerate test development, triage, or analysis is a plus.

Cerebras Systems creates AI acceleration hardware and software. Its CS-2 system is designed to replace traditional GPU clusters for AI workloads, speeding up training and inference while simplifying the setup by eliminating the need for parallel programming, distributed training, and cluster management. The product works as a single, large processor-based accelerator with accompanying software and cloud services to run AI models efficiently, reducing latency and time to results. Compared with competitors, Cerebras differentiates itself with the largest processor in the industry and an integrated hardware-software stack that aims to streamline AI workflows rather than relying on multi-GPU clusters. The company’s goal is to help research labs, healthcare, finance, and other industries achieve faster, more cost-effective AI development and deployment by offering a turnkey high-performance AI compute solution.

Company Size

501-1,000

Company Stage

Debt Financing

Total Funding

$3.7B

Headquarters

Sunnyvale, California

Founded

2016

Simplify's Take

What believers are saying

  • OpenAI commits over $20B for Cerebras servers through 2029, securing revenue.
  • AWS partnership deploys CS-3 on Bedrock for fastest LLM inference in 2026.
  • $850M credit facility funds data center expansion post-Series H in January 2026.

What critics are saying

  • Nvidia Blackwell B200 captures 70% market share via CUDA ecosystem in 12 months.
  • IPO at $35B valuation fails by August 2026 amid AI hype cooldown, diluting shares.
  • Groq LPUs deliver 40% faster inference at 1/10th power, stealing pharma clients.

What makes Cerebras unique

  • WSE-3 integrates 900,000 tiny 0.05mm² fault-tolerant cores across 46,225mm² wafer.
  • Achieves 93% silicon utilization, 100x more defect-tolerant than Nvidia H100 cores.
  • Delivers 21 PB/s on-chip SRAM bandwidth, eliminating multi-chip interconnect latency.

Benefits

Professional Development Budget

Flexible Work Hours

Remote Work Options

401(k) Company Match

401(k) Retirement Plan

Mental Health Support

Wellness Program

Paid Sick Leave

Paid Holidays

Paid Vacation

Parental Leave

Family Planning Benefits

Fertility Treatment Support

Adoption Assistance

Childcare Support

Elder Care Support

Pet Insurance

Bereavement Leave

Employee Discounts

Company Social Events

Growth & Insights and Company News

Headcount

6 month growth

0%

1 year growth

2%

2 year growth

0%

Yahoo Finance
Mar 13th, 2026
Cerebras and Amazon partner to combine AI chips on AWS cloud

Amazon and Cerebras Systems have struck a deal to combine their computing chips in a new service aimed at accelerating AI applications like chatbots and coding tools. Cerebras chips will be installed in Amazon Web Services data centres and linked to Amazon's Trainium3 AI chips using custom networking technology. The partnership will tackle AI inference by splitting tasks between the two chip types: Amazon's Trainium3 will handle "prefill", transforming user requests into AI-readable tokens, whilst Cerebras chips will manage the "decode" stage, generating answers. Both companies declined to disclose the deal's financial terms. Valued at $23.1 billion, Cerebras recently signed a $10 billion deal with OpenAI. Amazon expects its service to launch in the second half of this year and believes it will lead in price-performance versus traditional GPUs.

The Register
Feb 20th, 2026
Cerebras to build 8 exaFLOPS AI supercomputer in India backed by UAE's G42

Cerebras Systems will power an 8 exaFLOPS AI supercomputer in India through a collaboration between UAE's Mohamed Bin Zayed University of AI and India's Center for Development of Advanced Computing. The system will be deployed by UAE technology company G42, one of Cerebras' largest backers. The supercomputer will feature approximately 64 WSE-3 wafer-scale accelerators, each delivering 125 petaFLOPS of performance. Unlike traditional GPUs using high-bandwidth memory, Cerebras chips use on-chip SRAM offering 21 petabytes per second of memory bandwidth — roughly 1,000 times faster than Nvidia's HBM4. G42 will deploy the system under India-defined governance frameworks, with all data remaining within Indian borders. The supercomputer will serve Indian universities, startups and SMEs whilst maintaining data sovereignty.

The Tech Buzz
Feb 20th, 2026
G42 and Cerebras deploy 8 exaflops of AI compute in India

  • G42 and Cerebras announce deployment of 8 exaflops of AI compute infrastructure in India, unveiled at the India AI Impact Summit 2026
  • The partnership represents one of the largest enterprise AI infrastructure investments in Asia, addressing the region's growing compute shortage
  • Cerebras brings its wafer-scale engine technology, which offers significant advantages over traditional GPU clusters for training large language models
  • The move positions India as a critical AI infrastructure hub and highlights the growing geopolitical importance of compute capacity

G42, the Abu Dhabi-based AI company, announced a partnership with chipmaker Cerebras to deploy eight exaflops of compute capacity across India. Revealed at the India AI Impact Summit 2026, the deal is one of the largest AI infrastructure investments in Asia and positions India as a critical node in the emerging AI supply chain, while cementing G42's role as a bridge between Middle Eastern capital and Asian tech ambitions.

The timing is strategic. India has been scrambling to build out AI infrastructure as demand from startups and enterprises skyrockets; according to industry estimates, the country currently holds less than 2% of global AI compute capacity despite having one of the world's largest developer populations. This deal could change that calculus overnight. Cerebras brings something different to the table than the Nvidia GPU clusters dominating the market: its CS-3 systems use wafer-scale engines, essentially chips the size of dinner plates, that can train large language models faster and more efficiently than traditional setups. For G42, which has been aggressively expanding its AI infrastructure footprint across the Middle East and Asia, Cerebras offers a way to differentiate from competitors betting entirely on Nvidia's ecosystem.

The eight exaflops figure is eye-popping. That is roughly eight quintillion floating-point operations per second, enough computational power to train multiple frontier AI models simultaneously, and the kind of capacity typically reserved for national supercomputing initiatives or hyperscale cloud providers. G42, backed by Abu Dhabi's Royal Group and with strategic investments from Microsoft, has positioned itself as a neutral AI infrastructure provider at a time when U.S.-China tech tensions are reshaping global supply chains. The India deployment follows similar announcements in the UAE, Saudi Arabia, and several African nations, suggesting a coordinated strategy to build compute capacity in regions underserved by Western cloud giants. For Cerebras, which has been competing against Nvidia's CUDA ecosystem dominance by focusing on customers who need to train models quickly rather than just run inference at scale, a deal of this magnitude provides a major reference customer and a manufacturing scale it has not previously enjoyed.

The geopolitical implications are significant. India's government has been pushing for AI sovereignty, wanting to ensure the country is not dependent on foreign cloud providers for critical infrastructure. A partnership between a UAE-based company and a U.S. chipmaker, deploying hardware on Indian soil, threads a delicate needle, bringing capital and technology without direct Chinese or purely Western control. Neither company disclosed financial terms, but industry sources estimate that infrastructure deployments of this scale typically run into the billions of dollars. The systems are expected to come online in phases starting later this year, with full deployment targeted for 2027.

Open questions remain around power and cooling: eight exaflops of compute requires massive amounts of electricity and sophisticated cooling systems, and while India's data center infrastructure has been growing rapidly, projects of this scale will need dedicated facilities with direct access to power substations and water resources. It is also unclear who the primary customers will be. G42 operates both as a cloud provider and as an infrastructure partner for governments and large enterprises, so the capacity could serve regional startups building AI applications, support government AI initiatives, or be leased to other cloud providers looking to expand in Asia without building their own infrastructure. For India, the deployment is a chance to leapfrog into the top tier of AI infrastructure nations; the real test will be whether the country can build the ecosystem of customers, applications, and governance frameworks to make full use of all those exaflops.

Bloomberg L.P.
Feb 12th, 2026
OpenAI debuts first AI model using Cerebras chips, challenging Nvidia dominance

OpenAI is releasing its first AI model running on chips from Cerebras Systems, marking a move to diversify beyond Nvidia. The model, GPT-5.3-Codex-Spark, launches Thursday as a faster but less powerful version of its Codex software for automating coding. The new model enables software engineers to quickly complete tasks like editing code segments and running tests. Users can interrupt the model or redirect it to different coding tasks without waiting for lengthy computing processes to finish. The release represents OpenAI's effort to broaden its chipmaker partnerships beyond its primary reliance on Nvidia hardware.

TechCrunch
Feb 12th, 2026
OpenAI launches GPT-5.3-Codex-Spark powered by Cerebras' $10B chip partnership

OpenAI has launched GPT-5.3-Codex-Spark, a lightweight version of its coding tool designed for faster inference, powered by Cerebras' Wafer Scale Engine 3 chip. The model represents the "first milestone" in OpenAI's multi-year, $10 billion partnership with Cerebras announced last month. Spark is optimised for rapid prototyping and real-time collaboration, targeting daily productivity tasks rather than the heavier workloads handled by the standard 5.3 model. It's currently available in research preview for ChatGPT Pro users in the Codex app. The integration marks a new level of hardware partnership for OpenAI. Cerebras' WSE-3 chip contains 4 trillion transistors and excels at low-latency workflows. Cerebras recently raised $1 billion at a $23 billion valuation.