Full-Time

Senior Staff Operations Program Manager

Updated on 6/25/2026

d-Matrix

201-500 employees

Delivers memory-integrated AI compute platforms

No salary listed

Santa Clara, CA, USA

In Person

On-site at contract manufacturer (CM) sites as needed; travel up to 30% domestically and internationally.

Category
Business & Strategy (1)
Required Skills
Six Sigma
Data Analysis
Requirements
  • Bachelor's degree in Mechanical, Electrical, Industrial, or Manufacturing Engineering, or a related technical discipline.
  • 5+ years of experience in manufacturing, NPI, or operations program management for high-volume consumer electronics, medical devices, automotive, or similar hardware products.
  • Demonstrated experience taking products through full NPI cycle (EVT/DVT/PVT) and ramping to mass production.
  • Strong working knowledge of manufacturing processes such as SMT, PCBA, plastics/metals fabrication, mechanical assembly, and final test.
  • Hands-on experience working with contract manufacturers, ODMs, and tier-1 suppliers, including in Asia.
  • Proficiency interpreting engineering drawings, BOMs, AVL, ECOs, and test data.
  • Excellent program management skills with the ability to manage multiple concurrent priorities under tight deadlines.
  • Strong analytical skills; comfortable with statistical tools, yield analysis, and data-driven decision making.
  • Excellent written and verbal communication skills, with the ability to influence at all levels of the organization.
  • Willingness and ability to travel domestically and internationally (up to 30%), including extended stays at CM sites during critical builds.
Responsibilities
  • Lead cross-functional manufacturing programs from concept through end-of-life, owning schedule, scope, cost, and risk.
  • Drive build planning and execution for EVT, DVT, PVT, and MP phases at contract manufacturer sites.
  • Define and track exit criteria for each NPI phase, including yield, cycle time, FPY, and quality gates.
  • Maintain integrated program schedules and identify critical path dependencies across hardware, firmware, tooling, and supply readiness.
  • Own the Manufacturing Readiness Plan (MRP), including process flow, station design, fixture and tooling readiness, test coverage, and operator training.
  • Partner with Manufacturing Engineering to validate process capability (Cpk), drive DFM/DFA feedback into Engineering, and qualify production lines.
  • Lead production ramp, monitor KPIs (yield, throughput, RTY, scrap, WIP), and drive corrective actions when targets are missed.
  • Coordinate ECO/ECN implementation at the factory and ensure clean cut-in with minimal disruption to output.
  • Partner with Commodity Managers and Planning to align material availability, lead times, and clear-to-build status with build plans.
  • Drive capacity assessments and ensure CM line, equipment, and labor capacity meet forecasted demand.
  • Identify supply risks early, develop mitigation plans, and escalate constraints to leadership with clear options and recommendations.
  • Partner with Quality to drive failure analysis, 8D problem solving, and closed-loop corrective actions on field and factory issues.
  • Lead yield improvement initiatives and cost-down projects across the product lifecycle.
  • Champion continuous improvement, lean manufacturing, and Six Sigma practices on the production floor.
  • Serve as the primary day-to-day operational point of contact with contract manufacturers and key suppliers.
  • Run regular program reviews, daily build huddles during NPI and ramp, and executive readouts on program health.
  • Translate complex manufacturing data and risks into clear, actionable communication for engineering, operations, and executive audiences.
Desired Qualifications
  • Master's degree in Engineering, Operations, or MBA.
  • PMP, Lean, Six Sigma Green/Black Belt, or equivalent certification.
  • Direct experience managing builds at CMs in Asia and other offshore manufacturing regions (e.g., China, Vietnam, Mexico).
  • Experience with ERP/PLM systems (e.g., SAP, Oracle, Agile, Arena, Windchill) and data tools (SQL, Tableau, Power BI, Jira).
  • Working proficiency in Mandarin or another regional language is a plus.

d-Matrix provides scalable, modular AI compute hardware and software for large datacenters, prioritizing energy efficiency and reduced data movement. Its core DIMC engine embeds compute directly into programmable memory, while a fabric of low-power chiplets delivers configurable compute resources and the accompanying software optimizes performance. This combination cuts data transfers and power use, aligning hardware design with memory-based computation for AI inference. The goal is to let large datacenters run AI workloads more efficiently at scale with customizable, modular compute platforms.

Company Size

201-500

Company Stage

Series C

Total Funding

$429M

Headquarters

Santa Clara, California

Founded

2019

Simplify's Take

What believers are saying

  • Inference efficiency is now a top datacenter priority, boosting demand.
  • Full production shipments in 2026 can create reference wins and customer feedback.
  • Partnerships with TSMC, Broadcom, and Gimlet validate manufacturing and workload fit.

What critics are saying

  • NVIDIA can absorb inference efficiency into GPUs and preserve platform control.
  • Corsair remains weak on trillion-parameter reasoning models and larger memory demands.
  • Raptor and 3DIMC depend on flawless foundry execution, packaging, and yields.

What makes d-Matrix unique

  • Corsair uses digital in-memory compute and chiplets to minimize data movement.
  • Air-cooled PCIe deployment fits existing datacenters without liquid cooling upgrades.
  • 3DIMC adds stacked DRAM to extend beyond small-batch inference workloads.

Benefits

Hybrid Work Options

Growth & Insights and Company News

Headcount

6 month growth

0%

1 year growth

-1%

2 year growth

-9%
Lonergan Partners
Jun 15th, 2026
Building a high-performance team.

Lonergan Partners is pleased to partner with d-Matrix, the pioneer in efficient, low-latency AI inference compute platforms for data centers, in building its high-performance leadership team. Joining CEO Sid Sheth and our executive team placements Ashot Melik-Martirosian and Sree Ganesan are three exceptional program management and engineering leaders: Priya Joshi, in Silicon Program Management; Pankaj Sharma, in Systems Program Management; and Raj Menon, in Customer Platform Engineering. These appointments are the result of a global search process led by Co-Managing Partners Kirsten Settle and Michael Cunningham.

Lonergan Partners is proud to work with d-Matrix Vice President of Product Sree Ganesan to bring Raj Menon to her team. Raj joins d-Matrix in Customer Platform Engineering and comes most recently from the Dell Technologies Office of the ISG CTO. Lonergan Partners is also proud to work with Vice President of Program Management Ashot Melik-Martirosian to bring two experienced leaders, Priya Joshi and Pankaj Sharma, to his team. Priya comes to d-Matrix having guided teams from early concept all the way through productization, working most recently at Efficient Computers and Astera Labs. Pankaj Sharma joins d-Matrix in Systems Program Management and comes most recently from SambaNova Systems, where he orchestrated AI/ML engineering teams.

d-Matrix is pioneering accelerated computing for AI inference, breaking through the limits of latency, cost and energy. Its Corsair accelerators, JetStream networking, and Aviator software deliver fast, sustainable AI inference at data center scale.

"We are delighted to be working with Sid and his leadership team to build out a world class organization at d-Matrix in pursuit of building a faster, attainable, more energy efficient AI compute platform," said Kirsten Settle, Co-Managing Partner at Lonergan Partners.

PR Newswire
Apr 2nd, 2026
d-Matrix acquires GigaIO data center business to strengthen rack-scale AI inference infrastructure

d-Matrix, a specialist in low-latency AI inference compute, has acquired GigaIO's data centre business, including its SuperNODE and FabreX PCIe-based memory fabric technologies. The deal builds on a collaboration that began in 2025. The acquisition adds a systems engineering team with expertise in rack-scale infrastructure and high-performance interconnects. d-Matrix will integrate GigaIO's technologies into its AI inference platform, which includes Corsair inference accelerators, JetStream networking and Aviator software. GigaIO will continue operating independently, focusing on edge computing. The acquisition establishes d-Matrix's new engineering presence in Carlsbad, California, expanding its global footprint to six locations across North America, Europe and Asia. Financial terms were not disclosed.

StorageNewsletter
Mar 17th, 2026
Nvidia GTC 2026: d-Matrix and Gimlet Labs to deliver 10x speed ups, massive power efficiency for frontier AI workloads.

d-Matrix and Gimlet's combined solution can deliver order-of-magnitude performance increases on both inference latency and throughput per Watt vs. GPU-only stacks.

  • Gimlet Cloud, built for running agentic AI inference, to deploy d-Matrix Corsair low-latency, memory-optimized accelerators alongside GPUs
  • 10x performance benefits in latency and throughput per Watt compared to a GPU-only approach
  • Job division between GPUs and d-Matrix accelerators enables faster interactivity and massive power savings

d-Matrix, a player in low-latency AI inference compute for data centers, and Gimlet Labs, an applied AI research and product company, announced that Gimlet is incorporating d-Matrix Corsair accelerators into the Gimlet Cloud alongside traditional GPUs to deliver 10x speed ups for agentic AI inference workloads.

The solution is ideal for latency-sensitive workloads including speculative decoding, which is commonly adopted by large-scale AI deployments to reduce latency. With d-Matrix Corsair accelerators on Gimlet's Cloud, workloads already well-optimized for agentic AI can achieve even greater performance gains, reaching token delivery speeds that provide the industry-leading interactivity required for today's most critical applications.

"Model providers are spending billions on inference, and the demand for fast tokens is higher than ever - but power remains a scarce resource," said Zain Asgar, founder and CEO, Gimlet Labs. "d-Matrix hardware is the ideal solution for the phases of inference that GPUs waste energy on. By leveraging Corsair for use cases like speculative decoding, we can deliver dramatically faster performance for our customers for the same footprint."

"From day one, d-Matrix has been uniquely focused on inference, founded on our belief that inference would not be a one-size-fits-all compute problem. As the only multi-silicon inference cloud, Gimlet is leading the industry with a fundamental new approach that delivers dramatic leaps forward in performance that homogeneous infrastructure simply cannot deliver," said Sid Sheth, founder and CEO, d-Matrix. "With power limits capping how fast AI can advance, it's imperative that AI service providers have the right tools for the right job and that we embrace doing more with less."

Gimlet's software stack is the first to intelligently divide and map agentic workloads across a variety of accelerators spanning multiple vendors, generations and architectures, running each segment on the best-suited hardware. Gimlet's datacenters incorporate these different hardware types and connect them via high-speed interconnects to serve frontier labs and other AI-native companies.

d-Matrix Corsair's unique memory-optimized architecture delivers high memory bandwidth and low latency, making it ideal for running memory-bound portions of the AI model. Corsair ships as a standard PCIe card with air cooling, which enables rapid deployments in existing data centers.

The companies plan to make their combined solution available to select customers through Gimlet Cloud in 2H 2026.

d-Matrix
Mar 16th, 2026
Going vertical: why we created a 3D DRAM solution to advance low latency AI inference.

We scaled SRAM to create a system that runs even larger models with the extremely low latency benefits it brings on single chips. The next step is to rethink DRAM altogether.

Published: March 16, 2026
By: d-Matrix Team

When we launched seven years ago, we had one goal: to build the fastest and most scalable technology to power small-batch AI inference and interactive applications. Both of those have become absolute table stakes in the last 12 months as user expectations grow and tens of millions of people flock to interactive applications.

Our approach involved deploying a purpose-built SRAM-based inference architecture at scale to capture the steps in inference that needed to be completed fastest, were relatively low complexity, and captured a significant volume of the actual inference compute. But to support the full scope of AI inference, including future innovations, we knew from the beginning that we would have to extend the same performance and low latency Corsair has, but with larger memory capacity. To do that, we went vertical: adding an additional layer of DRAM on top of the compute.

Agentic pipelines are becoming increasingly sophisticated, and some steps will inevitably require larger models for quality purposes, such as translation or code completion. The same performance we bring to smaller models must inevitably extend to models at significantly larger scale, as well as even further optimize disaggregated inference pipelines.

Why memory was the blocker here - and will remain in the future

Our chiplet-based design with on-chip SRAM and PCIe-based architecture enables us to scale up the total memory pool available in SRAM linearly, with an additional pool of DRAM available when needed. This enabled several benefits:

  • Ultra-low latency, particularly for task-specific steps in agentic pipelines where interactivity is the determining factor of success and single agentic steps can hold up the entire pipeline.
  • Seamless scalability that enables us to grow to a rack-level pool of memory that can operate most models on rack-scale Corsair.
  • Plug-and-play hardware that fits directly into most existing data center configurations with a low power envelope.
  • High flexibility in disaggregated pipelines, optimizing full pipelines by working in concert with other accelerators like GPUs to accelerate larger, more powerful models.

Smaller models, however, are only part of the solution, and they are not the only area where rapid innovation is happening. Frontier models as well as recent open-weight models like Qwen, Kimi and DeepSeek have delivered powerful reasoning capability for complex tasks but have sprawled into the hundreds of billions of parameters.

Scaling SRAM beyond a single die

It's tempting to look at a chiplet design and say you've just split it into a bunch of tiny HBM-esque pipes rather than a single pathway. But the problem itself has shifted to a different realm governed by die-to-die interconnectivity. SRAM access is still adjacent to compute on-die and operating at full speed. Scaling that up moves the challenge to a die-to-die architectural problem: what happens when one chiplet needs data from a neighbor. That shifts the problem space to a different set of metrics: bandwidth per millimeter of edge, latency per hop, and energy per bit transferred. Optimizing each of them makes a chiplet-based architecture behave as if it has one giant pool of ultra-fast memory rather than discrete pockets. This gives us a way to scale up the available pool of SRAM memory while preserving low latency and performance requirements.
But that scales elegantly only up to a certain point, which falls short of handling the largest reasoning models and the general shift toward significant token consumption. Rack-scale SRAM with Corsair captures a significant surface area of the space for AI workloads, and disaggregated pipelines with Corsair capture an even larger space. In fact, data released with our partner Gimlet Labs shows that there is as much as a 10x performance boost when deploying Corsair in a heterogeneous pipeline for small-batch inference.

Shifting to 3D stacked DRAM

Modern reasoning models aren't just larger by virtue of the number of parameters - they also consume substantially more tokens. Even at a smaller scale, reasoning models can consume significantly more tokens to achieve a result. The total memory footprint is growing on two axes for interactive applications requiring reasoning models.

A stacked 3D DRAM configuration still lives in the die-to-die interconnectivity space, which allows us to target 10x better memory bandwidth and 10x better energy efficiency using 3DIMC over HBM4 configurations. In addition, our chiplet architecture allows for easier 3D stacking of DRAM, just as it initially allowed for easier scaling of SRAM memory pools. This addresses both the capacity and bandwidth limitations imposed by SRAM scaling.

We took passive DRAM we use for capacitors and converted that to an active stack. By doing that we can expose every small bank in the DRAM and bring it directly to the compute engine. With 3D DRAM, we now have the entire 3D surface area to connect, and the signals can be run at the DRAM base frequency, which is a few hundred MHz. By doing that we can get 20 TB/s bandwidth per stack - 10x what HBM4 can achieve - at an energy cost of 0.3-0.4 pJ/bit, compared to 3-4 pJ/bit.

3DIMC: Industry's first 3D DRAM solution for AI inference

We announced 3DIMC, our stacked DRAM solution, at Hot Chips in August 2025. Since then, we have accomplished what we hoped: we have proven it works. We successfully validated that Pavehawk, our test chip for 3D DRAM, operates within its performance and power targets, demonstrating that the theory was not only sound but an attainable path forward to powering next-generation AI inference.

Our stacked DRAM chip, Pavehawk, arrived in our labs in August 2025 and we got to work testing the aggressive targets we set for ourselves. We have now had the opportunity to stress-test the very first iterations of the Pavehawk chips across different voltages and temperature ranges. Thus far we are seeing around 0.4 pJ/bit in the worst-case scenarios, and that will decrease further as we complete additional optimizations.

The future of AI inference is 3D stacked memory

The answer obviously doesn't just lie with throwing an extra layer of DRAM on top of an existing one. Verticality exposes a whole new operating space to grow memory pools and meet the ravenous demand for low-latency, high-performance interactive apps.

Corsair was the world's first accelerator that offered a whopping 2 GB of available SRAM per card, with the ability to scale up to 128 GB in a rack. A single server is capable of hosting and running a Llama 3.1 8B model that can handle specific tasks in agent pipelines, and it gracefully scales to larger models in a rack.

Pavehawk is our first crack at the next problem, which will be central to our second-generation accelerator, Raptor. More sophisticated agentic pipelines will require increasingly sophisticated models, and even smaller models are becoming more robust and capable. Pavehawk not only enables larger models on its own - it dramatically improves disaggregated pipelines in a way far beyond what Corsair offers.

If you're interested in trying out or purchasing Corsair, you can request early access or contact our sales team directly. Our next task is meeting the incredible demand required by emerging AI workloads with high user expectations, and that starts with Pavehawk.
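
The memory figures quoted in the article above can be sanity-checked with simple arithmetic. The sketch below is only a back-of-envelope illustration, not d-Matrix's own methodology. It assumes decimal terabytes, treats energy per bit as constant at full bandwidth, and sizes a model by weights alone at one byte per parameter (no KV cache or activations). The function names are hypothetical and chosen for this example. Under those assumptions, moving 20 TB/s at 0.3-0.4 pJ/bit costs roughly 48 to 64 W, while the same traffic at 3-4 pJ/bit would cost roughly 480 to 640 W.

```python
import math


def interface_power_watts(bandwidth_tb_per_s: float, energy_pj_per_bit: float) -> float:
    """Power needed to move data at a given bandwidth and energy per bit.

    Assumes decimal terabytes (1 TB = 1e12 bytes) and a constant energy per bit.
    """
    bits_per_second = bandwidth_tb_per_s * 1e12 * 8  # TB/s -> bits/s
    joules_per_bit = energy_pj_per_bit * 1e-12       # pJ -> J
    return bits_per_second * joules_per_bit          # J/s == W


def cards_for_capacity(required_gb: float, gb_per_card: float) -> int:
    """Smallest whole number of cards whose pooled memory covers required_gb."""
    return math.ceil(required_gb / gb_per_card)


if __name__ == "__main__":
    # Figures quoted in the article: 20 TB/s per stack, 0.3-0.4 vs 3-4 pJ/bit.
    for label, pj in [("3DIMC low", 0.3), ("3DIMC high", 0.4),
                      ("HBM4-class low", 3.0), ("HBM4-class high", 4.0)]:
        watts = interface_power_watts(20, pj)
        print(f"{label}: {watts:.0f} W at 20 TB/s")

    # Figures quoted in the article: 2 GB SRAM per card, 128 GB per rack.
    print("cards per 128 GB rack:", cards_for_capacity(128, 2))

    # Assumption: an 8B-parameter model at 1 byte per parameter is about 8 GB of weights.
    print("cards for 8 GB of weights:", cards_for_capacity(8, 2))
```

The tenfold gap in energy per bit carries straight through to power at equal bandwidth, which is why the article treats pJ/bit as the key metric alongside TB/s.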

PR Newswire
Mar 12th, 2026
d-Matrix and Gimlet Labs deliver 10x speed boost and power efficiency for agentic AI inference

d-Matrix and Gimlet Labs have announced a partnership to deliver 10x performance improvements for agentic AI inference workloads. Gimlet Cloud will deploy d-Matrix Corsair accelerators alongside GPUs, achieving significant gains in latency and throughput per watt compared to GPU-only approaches. The solution divides workloads between GPUs and d-Matrix accelerators, with Corsair's memory-optimised architecture handling memory-bound portions of AI models. This is particularly effective for latency-sensitive tasks like speculative decoding used in large-scale AI deployments. Gimlet's software intelligently maps workloads across multiple accelerator types and vendors. The combined solution will be available to select customers through Gimlet Cloud in the second half of 2026. Both companies emphasise power efficiency as crucial for advancing AI infrastructure amid growing energy constraints.