Full-Time

Software Engineer – Principal

Model Factory

Updated on 9/11/2024

d-Matrix

51-200 employees

AI compute platform using in-memory computing

Data & Analytics
Hardware
AI & Machine Learning

Senior, Expert

Santa Clara, CA, USA

Category
Full-Stack Engineering
Software Engineering
Required Skills
Python
TensorFlow
Data Structures & Algorithms
Linux/Unix
Requirements
  • MS or PhD in Computer Science, Engineering, Math, Physics, or a related field; MS with 15+ years of industry experience or PhD with 12+ years of industry experience.
  • Strong grasp of computer architecture, data structures, system software, and machine learning fundamentals.
  • Proficient in C/C++ and Python development in a Linux environment using standard development tools.
  • Experience with deep learning frameworks (such as PyTorch or TensorFlow).
  • Experience with deep learning runtimes (such as ONNX Runtime or TensorRT).
  • Experience with distributed ML model deployment for training or inference, including collectives, modifying PyTorch models with annotations, and exporting models.
  • Experience with LLMs, including implementations from Hugging Face and vLLM.
  • Experience with model preprocessing, including quantization, sparsity, and collectives.
  • Experience deploying ML workloads on distributed systems in a multi-tenant environment.
  • Self-motivated team player with a strong sense of ownership and leadership.
Responsibilities
  • You will be part of the team that productizes the software stack for our AI compute engine.
  • As part of the Software team, you will be responsible for the development, enhancement, and maintenance of the next-generation AI deployment software.
  • You have experience working across all aspects of the full-stack toolchain and understand the nuances of optimizing and trading off various aspects of hardware-software co-design.
  • You are able to build and scale software deliverables within a tight development window.
  • You will work with a team of compiler experts to build out the compiler infrastructure, working closely with other software (ML, systems) and hardware (mixed-signal, DSP, CPU) experts in the company.

d-Matrix is developing a unique AI compute platform using in-memory computing (IMC) techniques with chiplet level scale-out interconnects, revolutionizing datacenter AI inferencing. Their innovative circuit techniques, ML tools, software, and algorithms have successfully addressed the memory-compute integration problem, enhancing AI compute efficiency.

Company Stage

Series B

Total Funding

$161.5M

Headquarters

Santa Clara, California

Founded

2019

Growth & Insights
Headcount Growth

  • 6 month: -15%
  • 1 year: 85%
  • 2 year: 223%
Simplify's Take

What believers are saying

  • Securing $110 million in Series B funding positions d-Matrix for rapid growth and technological advancements.
  • Their Jayhawk II silicon aims to solve critical issues in AI inference, such as cost, latency, and throughput, making generative AI more commercially viable.
  • The company's focus on efficient AI inference could attract significant interest from data centers and enterprises looking to deploy large language models.

What critics are saying

  • Competing against industry giants like Nvidia poses a significant challenge in terms of market penetration and customer acquisition.
  • The high dependency on continuous innovation and technological advancements could strain resources and lead to potential setbacks.

What makes d-Matrix unique

  • d-Matrix focuses on developing AI hardware specifically optimized for Transformer models, unlike general-purpose AI chip providers like Nvidia.
  • Their digital in-memory compute (DIMC) architecture with chiplet interconnect is a first-of-its-kind innovation, setting them apart in the AI hardware market.
  • Backed by major investors like Microsoft, d-Matrix has the financial support to challenge established players like Nvidia.