Full-Time

Machine Learning Engineer – Principal

Model Factory

d-Matrix

201-500 employees

AI compute platform for datacenters

Compensation Overview

$155k - $250k/yr

Senior, Expert

Santa Clara, CA, USA

Hybrid, working onsite at our Santa Clara, CA headquarters 3-5 days per week.

Category
Applied Machine Learning
AI & Machine Learning
Required Skills
Kubernetes
Python
TensorFlow
PyTorch
Docker
Requirements
  • BS in Computer Science with 15+ years of experience, or MS in Computer Science (preferred) with 10+ years; strong programming skills in Python and experience with ML frameworks such as PyTorch, TensorFlow, or JAX.
  • Hands-on experience with model optimization, quantization, and inference acceleration.
  • Deep understanding of Transformer architectures, attention mechanisms, and distributed inference (Tensor Parallel, Pipeline Parallel, Sequence Parallel).
  • Knowledge of quantization (INT8, BF16, FP16) and memory-efficient inference techniques (a brief illustrative sketch follows this list).
  • Solid grasp of software engineering best practices, including CI/CD, containerization (Docker, Kubernetes), and MLOps.
  • Strong problem-solving skills and ability to work in a fast-paced, iterative development environment.
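
As context for the quantization formats named above, here is a minimal, illustrative sketch of symmetric per-tensor INT8 weight quantization in PyTorch. The function names (`quantize_int8`, `dequantize`) and tensor shapes are assumptions for illustration only and do not describe d-Matrix's toolchain; BF16/FP16 appear at the end as plain lower-precision dtype casts.

```python
# Illustrative sketch only: symmetric per-tensor INT8 weight quantization
# in PyTorch. Function names and shapes are hypothetical, not d-Matrix's code.
import torch

def quantize_int8(w: torch.Tensor):
    """Map a float tensor to INT8 values plus a per-tensor scale."""
    scale = w.abs().max() / 127.0                      # largest |w| maps to 127
    q = torch.clamp((w / scale).round(), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate float tensor from the INT8 values."""
    return q.to(torch.float32) * scale

w = torch.randn(4096, 4096)                            # e.g. a transformer weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs quantization error:", (w - w_hat).abs().max().item())

# BF16 / FP16, by contrast, are simple lower-precision dtype casts.
w_bf16 = w.to(torch.bfloat16)
w_fp16 = w.to(torch.float16)
```
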
Responsibilities
  • Design, build, and optimize machine learning deployment pipelines for large-scale models.
  • Implement and enhance model inference frameworks.
  • Develop automated workflows for model development, experimentation, and deployment.
  • Collaborate with research, architecture, and engineering teams to improve model performance and efficiency.
  • Work with distributed computing frameworks (e.g., PyTorch/XLA, JAX, TensorFlow, Ray) to optimize model parallelism and deployment.
  • Implement scalable KV caching and memory-efficient inference techniques for transformer-based models (a minimal KV-cache sketch follows this list).
  • Monitor and optimize infrastructure performance across the custom hardware hierarchy (cards, servers, and racks) powered by d-Matrix's custom AI chips.
  • Ensure best practices in ML model versioning, evaluation, and monitoring.
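
The KV-caching responsibility above can be pictured with a minimal, hypothetical sketch: an append-only per-layer cache of key/value tensors that each autoregressive decoding step reuses instead of recomputing attention over the full prefix. The `KVCache` class, shapes, and loop below are illustrative assumptions, not d-Matrix's inference framework.

```python
# Illustrative KV-cache sketch for autoregressive decoding in PyTorch.
# Class and function names are hypothetical, not a real inference stack.
import torch
import torch.nn.functional as F

class KVCache:
    """Append-only cache of key/value tensors for one attention layer."""
    def __init__(self):
        self.k = None          # (batch, heads, seq_len, head_dim)
        self.v = None

    def update(self, k_new, v_new):
        self.k = k_new if self.k is None else torch.cat([self.k, k_new], dim=2)
        self.v = v_new if self.v is None else torch.cat([self.v, v_new], dim=2)
        return self.k, self.v

def decode_step(q, k_new, v_new, cache: KVCache):
    """Attend the newest query token against all cached keys/values."""
    k, v = cache.update(k_new, v_new)
    return F.scaled_dot_product_attention(q, k, v)     # (batch, heads, 1, head_dim)

batch, heads, head_dim = 1, 8, 64
cache = KVCache()
for _ in range(4):                                     # four decode steps, one token each
    q = torch.randn(batch, heads, 1, head_dim)
    k = torch.randn(batch, heads, 1, head_dim)
    v = torch.randn(batch, heads, 1, head_dim)
    out = decode_step(q, k, v, cache)
print("cached sequence length:", cache.k.shape[2])     # -> 4
```

Growing the cache with `torch.cat` keeps the sketch short; production systems typically preallocate or page the cache to bound memory, which is where the memory-efficiency work mentioned above comes in.
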
Desired Qualifications
  • Experience working with cloud-based ML pipelines (AWS, GCP, or Azure).
  • Experience with LLM fine-tuning, LoRA, PEFT, and KV cache optimizations (a minimal LoRA sketch follows this list).
  • Contributions to open-source ML projects or research publications.
  • Experience with low-level optimizations using CUDA, Triton, or XLA.
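
For the LoRA/PEFT item above, here is a minimal, hypothetical sketch of a LoRA-wrapped linear layer in PyTorch: the pretrained weight is frozen and only a low-rank update (scaled by alpha/r) is trained. The `LoRALinear` class is an illustrative assumption and is not taken from the PEFT library or d-Matrix's tooling.

```python
# Illustrative LoRA sketch: frozen base linear layer plus a trainable
# low-rank update, W x + (alpha/r) * B A x. Hypothetical code, not PEFT.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():               # freeze pretrained weights
            p.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # Base projection plus the scaled low-rank correction.
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(1024, 1024))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print("trainable parameters:", trainable)              # only the low-rank factors
```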

d-Matrix develops an AI compute platform aimed at improving efficiency for large datacenter customers. Their main product, the digital in-memory compute (DIMC) engine, integrates computing capabilities directly into programmable memory, which helps reduce power consumption and improve data movement. This technology allows for high-performance AI inference acceleration while maintaining accuracy. d-Matrix differentiates itself from competitors by offering scalable and modular solutions through a network of low-power chiplets that can be tailored for different applications. The company's goal is to provide energy-efficient AI compute solutions that meet the demands of large-scale datacenter operators.

Company Size

201-500

Company Stage

Series B

Total Funding

$154M

Headquarters

Santa Clara, California

Founded

2019

Simplify's Take

What believers are saying

  • Growing interest in brain-inspired computing aligns with d-Matrix's mission and products.
  • The rise of AI-driven chatbots increases demand for specialized AI chips like d-Matrix's.
  • Collaboration with Micron Technology enhances product development and market reach.

What critics are saying

  • Increased competition from Nvidia and startups could pressure d-Matrix's market share.
  • Dependency on Micron Technology may affect d-Matrix's supply chain and innovation pace.
  • Regulatory changes like the EU's AI Act could impose compliance costs on d-Matrix.

What makes d-Matrix unique

  • d-Matrix's DIMC engine integrates compute into programmable memory for enhanced efficiency.
  • The company offers scalable AI solutions through low-power, customizable chiplets.
  • d-Matrix focuses on power-efficient AI inference acceleration for large datacenter operators.

Benefits

Hybrid Work Options

Growth & Insights and Company News

Headcount
  • 6 month growth: -2%
  • 1 year growth: -9%
  • 2 year growth: 11%

VC News Network
Jan 2nd, 2025
Corsair: The Future of Generative AI Processing Unveiled

Additionally, Micron Technology is collaborating with d-Matrix to bolster Corsair's development and expansion, ensuring that this innovative processor meets the growing demands of the industry.

DIG Watch
Nov 21st, 2024
d-Matrix debuts AI chip for chatbots

Silicon Valley firm d-Matrix has launched its first AI chip, designed to enhance AI services like chatbots and video generators.

Wall Street Pit
Nov 19th, 2024
Challengers Arise: Can AMD, Intel, and Startups Take on Nvidia's AI Chip Reign?

The design and production of these chips is a highly intricate process, as demonstrated by d-Matrix's recent launch of its AI processor, "Corsair."

VentureBeat
Dec 27th, 2023
AI Predictions for 2024: What Top VCs Think

As 2023 draws to a close, it’s a time of reflection on the monumental advances and ethical debates surrounding artificial intelligence this past year. The launch of chatbots like Bing Chat and Google Bard showcased impressive natural language abilities, while generative AI models like DALL-E 3 and MidJourney V6 stunned with their creative image generation. However, concerns were also raised about AI’s potential harms. The EU’s landmark AI Act sought to limit certain uses of the technology, and the Biden Administration issued guidelines on its development. With rapid innovation expected to continue, many wonder: What’s next for AI? To find out, we surveyed leading venture capitalists investing in artificial intelligence startups for their boldest predictions on what 2024 may bring. Will we see another “AI winter” as hype meets reality? Or will new breakthroughs accelerate adoption across industries? How will policymakers and the public respond? VCs from top firms including Bain Capital Ventures (BCV), Sapphire Ventures, Madrona, General Catalyst and more offered their outlook on topics ranging from the future of generative AI to GPU shortages, AI regulation, climate change applications, and more.

Yahoo Finance
Oct 24th, 2023
A new phase of the AI race is coming - and chip startup d-Matrix could be the winner

With backers like Microsoft, d-Matrix is competing against Nvidia to become the next big thing in AI chips.