Internship

Machine Learning Intern

d-Matrix

51-200 employees

AI compute platform for datacenters

Hardware
Enterprise Software
AI & Machine Learning

Santa Clara, CA, USA

Hybrid position requiring onsite work in Santa Clara, CA for 3 days per week.

Category
  • Applied Machine Learning
  • AI & Machine Learning
Required Skills
  • Python
  • CUDA
  • Data Structures & Algorithms
  • PyTorch
Requirements
  • Currently pursuing a degree in Computer Science, Electrical Engineering, Machine Learning, or a related field.
  • Familiarity with PyTorch and deep learning concepts, particularly regarding model optimization and memory management.
  • Understanding of CUDA programming and hardware-accelerated computation; hands-on CUDA experience is a plus.
  • Strong programming skills in Python, with experience in PyTorch.
  • Analytical mindset with the ability to approach problems creatively.
  • Experience with deep learning model inference optimization.
  • Knowledge of data structures used in machine learning for memory and compute efficiency.
  • Experience with hardware-specific optimization, especially on custom accelerators such as d-Matrix's, is an advantage.
Responsibilities
  • Research and analyze existing KV-Cache implementations used in LLM inference, particularly those that store lists of past key/value PyTorch tensors (a baseline version is sketched just after this list).
  • Investigate PagedAttention mechanisms that leverage dedicated CUDA data structures to optimize memory for variable sequence lengths (see the paged-cache sketch below).
  • Design and implement a torch-native dynamic KV-Cache model that integrates seamlessly with PyTorch (see the static-buffer sketch below).
  • Model KV-Cache behavior within the PyTorch compute graph to improve compatibility with torch.compile and to facilitate export of the compute graph.
  • Conduct experiments to evaluate memory utilization and inference efficiency on d-Matrix hardware.
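
For orientation, here is a minimal sketch of the list-style cache pattern named in the first responsibility above: each decode step appends the new key/value tensors to the running history with torch.cat, so the cache is reallocated and grows every step. The function name and all shapes are illustrative, not d-Matrix code.

    import torch

    def attend_with_cache(q, k_new, v_new, cache):
        # One decode step: append this step's key/value to the running
        # history, then attend over everything seen so far.
        # q, k_new, v_new: (batch, heads, 1, head_dim)
        if cache["k"] is None:
            cache["k"], cache["v"] = k_new, v_new
        else:
            # torch.cat along the sequence axis reallocates the whole
            # cache every step -- the memory behavior to analyze.
            cache["k"] = torch.cat([cache["k"], k_new], dim=2)
            cache["v"] = torch.cat([cache["v"], v_new], dim=2)
        scores = q @ cache["k"].transpose(-2, -1) / q.shape[-1] ** 0.5
        return torch.softmax(scores, dim=-1) @ cache["v"]

    cache = {"k": None, "v": None}
    for _ in range(4):  # four decode steps
        out = attend_with_cache(torch.randn(1, 8, 1, 64),
                                torch.randn(1, 8, 1, 64),
                                torch.randn(1, 8, 1, 64), cache)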
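
PagedAttention replaces that ever-growing contiguous tensor with fixed-size physical blocks and a per-sequence block table. The paged-cache sketch below is pure-PyTorch bookkeeping under invented names (PagedKVCache, k_pool, block_tables); production systems such as vLLM perform the table lookup inside dedicated CUDA kernels rather than materializing contiguous K/V.

    import torch

    class PagedKVCache:
        # Toy page-table bookkeeping: fixed-size physical blocks in one
        # shared pool, plus a per-sequence table of block ids. All names
        # and shapes are invented for illustration.
        def __init__(self, num_blocks, block_size, heads, head_dim):
            self.block_size = block_size
            self.k_pool = torch.zeros(num_blocks, block_size, heads, head_dim)
            self.v_pool = torch.zeros_like(self.k_pool)
            self.free = list(range(num_blocks))
            self.block_tables = {}  # seq_id -> [physical block ids]
            self.lengths = {}       # seq_id -> tokens written so far

        def append(self, seq_id, k, v):
            # Write one token's (heads, head_dim) key/value.
            n = self.lengths.get(seq_id, 0)
            table = self.block_tables.setdefault(seq_id, [])
            if n % self.block_size == 0:       # current block is full:
                table.append(self.free.pop())  # grab a fresh block
            blk, off = table[n // self.block_size], n % self.block_size
            self.k_pool[blk, off] = k
            self.v_pool[blk, off] = v
            self.lengths[seq_id] = n + 1

        def gather(self, seq_id):
            # Materialize contiguous (seq_len, heads, head_dim) K/V.
            n, table = self.lengths[seq_id], self.block_tables[seq_id]
            k = self.k_pool[table].reshape(-1, *self.k_pool.shape[2:])[:n]
            v = self.v_pool[table].reshape(-1, *self.v_pool.shape[2:])[:n]
            return k, v

Because a sequence claims a new block only when its current one fills, per-sequence waste is bounded by one partially filled block, which is what makes the scheme effective for variable sequence lengths.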
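
A torch-native dynamic cache that stays inside the compute graph might pre-allocate its buffers and update them with in-place tensor ops rather than by mutating Python lists, since fixed shapes and traceable ops are what torch.compile and graph export handle best. The static-buffer sketch below is one hedged way to express that; StaticKVCache and its update scheme are assumptions, not the project's actual design.

    import torch
    from torch import nn

    class StaticKVCache(nn.Module):
        # Pre-allocated buffers plus in-place tensor updates: because
        # shapes stay fixed and no Python lists mutate between steps,
        # the cache can be traced as part of the compute graph.
        def __init__(self, max_seq, heads, head_dim, batch=1):
            super().__init__()
            self.register_buffer("k", torch.zeros(batch, heads, max_seq, head_dim))
            self.register_buffer("v", torch.zeros(batch, heads, max_seq, head_dim))

        def update(self, pos, k_new, v_new):
            # pos: 1-element LongTensor giving the write position.
            # Downstream attention masks out positions past pos.
            self.k.index_copy_(2, pos, k_new)
            self.v.index_copy_(2, pos, v_new)
            return self.k, self.v

    cache = StaticKVCache(max_seq=128, heads=8, head_dim=64)
    k, v = cache.update(torch.tensor([0]),
                        torch.randn(1, 8, 1, 64),
                        torch.randn(1, 8, 1, 64))

Because nothing about the cache changes shape between steps, the whole decode step can in principle be captured once by torch.compile instead of recompiling as the sequence grows.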

d-Matrix focuses on improving the efficiency of AI computing for large datacenter customers. Its main product is the digital in-memory compute (DIMC) engine, which integrates compute directly into programmable memory. This design reduces power consumption and increases processing speed while preserving accuracy. Unlike many competitors, d-Matrix takes a modular, scalable approach, using low-power chiplets that can be tailored to different applications. The goal is to provide high-performance, energy-efficient AI inference solutions for large-scale datacenter operators.

Company Stage: Series B
Total Funding: $149.8M
Headquarters: Santa Clara, California
Founded: 2019

Growth & Insights

Headcount
  • 6 month growth: -14%
  • 1 year growth: -3%
  • 2 year growth: 235%

Simplify's Take

What believers are saying

  • Securing $110 million in Series B funding positions d-Matrix for rapid growth and technological advancements.
  • Their Jayhawk II silicon aims to solve critical issues in AI inference, such as cost, latency, and throughput, making generative AI more commercially viable.
  • The company's focus on efficient AI inference could attract significant interest from data centers and enterprises looking to deploy large language models.

What critics are saying

  • Competing against industry giants like Nvidia poses a significant challenge in terms of market penetration and customer acquisition.
  • The high dependency on continuous innovation and technological advancements could strain resources and lead to potential setbacks.

What makes d-Matrix unique

  • d-Matrix focuses on developing AI hardware specifically optimized for Transformer models, unlike general-purpose AI chip providers like Nvidia.
  • Their digital in-memory compute (DIMC) architecture with chiplet interconnect is a first-of-its-kind innovation, setting them apart in the AI hardware market.
  • Backed by major investors like Microsoft, d-Matrix has the financial support to challenge established players like Nvidia.
