Internship

Machine Learning Intern

d-Matrix

201-500 employees

AI compute platform for datacenters

Enterprise Software
AI & Machine Learning

Santa Clara, CA, USA

Hybrid position requiring onsite presence in Santa Clara, CA for 3 days per week.

Category
Applied Machine Learning
Deep Learning
AI & Machine Learning
Required Skills
Python
CUDA
PyTorch
Machine Learning
Requirements
  • Currently pursuing a degree in Computer Science, Electrical Engineering, Machine Learning, or a related field.
  • Familiarity with PyTorch and deep learning concepts, particularly regarding model optimization and memory management.
  • Understanding of CUDA programming concepts and hardware-accelerated computation (hands-on CUDA experience is a plus).
  • Strong programming skills in Python, with experience in PyTorch.
  • Analytical mindset with the ability to approach problems creatively.
Responsibilities
  • Research and analyze existing KV-Cache implementations used in LLM inference, particularly those that store past key-value states as lists of PyTorch tensors.
  • Investigate 'Paged Attention' mechanisms that leverage dedicated CUDA data structures to optimize memory for variable sequence lengths.
  • Design and implement a torch-native dynamic KV-Cache model that can be integrated seamlessly within PyTorch.
  • Model KV-Cache behavior within the PyTorch compute graph to improve compatibility with torch.compile and facilitate the export of the compute graph.
  • Conduct experiments to evaluate memory utilization and inference efficiency on d-Matrix hardware.
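
For candidates unfamiliar with the paged approach mentioned above, the following is a minimal sketch of the bookkeeping idea only: the cache is split into fixed-size blocks, and a per-sequence block table maps logical token positions to physical blocks, so sequences of different lengths do not each need one large contiguous buffer. It is plain Python with no tensors or CUDA, and the class name `PagedKVAllocator`, its methods, and the block sizes are hypothetical illustrations, not d-Matrix code or any library's API.

```python
class PagedKVAllocator:
    """Toy block-table allocator (hypothetical; illustrates paging only)."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        # Free physical block ids; stored reversed so pop() hands out block 0 first.
        self.free_blocks = list(range(num_blocks - 1, -1, -1))
        self.block_tables = {}  # seq_id -> list of physical block ids
        self.seq_lens = {}      # seq_id -> number of tokens stored

    def append_token(self, seq_id):
        """Reserve a slot for one new token; return (physical_block, offset)."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.seq_lens.get(seq_id, 0)
        offset = length % self.block_size
        if offset == 0:
            # The last block is full (or the sequence is new): take a fresh block.
            if not self.free_blocks:
                raise MemoryError("KV cache is out of free blocks")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1
        return table[-1], offset

    def slot(self, seq_id, position):
        """Translate a logical token position into (physical_block, offset)."""
        if not 0 <= position < self.seq_lens.get(seq_id, 0):
            raise IndexError("position is outside the cached sequence")
        block = self.block_tables[seq_id][position // self.block_size]
        return block, position % self.block_size

    def free(self, seq_id):
        """Return all blocks of a finished sequence to the free pool."""
        self.free_blocks.extend(reversed(self.block_tables.pop(seq_id, [])))
        self.seq_lens.pop(seq_id, None)

    def num_free(self):
        return len(self.free_blocks)


# Usage: two sequences of different lengths share a pool of four 2-token blocks.
alloc = PagedKVAllocator(num_blocks=4, block_size=2)
slots_a = [alloc.append_token("a") for _ in range(3)]  # spans two blocks
slots_b = [alloc.append_token("b") for _ in range(1)]  # gets its own block
lookup = alloc.slot("a", 2)                            # third token of "a"
alloc.free("a")                                        # blocks become reusable
slot_c = alloc.append_token("c")                       # reuses a freed block
```

In a real system each physical block would index into preallocated key and value tensors; expressing that indexing with fixed-shape tensor operations is one plausible way to keep such a cache visible to the PyTorch compute graph, though the posting does not specify the intended design.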
Desired Qualifications
  • Experience with deep learning model inference optimization.
  • Knowledge of data structures used in machine learning for memory and compute efficiency.
  • Experience with hardware-specific optimization, especially on custom hardware such as d-Matrix's, is an advantage.

d-Matrix focuses on improving the efficiency of AI computing for large datacenter customers. Its main product is the digital in-memory compute (DIMC) engine, which combines computing capabilities directly with programmable memory. This design helps reduce power consumption and enhances data processing speed while ensuring accuracy. d-Matrix differentiates itself from competitors by offering a modular and scalable approach, utilizing low-power chiplets that can be tailored for different applications. The company's goal is to provide high-performance, energy-efficient AI inference solutions to large-scale datacenter operators.

Company Stage

Series B

Total Funding

$149.8M

Headquarters

Santa Clara, California

Founded

2019

Growth & Insights
Headcount

6 month growth

2%

1 year growth

0%

2 year growth

9%

Simplify's Take

What believers are saying

  • Growing demand for energy-efficient AI solutions boosts the appeal of d-Matrix's low-power chiplets.
  • Partnerships with companies like Microsoft could lead to strategic alliances.
  • Increasing adoption of modular AI hardware in data centers benefits d-Matrix's offerings.

What critics are saying

  • Competition from Nvidia, AMD, and Intel may pressure d-Matrix's market share.
  • Complex AI chip design could lead to delays or increased production costs.
  • Rapid AI innovation may render d-Matrix's technology obsolete if not updated.

What makes d-Matrix unique

  • d-Matrix's DIMC engine integrates compute into memory, enhancing efficiency and accuracy.
  • The company offers scalable AI solutions through modular, low-power chiplets.
  • d-Matrix focuses on brain-inspired AI compute engines for diverse inferencing workloads.

Benefits

Hybrid Work Options