Full-Time

Machine Learning Performance Architect

Posted on 4/10/2024

d-Matrix

51-200 employees

AI compute platform using in-memory computing

Data & Analytics
Hardware
AI & Machine Learning

Junior

Santa Clara, CA, USA

Required Skills
Python
TensorFlow
Data Structures & Algorithms
PyTorch
Linux/Unix
Requirements
  • MS in EE, Computer Science, Engineering, Math, Physics, or a related field, plus 5 years of industry experience; PhD with 1+ year of industry experience preferred.
  • Strong grasp of computer architecture, data structures, system software, and machine learning fundamentals.
  • Experience with performance modeling, analysis, and correlation (with RTL) of GPU/AI accelerator architectures.
  • Proficient in C/C++ or Python development in a Linux environment using standard development tools.
  • Experience with deep learning frameworks such as PyTorch and TensorFlow.
  • Experience with inference servers/model-serving frameworks such as Triton, TF Serving, and Kubeflow.
  • Experience with distributed collective-communication libraries such as NCCL and OpenMPI.
  • Experience with MLOps from model definition to deployment, including training, quantization, sparsity, and model preprocessing.
  • Self-motivated team player with a strong sense of ownership and leadership.
  • Prior startup, small team or incubation experience.
  • Work experience at a cloud provider or AI compute / sub-system company.
  • Experience with open-source ML compiler frameworks such as MLIR.
Responsibilities
  • Design space exploration and workload characterization/mapping, spanning both the data plane and the control plane of the SoC.
  • Design, model, and drive new architectural features that shape next-generation hardware.
  • Evaluate the performance of cutting-edge AI workloads.
  • Build and scale software deliverables in a tight development window.
  • Work with a team of hardware architects to build out the modeling infrastructure, and collaborate closely with the company's other software (ML, systems, compiler) and hardware (mixed-signal, DSP, CPU) experts.

d-Matrix is developing a unique AI compute platform that uses in-memory computing (IMC) techniques with chiplet-level scale-out interconnects to advance datacenter AI inferencing. Its innovative circuit techniques, ML tools, software, and algorithms address the memory-compute integration problem, improving AI compute efficiency.

Company Stage

Series B

Total Funding

$161.5M

Headquarters

Santa Clara, California

Founded

2019

Growth & Insights
Headcount

6 month growth

0%

1 year growth

33%

2 year growth

203%