Full-Time
Machine Learning Performance Architect
Posted on 4/10/2024
AI compute platform using in-memory computing
Data & Analytics
Hardware
AI & Machine Learning
Junior
Santa Clara, CA, USA
Required Skills
Python
TensorFlow
Data Structures & Algorithms
PyTorch
Linux/Unix
Requirements
- MS in EE, Computer Science, Engineering, Math, Physics, or a related field plus 5 years of industry experience; PhD with 1+ year of industry experience preferred.
- Strong grasp of computer architecture, data structures, system software, and machine learning fundamentals.
- Experience with performance modeling, analysis, and correlation (with RTL) of GPU/AI accelerator architectures.
- Proficient in C/C++ or Python development in a Linux environment using standard development tools.
- Experience with deep learning frameworks (such as PyTorch, TensorFlow).
- Experience with inference servers/model-serving frameworks (such as Triton, TF Serving, Kubeflow).
- Experience with distributed communication collectives such as NCCL and Open MPI.
- Experience with the MLOps lifecycle, from definition through deployment, including training, quantization, sparsity, and model preprocessing.
- Self-motivated team player with a strong sense of ownership and leadership.
- Prior startup, small team or incubation experience.
- Work experience at a cloud provider or AI compute / sub-system company.
- Experience with open-source ML compiler frameworks such as MLIR.
Responsibilities
- Perform design-space exploration and workload characterization/mapping spanning both the data plane and the control plane of the SoC.
- Design, model, and drive new architectural features to help define next-generation hardware.
- Evaluate the performance of cutting-edge AI workloads.
- Build and scale software deliverables in a tight development window.
- Work with a team of hardware architects to build out the modeling infrastructure, collaborating closely with the company's other software (ML, systems, compiler) and hardware (mixed-signal, DSP, CPU) experts.
d-Matrix is developing a unique AI compute platform that uses in-memory computing (IMC) techniques with chiplet-level scale-out interconnects to revolutionize datacenter AI inferencing. Its circuit techniques, ML tools, software, and algorithms address the memory-compute integration problem, improving AI compute efficiency.
Company Stage
Series B
Total Funding
$161.5M
Headquarters
Santa Clara, California
Founded
2019
Growth & Insights
Headcount