Full-Time

AI Simulation Architect

Confirmed live in the last 24 hours

Tenstorrent

201-500 employees

Develops AI/ML hardware accelerators and software

AI & Machine Learning
Hardware
Data & Analytics

Compensation Overview

$100k - $500k Annually

Expert

Remote in USA

US Citizenship Required

Category
Applied Machine Learning
AI Research
AI & Machine Learning
Required Skills
Python
CUDA
Requirements
  • 15+ years of experience.
  • Experience coding performance models in C++.
  • Bachelor's or Master's degree in Computer Engineering, Electrical Engineering, or a related field. A Ph.D. is a plus.
  • Strong expertise in high-performance computing architecture design, including processors, accelerators, interconnects, and memory subsystems.
  • Experience developing new architectures using large-scale performance simulation environments, e.g., GEM5 or SST.
  • Experience analyzing workload behavior on large systems using open-source or custom software tools.
  • Proven experience in designing and optimizing HPC architectures for scientific, research, or data-intensive applications.
  • Proficiency in parallel programming models and frameworks, such as OpenMP, MPI, CUDA, or OpenCL, and their application to HPC workloads.
  • Solid understanding of performance analysis and optimization techniques for parallel computing, including profiling, tracing, and performance counters.
  • Familiarity with industry-standard interconnects and network fabrics, such as InfiniBand, Ethernet, or Omni-Path, and their impact on HPC system performance.
  • Knowledge of memory subsystems and memory hierarchy designs, including cache coherence protocols, memory models, and NUMA architectures.
  • Experience with HPC software stack components, such as compilers, runtime systems, job schedulers, and scientific libraries.
  • Strong programming skills in languages commonly used in HPC, such as C, C++, Fortran, or Python.
  • Excellent problem-solving abilities and the ability to analyze and address complex performance and scalability challenges.
  • Strong communication and collaboration skills to work effectively with cross-functional teams and domain experts.
Responsibilities
  • Design simulation models/environments for large-scale AI/HPC systems consisting of tens of thousands of computational nodes, scale-out/scale-up switches/interconnects, and heterogeneous caching/memory systems.
  • Define simulation abstraction layers to manage different levels of the simulation hierarchy, from abstract analytical roofline models to detailed cycle-accurate models, balancing simulation speed against accuracy (a minimal roofline sketch follows this list).
  • Conduct performance analysis and benchmarking, writing performance models to identify bottlenecks, optimize system parameters, and guide architectural enhancements.
  • Simulate, design, and lead the development of high-performance computing architectures that deliver exceptional computational performance, scalability, and energy efficiency.
  • Collaborate with hardware engineers to design and optimize computational components, including processors, accelerators, interconnects, and memory subsystems.
  • Work closely with software developers to define and implement software development frameworks, libraries, and tools that maximize performance and productivity on the target HPC architecture.
  • Define and recommend system-level requirements, including processing power, memory capacity, I/O bandwidth, and storage capabilities, ensuring compliance with industry standards and customer expectations.
  • Evaluate and select appropriate technologies, including processors, accelerators, and network fabrics, based on application requirements, performance & power characteristics, and cost considerations.
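For context on the "abstract analytical roofline model" end of the simulation hierarchy referenced above, here is a minimal Python sketch (illustrative only): it bounds a kernel's attainable throughput from its arithmetic intensity and two assumed hardware peaks. The peak compute and bandwidth figures are hypothetical placeholders, not Tenstorrent specifications.

    # Minimal analytical roofline sketch: attainable FLOP/s is capped either by
    # peak compute or by memory bandwidth times arithmetic intensity.
    # Peak figures below are assumed for illustration only.
    PEAK_FLOPS = 100e12   # assumed peak compute throughput, FLOP/s
    PEAK_BW = 1.0e12      # assumed peak memory bandwidth, bytes/s

    def roofline_bound(flops: float, bytes_moved: float) -> float:
        """Upper bound on attainable FLOP/s under the roofline model."""
        intensity = flops / bytes_moved            # arithmetic intensity, FLOP/byte
        return min(PEAK_FLOPS, intensity * PEAK_BW)

    if __name__ == "__main__":
        # Example: square fp32 matrix multiply, 2*n^3 FLOPs over a minimum of
        # 3*n^2 operands/results at 4 bytes each.
        n = 4096
        flops = 2 * n**3
        bytes_moved = 3 * n * n * 4
        print(f"arithmetic intensity: {flops / bytes_moved:.1f} FLOP/byte")
        print(f"roofline bound: {roofline_bound(flops, bytes_moved) / 1e12:.2f} TFLOP/s")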

Tenstorrent specializes in developing high-quality AI/ML accelerators, including individual PCIe cards, workstations, servers, and ultra-dense Galaxy pods, featuring industry-standard chiplets and a modular RISC-V CPU. Their products also incorporate the BUDA software framework, designed for open-source collaboration and enabling efficient, scalable use of their hardware for deep learning applications.

Company Stage

Series C

Total Funding

$334.5M

Headquarters

Toronto, Canada

Founded

2016

Growth & Insights

Headcount growth
  • 6-month: 19%
  • 1-year: 39%
  • 2-year: 125%

Simplify's Take

What believers are saying

  • The launch of next-generation Wormhole-based developer kits and workstations could attract a significant developer community, driving innovation and adoption.
  • Collaborations with industry giants like Hyundai and Rapidus indicate strong growth potential and access to advanced manufacturing technologies.
  • The introduction of specialized AI inference acceleration boards like the Grayskull e75 and e150 can capture a niche market in AI and machine learning applications.

What critics are saying

  • The competitive landscape in AI hardware is intense, with major players like NVIDIA and Intel posing significant challenges.
  • Dependence on strategic partnerships for advanced manufacturing and technology development could lead to vulnerabilities if these partnerships falter.

What makes Tenstorrent unique

  • Tenstorrent's use of RISC-V architecture in their AI processors offers a unique alternative to traditional x86 and ARM architectures, providing flexibility and open-source benefits.
  • Their focus on high-performance AI chips and scalable developer kits positions them as a key player in the AI hardware market, particularly for developers seeking robust multi-chip solutions.
  • Strategic partnerships with global entities like Rapidus and C-DAC enhance their capabilities in cutting-edge semiconductor technology and edge AI processing.