Full-Time

Acceleration Kernel Developer

Tenstorrent

201-500 employees

Develops AI/ML hardware accelerators and software

AI & Machine Learning
Hardware

Compensation Overview

$100k - $500k Annually

Mid, Senior

Santa Clara, CA, USA

Hybrid role based out of Santa Clara, CA.

US Citizenship Required

Category
Embedded Engineering
Software Engineering
Required Skills
Software Testing
Requirements
  • Bachelor's degree in Computer Science, Software Engineering, or a related field.
  • Proven experience in kernel development, with a strong focus on low-level optimizations and tensor optimization.
  • Proficiency in C/C++ programming languages.
  • Familiarity with machine learning frameworks and concepts.
  • Strong problem-solving skills and the ability to analyze and debug complex issues.
  • Experience with performance profiling and optimization tools.
  • Excellent communication and teamwork skills.
  • Self-motivated, detail-oriented, and able to work independently as well as in a team.
Responsibilities
  • Participate in the design, development, and maintenance of kernel-level software components for our applications. Develop and optimize kernels and kernel libraries for efficient machine learning and HPC applications.
  • Implement kernels that optimize tensor compute and tensor data movement.
  • Maintain a heavy focus on optimization.
  • Analyze and optimize low-level code to improve the performance and efficiency of our software, with a strong emphasis on tensor optimization.
  • Collaborate with machine learning engineers and data scientists to integrate optimized kernels and low-level routines into machine learning frameworks and pipelines.
  • Identify performance bottlenecks, conduct performance profiling, and develop strategies to address and resolve them.
  • Write comprehensive unit tests, conduct thorough debugging, and ensure the stability and reliability of kernel-level code.
  • Create clear and concise documentation for code, APIs, and best practices to facilitate collaboration within the team.
  • Stay up-to-date with the latest developments in kernel development, tensor optimization, and machine learning to propose innovative solutions and improvements.

Tenstorrent specializes in developing high-quality AI/ML accelerators, including individual PCIe cards, workstations, servers, and ultra-dense Galaxy pods, featuring industry-standard chiplets and a modular RISC-V CPU. Its products also incorporate the BUDA software framework, which is designed for open-source collaboration and enables efficient and scalable hardware for deep learning applications.

Company Stage

Series C

Total Funding

$334.5M

Headquarters

Toronto, Canada

Founded

2016

Growth & Insights
Headcount

6 month growth

19%

1 year growth

39%

2 year growth

125%

Simplify's Take

What believers are saying

  • The launch of next-generation Wormhole-based developer kits and workstations could attract a significant developer community, driving innovation and adoption.
  • Collaborations with industry giants like Hyundai and Rapidus indicate strong growth potential and access to advanced manufacturing technologies.
  • The introduction of specialized AI inference acceleration boards like the Grayskull e75 and e150 can capture a niche market in AI and machine learning applications.

What critics are saying

  • The competitive landscape in AI hardware is intense, with major players like NVIDIA and Intel posing significant challenges.
  • Dependence on strategic partnerships for advanced manufacturing and technology development could lead to vulnerabilities if these partnerships falter.

What makes Tenstorrent unique

  • Tenstorrent's use of the RISC-V architecture in its AI processors offers an alternative to traditional x86 and ARM architectures, providing flexibility and open-source benefits.
  • Their focus on high-performance AI chips and scalable developer kits positions them as a key player in the AI hardware market, particularly for developers seeking robust multi-chip solutions.
  • Strategic partnerships with global entities like Rapidus and C-DAC enhance their capabilities in cutting-edge semiconductor technology and edge AI processing.