Senior NN Kernel Engineer

Untether AI

51-200 employees

AI acceleration using at-memory computing

AI & Machine Learning


Toronto, ON, Canada

Required Skills
Data Structures & Algorithms
  • Computer Science, Engineering, Math, Physics or related degree, preferably MS or PhD
  • Deep knowledge of modern C++ with emphasis on code generation and low-level compute optimizations
  • Knowledge of basic neural network operator algorithms: convolutions, transformers, RNNs
  • Demonstrated ability to work independently through challenging but tightly constrained problems
  • Interest and ability to work with both high level conceptual and very low-level technical details
  • Python experience
  • Experience with other AI accelerator programming
  • Strong mathematical skills
  • Enjoyment of solving very complex problems (such as IQ tests or tricky math problems)

Responsibilities
  • Design, prototype and implement flexible low-level C++ programs (kernels) for various neural net operations
  • Design, document and communicate configuration APIs for these kernels to the compiler team
  • Communicate performance optimization ideas both to compiler engineers and to architects working on future product generations
  • Design overall computation strategies across kernels for multi-kernel and multi-chip neural net implementations

Untether AI specializes in runAI200® devices and tsunAImi® accelerator cards that use at-memory computing to accelerate AI inference for neural networks in areas such as vision and natural language processing, delivering industry-leading efficiency and improved accuracy. Its chip architecture eliminates the data-movement bottleneck of traditional architectures, enabling ultra-efficient, high-performance AI chips for new frontiers in AI applications.

Company Stage

Series B

Total Funding



Toronto, Canada



