Full-Time

Principal Inference Stack Engineer

Groq

201-500 employees

AI inference technology for scalable solutions

AI & Machine Learning

Compensation Overview

$248.7k - $407.1k Annually

Expert

Remote in Canada

The company is geo-agnostic, meaning employees can work from anywhere, but some roles may require being located near primary sites.

Category
Applied Machine Learning
Deep Learning
AI & Machine Learning
Required Skills
TensorFlow
PyTorch
Machine Learning
C/C++
FPGA

Requirements
  • 10+ years of experience in computer science/engineering or a related field
  • 5+ years of direct experience with C/C++ and runtime frameworks
  • Knowledge of LLVM and compiler architecture
  • Experience with mapping HPC, ML, or Deep Learning workloads to accelerators
  • Knowledge of spatial architectures such as FPGAs or CGRAs an asset
Responsibilities
  • Analyze the latest ML workloads from Groq partners or cloud deployments, and develop optimization roadmaps and strategies to improve the inference performance and operating efficiency of each workload
  • Design, develop, and maintain the optimizing compiler for Groq's LPU
  • Expand the Groq runtime API to simplify the execution model of Groq LPUs
  • Benchmark and analyze output produced by the optimizing compiler and runtime, and drive enhancements to improve its quality-of-results as measured on Groq LPU hardware
  • Manage large multi-person, multi-geo projects and interface with leads across the company
  • Mentor junior compiler engineers and collaborate with other senior compiler engineers on the team
  • Review and accept code updates to compiler passes and IR definitions
  • Work with hardware teams and architects to drive improvements in the architecture and the software compiler
  • Publish novel compilation techniques for Groq's TSP at top-tier ML, applications, compiler, and computer architecture conferences
Desired Qualifications
  • Knowledge of distributed systems and disaggregated compute desired
  • Knowledge of functional programming an asset
  • Experience with ML frameworks such as TensorFlow or PyTorch desired
  • Knowledge of ML IR representations such as ONNX

Groq specializes in AI inference technology, providing the Groq LPU™, which is known for its high compute speed, quality, and energy efficiency. The Groq LPU™ is designed to handle AI processing tasks quickly and effectively, making it suitable for both cloud and on-premises applications. Unlike many competitors, Groq's products are designed, fabricated, and assembled in North America, which helps maintain high standards of quality and performance. The company targets a variety of clients across different industries that require fast and efficient AI processing capabilities. Groq's goal is to deliver scalable AI inference solutions that meet the growing demands for rapid data processing in the AI and machine learning market.

Company Stage

Series D

Total Funding

$1.3B

Headquarters

Mountain View, California

Founded

2016

Growth & Insights

Headcount
  • 6 month growth: 5%
  • 1 year growth: -1%
  • 2 year growth: -5%
Simplify's Take

What believers are saying

  • Groq secured $640M in Series D funding, boosting expansion and talent acquisition.
  • Partnership with Aramco Digital to build a large data center enhances market presence.
  • Integration with Touchcast's Cognitive Caching sets new standards in AI processing speeds.

What critics are saying

  • DeepSeek's R1 model poses a competitive threat with its cost-effective capabilities.
  • SambaNova and Gradio's integration may reduce Groq's competitive edge in AI inference.
  • Geopolitical risks may impact the Saudi Arabia data center project with Aramco Digital.

What makes Groq unique

  • Groq's LPU offers exceptional compute speed and energy efficiency for AI inference.
  • The company emphasizes deterministic performance, ensuring predictable AI computation outcomes.
  • Groq's products are designed and assembled in North America, ensuring high quality.

Benefits

Remote Work Options

Company Equity