Full-Time

Machine Learning Compiler Frontend Engineer

Posted on 4/18/2024

Etched.ai

11-50 employees

Develops servers for transformer inference technology

Hardware

Senior

Cupertino, CA, USA

Required Skills

Python

Requirements

5+ years of experience writing production-grade software
Able to write production-grade code in Python
Experience building products with LLMs
Experience with at least one of TensorRT, TensorRT-LLM, Transformer Engine, or vLLM
Strong understanding of how companies working with LLMs build their inference stacks
1+ year of work experience at a cloud provider
Deeply creative and able to think from first principles

Responsibilities

Design and develop our integrations with current transformer-specific inference libraries, like TensorRT-LLM, Transformer Engine, Hugging Face TGI, and vLLM
Provide feedback to the firmware, compiler, and hardware teams based on compiler development work
Ensure the software we expose to customers is reliable and production-grade as soon as our servers begin to ship

ML Compiler Frontend Engineer

Etched is building the hardware for superintelligence.

GPUs and TPUs are flexible AI chips that can run many kinds of models: CNNs, RNNs, LSTMs, and more. But today, almost all AI workloads, from ChatGPT to self-driving cars, run on one model architecture: transformers. Using flexible AI chips for transformers is very inefficient: <5% of the transistors on an H100 are used for matrix multiplication!

Etched is building a single-purpose chip exclusively for transformer inference. We only support transformers, but in exchange our chips have an order of magnitude more throughput and lower latency than an H100. With Etched, you can build products that would be impossible with GPUs, like tree-of-thought agents and ultra-low-latency audio chatbots.

Etched is looking for exceptional ML compiler frontend engineers to join our team and build production-grade integrations with today’s transformer libraries. The ideal candidate has experience working closely with LLMs in products and also understands how efficient inference works under the hood.

Desired qualifications:

Experience with hardware design and development
Proficiency with GPU programming
Experience working in hardware simulation/emulation environments

Benefits:

Competitive salary and equity package
Full medical, dental, and vision packages, with 100% of premiums covered
Work with world-class people and state-of-the-art AIs every day

Etched is committed to fair and equitable compensation practices. Compensation is determined based on your qualifications and experience. Compensation packages also include generous equity in Etched.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Etched.ai

This company is an excellent workplace for those passionate about cutting-edge server technology, specifically in the realm of transformer inference. By specializing in powerful servers that build the transformer architecture directly into its chips, the company leads in delivering high-performance computing solutions. This focus not only drives industry standards but also fosters a culture of technical excellence and continuous innovation among its team.

Company Stage

Seed

Total Funding

$5.4M

Headquarters

Cupertino, California

Founded

2022

Growth & Insights

Headcount

6 month growth

↑ 125%

1 year growth

↑ 1250%

2 year growth

↑ 1250%