Full-Time

Machine Learning Compiler Frontend Engineer

Posted on 4/18/2024

Etched.ai

11-50 employees

Develops servers for transformer inference technology

Hardware

Senior

Cupertino, CA, USA

Required Skills
Python
Requirements
  • 5+ years of experience writing production-grade software
  • Able to write production-grade code in Python
  • Experience building products with LLMs
  • Experience with at least one of TensorRT, TensorRT-LLM, Transformer Engine, or vLLM
  • Strong understanding of how companies working with LLMs build their inference stacks
  • 1+ year of work experience at a cloud provider
  • Deeply creative and able to think from first principles
Responsibilities
  • Design and develop our integrations with current transformer-specific inference libraries, like TensorRT-LLM, Transformer Engine, Hugging Face TGI, and vLLM
  • Provide feedback to the firmware, compiler, and hardware teams based on compiler development work
  • Ensure the software we expose to customers is reliable and production-grade as soon as our servers begin to ship

Etched.ai suits engineers who want to work on server hardware built specifically for transformer inference. The company integrates the transformer architecture into its chips and develops servers based on them, with the aim of delivering high-performance inference. That narrow focus supports a culture of technical excellence and continuous innovation.

Company Stage

Seed

Total Funding

$5.4M

Headquarters

Cupertino, California

Founded

2022

Growth & Insights
Headcount

6 month growth

125%

1 year growth

1250%

2 year growth

1250%