Full-Time

Machine Learning Applications Engineer

Posted on 4/18/2024

Etched.ai

11-50 employees

Develops servers for transformer inference technology

Hardware

Junior

Cupertino, CA, USA

Required Skills
Python
Requirements
  • Deeply creative and able to think from first principles
  • Good understanding of LLM architectures and how to use them to build applications
  • 1+ year(s) of work experience at a cloud provider, AI company, or LLM startup
  • Experience writing performant real-time code and proficiency in Python
  • Breadth of knowledge about current research on large language models
Responsibilities
  • Provide input to engineers designing our integrations with current transformer-specific inference libraries such as TensorRT-LLM, TransformerEngine, Hugging Face TGI, and vLLM
  • Help profile and understand where latency comes from in modern LLM serving stacks
  • Help customers create products that leverage the unique capabilities of model-specific silicon

This company is an excellent workplace for those passionate about cutting-edge server technology, specifically in the realm of transformer inference. By specializing in powerful servers that integrate advanced transformer architecture into its chips, the company leads in delivering high-performance computing solutions. This focus not only drives industry standards but also fosters a culture of technical excellence and continuous innovation among its team.

Company Stage

Seed

Total Funding

$5.4M

Headquarters

Cupertino, California

Founded

2022

Growth & Insights
Headcount

6 month growth

125%

1 year growth

1250%

2 year growth

1250%