Machine Learning Engineer
LLM Infrastructure
Posted on 9/7/2023
Scale AI

51-200 employees

Data platform for AI
Company Overview
Scale AI's mission is to accelerate the development of AI applications.
Locations
San Francisco, CA, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
AWS
CUDA
Docker
Google Cloud Platform
Terraform
Kubernetes
Python
Categories
AI & Machine Learning
DevOps & Infrastructure
Software Engineering
Requirements
  • 2+ years of experience building machine learning training pipelines or inference services in a production setting
  • Experience with LLM deployment, fine-tuning, training, prompt engineering, etc.
  • Experience with LLM inference latency-optimization techniques (e.g., kernel fusion, quantization, dynamic batching)
  • Experience with CUDA, model compilers, and other model-specific optimizations
Responsibilities
  • Build highly available, observable, performant, and cost-effective APIs for LLM inference and fine-tuning
  • Engage with ML researchers and stay up to date on the latest developments from industry and academia
  • Participate in our team's on-call rotation to ensure the availability of our services
  • Own projects end-to-end, from requirements and scoping through design and implementation, in a highly collaborative, cross-functional environment
  • Exercise good taste in building systems and tools, and know when to make build-vs.-buy tradeoffs with an eye for cost efficiency
Desired Qualifications
  • Experience working with a cloud technology stack (e.g., AWS or GCP)
  • Experience building, deploying, and monitoring complex microservice architectures
  • Experience with Python, Docker, Kubernetes, and infrastructure as code (e.g., Terraform)