Full-Time

Distributed LLM Inference Engineer

Anyscale

501-1,000 employees

Platform for scaling AI workloads

Enterprise Software
AI & Machine Learning

Compensation Overview

$170.1k - $247k Annually

Mid, Senior

Palo Alto, CA, USA

More locations: San Francisco, CA, USA

This is a hybrid role, requiring in-office attendance.

Category
Applied Machine Learning
Deep Learning
AI & Machine Learning
Required Skills
PyTorch

Requirements
  • Familiarity with running ML inference at large scale with high throughput
  • Familiarity with deep learning and deep learning frameworks (e.g. PyTorch)
  • Solid understanding of distributed systems and ML inference challenges
Responsibilities
  • Iterate quickly with product teams to ship end-to-end solutions for batch and online inference at high scale, for use by Anyscale customers
  • Work across the stack, integrating Ray Data with LLM engines and optimizing each layer to deliver low-cost solutions for large-scale ML inference
  • Integrate with open-source software such as vLLM, work closely with the community to adopt its techniques in Anyscale solutions, and contribute improvements back to open source
  • Follow the state of the art in the open-source and research communities, implementing and extending best practices
Desired Qualifications
  • ML Systems knowledge
  • Experience using Ray Data
  • Experience working closely with the community on LLM engines such as vLLM and TensorRT-LLM
  • Contributions to deep learning frameworks (PyTorch, TensorFlow)
  • Contributions to deep learning compilers (Triton, TVM, MLIR)
  • Prior experience working on GPUs / CUDA

Anyscale provides a platform designed to scale and productionize artificial intelligence (AI) and machine learning (ML) workloads. Its main product, Ray, is an open-source framework that helps developers manage and scale AI applications across various fields, including Generative AI, Large Language Models (LLMs), and computer vision. Ray allows companies to enhance the performance, fault tolerance, and scalability of their AI systems, with some users reporting over 90% improvements in efficiency, latency, and cost-effectiveness. Anyscale serves a range of clients, including major tech companies like OpenAI and Ant Group, who rely on Ray to train their largest models. The company operates on a software-as-a-service (SaaS) model, charging clients a subscription fee for access to the Ray platform, which ensures a consistent revenue stream. Anyscale's goal is to empower organizations to effectively scale their AI workloads and optimize their operations.

Company Stage

Series C

Total Funding

$252.5M

Headquarters

San Francisco, California

Founded

2019

Growth & Insights
Headcount

6 month growth

16%

1 year growth

-5%

2 year growth

-13%

Simplify's Take

What believers are saying

  • Anyscale's $100M Series C funding indicates strong investor confidence and growth potential.
  • Partnership with Nvidia enhances performance and cost-efficiency for AI deployments.
  • Anyscale Endpoints offers 10X cost-efficiency for popular open-source LLMs.

What critics are saying

  • ShadowRay vulnerability in the Ray framework poses a significant security risk with no patch.
  • OctoML's OctoAI service increases competition in AI infrastructure market.
  • Dependency on Nvidia's technology could be risky if Nvidia faces issues.

What makes Anyscale unique

  • Anyscale's Ray framework scales AI applications seamlessly from a laptop to the cloud.
  • Ray is widely used in Generative AI, LLMs, and computer vision fields.
  • Anyscale's SaaS model provides recurring revenue through subscription fees for the Ray platform.

Benefits

Medical, Dental, and Vision insurance

401(k) retirement savings

Flexible time off

FSA and Commuter benefits

Parental and family leave

Office & phone plan reimbursement