Full-Time

Software Engineer

Model Training Infrastructure

Anyscale

501-1,000 employees

Platform for scaling AI workloads

Enterprise Software
AI & Machine Learning

Compensation Overview

$170.1k - $237k Annually

Senior

Palo Alto, CA, USA; San Francisco, CA, USA

This is a hybrid position, requiring in-office presence.

Category
Backend Engineering
FinTech Engineering
Software Engineering
Required Skills
TensorFlow
Data Structures & Algorithms
PyTorch
Machine Learning

Requirements
  • 5+ years of experience building, scaling, and maintaining software systems in production environments
  • Strong fundamentals in algorithms, data structures, and system design
  • Proficiency with machine learning frameworks and libraries (e.g., PyTorch, TensorFlow, XGBoost)
  • Experience designing fault-tolerant distributed systems
  • Solid architectural skills
Responsibilities
  • Develop scalable, fault-tolerant distributed machine learning libraries that power leading ML platforms
  • Create an exceptional end-to-end experience for training machine learning models
  • Solve complex architectural challenges and transform them into practical solutions
  • Contribute to and engage with the open-source community, collaborating with ML researchers, engineers, and data scientists to build new scalable machine learning abstractions
  • Share your work and expertise with a broader audience through talks, tutorials, and blog posts
  • Collaborate with a team of experts in distributed systems and machine learning
  • Work directly with end-users to iterate on and enhance the product based on their feedback
  • Partner with engineering and product managers to nurture a talented team of software engineers
  • Play a key role in building and shaping a world-class company
Desired Qualifications
  • Experience with cloud technologies (AWS, GCP, Kubernetes)
  • Hands-on experience building ML training platforms in production
  • Background in managing and maintaining open-source libraries
  • Experience leading small teams to achieve ambitious technical goals
  • Familiarity with Ray

Anyscale provides a platform designed to scale and productionize artificial intelligence (AI) and machine learning (ML) workloads. Its main product, Ray, is an open-source framework that helps users efficiently scale their AI applications across various fields, including Generative AI, Large Language Models (LLMs), and computer vision. Companies like OpenAI and Ant Group use Ray to train large models and improve the performance and reliability of their ML systems. Anyscale's platform improves scalability, latency, and cost-efficiency, with some clients seeing improvements of more than 90% in these areas. The company operates on a software-as-a-service (SaaS) model: clients subscribe to access Ray and its features, which gives the company recurring revenue. Anyscale's goal is to help organizations optimize their AI workloads and improve operational efficiency.

Company Size

501-1,000

Company Stage

Series C

Total Funding

$252.5M

Headquarters

San Francisco, California

Founded

2019

Simplify Jobs

Simplify's Take

What believers are saying

  • Anyscale's $100M Series C funding indicates strong investor confidence and growth potential.
  • Partnership with Nvidia enhances performance and cost-efficiency for AI deployments.
  • Anyscale Endpoints offers 10X cost-efficiency for popular open-source LLMs.

What critics are saying

  • ShadowRay vulnerability in Ray framework poses significant security risk with no patch.
  • OctoML's OctoAI service increases competition in AI infrastructure market.
  • Dependency on Nvidia's technology could be risky if Nvidia faces issues.

What makes Anyscale unique

  • Anyscale's Ray framework scales AI applications from laptops to cloud seamlessly.
  • Ray is widely used in Generative AI, LLMs, and computer vision fields.
  • Anyscale's SaaS model provides recurring revenue through subscription fees for Ray platform.

Benefits

Medical, Dental, and Vision insurance

401K retirement savings

Flexible time off

FSA and Commuter benefits

Parental and family leave

Office & phone plan reimbursement

Growth & Insights and Company News

Headcount

6 month growth

8%

1 year growth

0%

2 year growth

-13%

Blockchain News
Oct 29th, 2024
Anyscale and Astronomer Collaborate to Enhance Scalable Machine Learning

This partnership allows organizations to effectively manage and scale their ML workflows by integrating Astronomer's workflow management capabilities with Anyscale's distributed computing power.

Datanami
Oct 1st, 2024
Anyscale Unveils New Products and AI Platform Enhancements at Ray Summit 2024

Financial Post
Jul 31st, 2024
Anyscale Names Industry Veteran Keerti Melkote Chief Executive Officer

SAN FRANCISCO, July 31, 2024 (GLOBE NEWSWIRE) - Anyscale, the company behind Ray, the open source framework for scalable AI, named industry veteran Keerti Melkote as chief executive officer following a year of 4x revenue growth and explosive open source adoption.

Blockchain News
Jun 6th, 2024
Anyscale and deepsense.ai Collaborate on Cross-Modal Search for E-commerce

Anyscale and deepsense.ai develop a scalable cross-modal image retrieval system for e-commerce.

VentureBeat
Mar 27th, 2024
‘ShadowRay’ Vulnerability on Ray Framework Exposes Thousands of AI Workloads, Compute Power and Data

Thousands of companies use the Ray framework to scale and run highly complex, compute-intensive AI workloads; in fact, you’d be hard-pressed to find a large language model (LLM) that hasn’t been built on Ray. Those workloads contain loads of sensitive data, which, researchers have found, could be highly exposed through a critical vulnerability (CVE) in the open-source unified compute framework. For the last seven months, this flaw has allowed attackers to exploit thousands of companies’ AI production workloads, computing power, credentials, passwords, keys, tokens and “a trove” of other sensitive information, according to new research from Oligo Security. The vulnerability is under dispute, meaning that it is not considered a risk and has no patch. This makes it a “shadow vulnerability,” or one that doesn’t appear in scans. Fittingly, researchers have dubbed it “ShadowRay.”