Full-Time

Distributed ML Systems Engineer - Inference

Together AI

51-200 employees

Decentralized cloud services for AI development

Enterprise Software
AI & Machine Learning

Compensation Overview

$160k - $230k Annually

+ Equity + Benefits

Mid, Senior

San Francisco, CA, USA

Category
Applied Machine Learning
AI Research
AI & Machine Learning
Required Skills
Rust
Microsoft Azure
Python
Machine Learning
Operating Systems
AWS
Go
C/C++
Google Cloud Platform

Requirements
  • 3+ years of experience in building large-scale, fault-tolerant, high-performance distributed systems.
  • Strong programming skills in one or more of Python, Go, Rust, or C/C++.
  • Excellent understanding of low-level operating systems concepts, including multi-threading, memory management, networking, storage, performance, and scale.
  • Experience with cloud computing platforms (AWS, GCP, Azure, etc.) and large-scale infrastructure.
  • Strong problem-solving skills and ability to work in a fast-paced environment.
Responsibilities
  • Design and build large-scale, distributed machine learning systems that are fault-tolerant and high-performance.
  • Develop and optimize distributed processing frameworks and storage systems.
  • Collaborate with researchers, engineers, and product managers to integrate ML systems into our infrastructure.
  • Conduct architecture and design reviews to ensure best practices in system design.
  • Implement robust monitoring and logging systems to ensure the health and performance of our ML systems.
Desired Qualifications
  • Experience with Kubernetes
  • Experience with PyTorch

Together AI focuses on enhancing artificial intelligence through open-source contributions. The company offers decentralized cloud services that allow developers and researchers from various organizations to train, fine-tune, and deploy generative AI models. Their services cater to a wide range of clients, including small startups, large enterprises, and academic institutions. Together AI's business model is based on providing cloud-based solutions that support the development and deployment of AI models, generating revenue through service subscriptions and usage fees. The company stands out from competitors by emphasizing open and transparent AI systems, which fosters innovation and aims to achieve beneficial outcomes for society.

Company Stage

Series A

Total Funding

$222.3M

Headquarters

Menlo Park, California

Founded

2022

Growth & Insights
Headcount

6 month growth

0%

1 year growth

5%

2 year growth

6%

Simplify's Take

What believers are saying

  • Together AI leverages Meta's Llama 3.2 Vision, expanding multimodal AI capabilities.
  • FlashAttention-3 optimizes Nvidia GPUs, reducing costs for Together AI's cloud services.
  • Falling AI model costs, as seen with DeepSeek R1, allow Together AI to offer cost-effective solutions.

What critics are saying

  • DeepSeek R1's low-cost model could undercut Together AI's pricing strategy.
  • Integration challenges from acquiring CodeSandbox may disrupt service continuity.
  • Free access to Meta's Llama 3.2 Vision might reduce demand for Together AI's paid services.

What makes Together AI unique

  • Together AI focuses on open-source contributions, enhancing transparency and innovation.
  • The company offers decentralized cloud services for AI model training and deployment.
  • Together AI's acquisition of CodeSandbox adds a code interpreter to its platform.

Benefits

Health Insurance

Company Equity