Full-Time

Tech Lead Manager

ML Performance

Baseten

51-200 employees

Platform for deploying and managing ML models

AI & Machine Learning

Senior

San Francisco, CA, USA

Category
Backend Engineering
Software Engineering
Required Skills
Kubernetes
Python
CUDA
PyTorch
Machine Learning
Docker
Go
C/C++
Requirements
  • Bachelor’s, Master’s, or Ph.D. in Computer Science, Engineering, or a related field.
  • 5+ years of professional experience in software engineering, with at least 2 years in a technical leadership role.
  • Proven experience managing and mentoring teams of engineers.
  • Expertise in one or more programming languages, such as Python, C++, or Go.
  • In-depth understanding of ML model performance optimization, especially using libraries and toolkits such as PyTorch, TensorRT, and CUDA.
  • Strong knowledge of containerization (Docker) and orchestration systems (Kubernetes).
  • Experience with production-level AI/ML solutions, including scaling and deploying large models.
  • Ability to balance hands-on technical work with team leadership and project management.
Responsibilities
  • Lead, mentor, and manage a team of engineers focused on developing and optimizing ML model inference and performance.
  • Oversee technical strategy and architecture decisions, driving improvements across our engineering organization.
  • Collaborate with cross-functional teams to ensure seamless integration and scalability of ML models in production environments.
  • Dive into the codebase of frameworks like TensorRT, PyTorch, CUDA, and others to identify and solve complex performance bottlenecks.
  • Drive the development and deployment of large-scale optimization techniques for various ML models, especially large language models (LLMs).
  • Own the full lifecycle of projects from inception through delivery, including planning, execution, and resource management.
  • Foster a collaborative, inclusive team environment that encourages continuous learning and growth.
Desired Qualifications
  • Experience enhancing the performance of large language models (LLMs) or similar AI systems.
  • Familiarity with LLM optimization techniques such as quantization, speculative decoding, or continuous batching.
  • Deep knowledge of GPU architecture and performance tuning.
  • Previous experience in a high-growth startup environment.

Baseten provides a platform for deploying and managing machine learning (ML) models, aimed at simplifying the process for businesses. Users can select from a library of open-source foundation models and deploy them with just two clicks, making it easier for tech companies, data scientists, and ML engineers to implement ML solutions. The platform features autoscaling, which adjusts resources based on demand, and monitoring tools for tracking performance and troubleshooting. A distinctive feature is Truss, an open-source model packaging framework that allows users to deploy any model using a command-line interface. Baseten operates on a usage-based pricing model, charging clients only for the time their models are actively deployed, which helps businesses manage costs while utilizing efficient ML infrastructure.

Company Stage

Series B

Total Funding

$58.4M

Headquarters

San Francisco, California

Founded

2019

Growth & Insights
Headcount

6 month growth

-5%

1 year growth

3%

2 year growth

3%

Simplify's Take

What believers are saying

  • Integration with Google Cloud Marketplace boosts visibility and customer acquisition potential.
  • $40M Series B funding enhances Baseten's platform capabilities and market reach.
  • Chains framework positions Baseten for complex AI workflows, attracting sophisticated projects.

What critics are saying

  • Increased competition from specialized AI models tailored for specific industries.
  • Potential over-reliance on Google Cloud Marketplace may limit flexibility and control.
  • Rapid AI model development could render Baseten's offerings obsolete without continuous innovation.

What makes Baseten unique

  • Baseten offers a serverless backend for machine-learning applications with auto-scaling.
  • Truss, an open-source model packaging framework, allows seamless deployment of custom models.
  • Baseten's platform provides comprehensive monitoring tools for efficient model performance tracking.

Benefits

💰 Competitive compensation: We aim to provide 90th percentile (or better) salaries and equity grants for every team member commensurate with their experience.

🌎 Remote-first work environment: The Baseten team is welcome to work from wherever they want: fully remote, in our San Francisco office, or a mix of both. We provide a $1,000 stipend for you to make your home office comfortable and productive.

🏓 Regular in-person team summits: We get together as a team three times a year to plan, workshop, and most importantly, get to know each other better.

🌴 Unlimited PTO: We ask that everyone take at least 4 weeks of vacation. And we have a company-wide break between Christmas and New Year's Day.

🏥 Full healthcare coverage: Medical, dental and vision insurance for you and your family.

🍼 Paid parental leave: 16 weeks of fully paid parental leave (adoptive and non-birth parents included) and flexibility with schedules while returning to work.

📈 401(k): Company-sponsored 401(k) for you to contribute to.

🧠 Learning and development budget: We encourage you to take classes, attend conferences, and invest in your craft, and we’ll cover expenses to make it happen.