Full-Time

Tech Lead Manager

Model Performance and Inference

Baseten

51-200 employees

Platform for deploying and managing ML models

AI & Machine Learning

Senior

San Francisco, CA, USA

Category
Backend Engineering
Software Engineering
Required Skills
Kubernetes
Python
CUDA
PyTorch
Docker
Go
C/C++
Requirements
  • Bachelor’s, Master’s, or Ph.D. in Computer Science, Engineering, or a related field.
  • 5+ years of professional experience in software engineering, with at least 2 years in a technical leadership role.
  • Proven experience managing and mentoring teams of engineers.
  • Expertise in one or more programming languages, such as Python, C++, or Go.
  • In-depth understanding of ML model performance optimization, especially using libraries such as PyTorch, TensorRT, and CUDA.
  • Strong knowledge of containerization (Docker) and orchestration systems (Kubernetes).
  • Experience with production-level AI/ML solutions, including scaling and deploying large models.
  • Ability to balance hands-on technical work with team leadership and project management.
Responsibilities
  • Lead, mentor, and manage a team of engineers focused on developing and optimizing ML model inference and performance.
  • Oversee technical strategy and architecture decisions, driving improvements across our engineering organization.
  • Collaborate with cross-functional teams to ensure seamless integration and scalability of ML models in production environments.
  • Dive into the codebases of frameworks such as TensorRT, PyTorch, and CUDA to identify and solve complex performance bottlenecks.
  • Drive the development and deployment of large-scale optimization techniques for various ML models, especially large language models (LLMs).
  • Own the full lifecycle of projects from inception through delivery, including planning, execution, and resource management.
  • Foster a collaborative, inclusive team environment that encourages continuous learning and growth.

Baseten provides a platform for deploying and managing machine learning (ML) models, aimed at simplifying the process for businesses. Users can select from a library of open-source foundation models and deploy them with just two clicks, making it easier to implement ML solutions without complex setup. The platform features autoscaling, which adjusts resources based on demand, and monitoring tools for tracking performance and troubleshooting. A distinctive feature is Truss, an open-source model packaging framework that allows users to deploy any model via a command-line interface. Baseten operates on a usage-based pricing model, charging clients only for the time their models are actively deployed or making predictions. The goal is to provide efficient and scalable ML infrastructure for tech companies, data scientists, and ML engineers.
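
To make the Truss workflow mentioned above concrete, here is a minimal sketch. The file layout and method names follow Truss's publicly documented conventions (a model/model.py exposing a Model class with load() and predict() methods); treat it as an illustrative example rather than Baseten's production code.

  # model/model.py -- minimal Truss model sketch (illustrative only)
  class Model:
      def __init__(self, **kwargs):
          # Truss passes runtime configuration (secrets, data directory, etc.)
          # through kwargs; a real model would keep what it needs here.
          self._model = None

      def load(self):
          # Called once when the deployment starts, so expensive work such as
          # loading weights happens up front rather than per request.
          self._model = lambda text: {"num_chars": len(text)}

      def predict(self, model_input):
          # Called for each request with the deserialized request body.
          return self._model(model_input["text"])

Packaged this way alongside a config.yaml, the model can be deployed from the command line (for example with Truss's documented truss push command), which is the command-line workflow the paragraph above refers to.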

Company Stage

Series B

Total Funding

$58.4M

Headquarters

San Francisco, California

Founded

2019

Growth & Insights

Headcount growth
  • 6 months: 40%
  • 1 year: 104%
  • 2 years: 96%

Simplify's Take

What believers are saying

  • The recent $40M Series B funding round led by IVP and Spark Capital indicates strong investor confidence and provides substantial capital for growth and innovation.
  • Baseten's launch of Chains for compound AI systems and the BaseTen model zoo demonstrates a commitment to continuous innovation and expanding their product offerings.
  • The platform's autoscaling and comprehensive monitoring tools ensure high performance and reliability, making it an attractive option for businesses looking to scale their ML operations.

What critics are saying

  • The competitive landscape in AI and ML infrastructure is intense, with numerous players vying for market share, which could impact Baseten's growth.
  • Reliance on a usage-based pricing model may lead to revenue fluctuations based on client activity and demand.

What makes Baseten unique

  • Baseten's platform allows for the deployment of popular open-source foundation models with just two clicks, significantly reducing the complexity involved in ML model deployment.
  • The open-source model packaging framework, Truss, offers flexibility for businesses to package and deploy both public and private models seamlessly.
  • Baseten's usage-based pricing model ensures cost-effectiveness, allowing clients to pay only for the time their models are actively used.

Benefits

💰 Competitive compensation: We aim to provide 90th percentile (or better) salaries and equity grants for every team member commensurate with their experience.

🌎 Remote-first work environment: Baseten team members are welcome to work wherever they want: fully remote, in our San Francisco office, or a mix of both. We provide a $1,000 stipend for you to make your home office comfortable and productive.

🏓 Regular in-person team summits: We get together as a team three times a year to plan, workshop, and most importantly, get to know each other better.

🌴 Unlimited PTO: We ask that everyone take at least 4 weeks of vacation, and we have a company-wide break between Christmas and New Year's Day.

🏥 Full healthcare coverage: Medical, dental and vision insurance for you and your family.

🍼 Paid parental leave: 16 weeks of fully paid parental leave (adoptive and non-birth parents included) and flexibility with schedules while returning to work.

📈 401(k): Company-sponsored 401(k) for you to contribute to.

🧠 Learning and development budget: We encourage you to take classes, attend conferences, and invest in your craft, and we'll cover the expenses to make it happen.