Member of Technical Staff, Inference & Model Serving
Cohere
Updated on 4/1/2024
501-1,000 employees
Natural language processing software
Company Overview
Cohere's mission is to build machines that understand the world, and to make them safely accessible to all.
Industries: AI & Machine Learning, Crypto & Web3, Financial Services, Education, B2B

Company Stage: Series C
Total Funding: $440M
Founded: 2019
Headquarters: Toronto, Canada

Growth & Insights
Headcount growth: 40% (6 months), 219% (1 year), 730% (2 years)
Locations: San Francisco, CA, USA
Experience Level: Entry, Junior, Mid, Senior, Expert
Desired Skills: AWS, Go, Natural Language Processing (NLP), Google Cloud Platform
Categories: Backend Engineering, IT & Support, Security Engineering, Software Engineering
Requirements
  • Experience serving ML models in production.
  • Experience designing, implementing, and maintaining a production service at scale.
  • Familiarity with the inference characteristics of deep learning models, specifically Transformer-based architectures.
  • Familiarity with the computational characteristics of accelerators (GPUs, TPUs, and/or Inferentia), especially how they influence the latency and throughput of inference.
  • Strong understanding of, or working experience with, distributed systems.
  • Experience in performance benchmarking, profiling, and optimization.
  • Experience with cloud infrastructure (e.g., AWS, GCP).
  • Experience in Golang (or other languages designed for high-performance, scalable servers).
Responsibilities
  • Developing, deploying, and operating the AI platform that delivers large language models through easy-to-use API endpoints.
  • Working closely with many teams to deploy optimized NLP models to production in low-latency, high-throughput, high-availability environments.
  • Interfacing with customers and creating customized deployments to meet their specific needs.