Lead Software Engineer
Machine Learning Platform
Posted on 3/22/2023
INACTIVE
Mobile banking & debit cards
Company Overview
Chime's mission is to make financial peace of mind a reality for everyone. The company has created "a new approach to online banking that doesn’t rely on fees, gets you your paycheck up to 2 days early with direct deposit, and helps you grow your savings automatically."
Locations
San Francisco, CA, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
Apache Spark
AWS
Apache Kafka
Data Science
Docker
Google Cloud Platform
Airflow
Keras
Microsoft Azure
MySQL
Pandas
Postgres
Snowflake
SQL
TensorFlow
Terraform
Apache Flink
Kubernetes
Python
Go
NoSQL
Datadog
Categories
AI & Machine Learning
DevOps & Infrastructure
Software Engineering
Requirements
- CS or related degree with 6+ years of experience implementing and deploying large-scale distributed systems, backend systems, or ML pipeline infrastructure in production
- 4+ years of end-to-end data science and machine learning (DSML) experience building data pipelines, ML models, DS automation workflows, and machine learning tools, and using frameworks such as Keras, TensorFlow, SparkML, pandas, and scikit-learn
- Experience working with various data and ML infrastructure technologies (SageMaker, AzureML, SQL/NoSQL) and data streaming/processing (Spark, Kafka, Flink, Airflow, AWS Kinesis) is preferred
- Strong programming skills and software engineering fundamentals (Python, Golang, or similar languages)
- Demonstrated experience leading projects, from inception to deployment and monitoring, while collaborating with cross-functional teams
- Technologies we use: Python, SageMaker, AWS cloud stack (e.g. RDS, S3, DynamoDB, Lambda, Kinesis, ECR), Datadog, Terraform, Snowflake, MySQL, and Postgres, among many others
Responsibilities
- Design, implement, scale, and support cloud infrastructure for serving real-time and batch ML inferences
- Build real-time and batch feature pipelines, collaborating with Chime's Data Engineering and Data Science teams, among others
- Design and deploy low-latency, scalable, and highly available microservice APIs that integrate seamlessly with Chime Engineering services
- Deliver on our ML Platform reliability and observability strategy, including standardized metrics and SLO dashboards, dependency-graph-aware alerting, and advocating for best practices Chime-wide
- Provide our Data Science team with ML observability and explainability, including drift detection, model performance monitoring, and feature and prediction cohorting, accelerating both new and iterative model development
- Help grow our SageMaker ML pipeline infrastructure for model training, evaluation, and deployment at scale
- Collaborate closely with a wide range of Chime teams, including Product, Engineering, Risk, and more
Desired Qualifications
- 4+ years of experience working with Cloud (AWS, Azure or GCP), Docker, CI/CD, and orchestration technologies (Kubernetes). Experience with Terraform or CloudFormation is a