Site Reliability Engineering Manager
Confirmed live in the last 24 hours
Locations
Ontario, Canada • Remote
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
AWS
Communications
Requirements
  • 5+ years of experience in Site Reliability Engineering
  • 3+ years of experience leading and managing SRE teams
  • Strong understanding of Chaos Engineering and MLOps
  • Experience with cloud infrastructure (preferably AWS)
  • Excellent leadership skills and a passion for coaching and mentoring team members
  • Strong communication and collaboration skills
  • Ability to think critically and solve complex problems
Responsibilities
  • Lead, mentor and grow a team of SREs
  • Drive the adoption of Chaos Engineering practices to identify and remediate system weaknesses
  • Work closely with the MLOps team to operationalize machine learning models in production
  • Develop and maintain SLOs, SLIs and error budgets for our systems
  • Collaborate with development teams to ensure that services are designed for high availability and scalability
  • Partner with other leaders in the organization to prioritize and execute initiatives that drive reliability and performance
  • Foster a culture of continuous improvement and collaboration
Agero

1,001-5,000 employees

Digital roadside assistance platform
Company Overview
Agero's mission is to transform driver assistance programs for motorists everywhere. The company is committed to keeping drivers safely moving forward through a powerful combination of people and technology.
Benefits
  • Competitive salary
  • Flexible time off
  • 401(k) matching
  • Tuition assistance
  • Commuter benefits
  • Fully remote opportunities