Senior Site Reliability Engineer
Posted on 9/11/2023
Rootly
Locations
Remote
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
AWS
Heroku
Categories
DevOps & Infrastructure
Requirements
  • Participate in an on-call rotation to support critical Rootly services, and in some cases be on call with software teams
  • Participate in the definition and management of SLOs and error budgets for the Engineering teams that own services in production
  • Build tools to support our processes
  • Embed with feature delivery software teams to build and enhance observability, reliability, and availability of those services
  • Work with other teams across Engineering to understand their systems and challenges at the code level, and identify improvements to Rootly infrastructure that improve the services they own (contribute code where possible)
  • You have 5+ years of experience in an SRE or Infrastructure Engineering role
  • You have 5+ years of experience writing software as a SWE or in a software-heavy SRE role
  • You have strong technical knowledge of cloud infrastructure, distributed systems, and reliability practices
  • You've supported web or RPC services at significant scale
Responsibilities
  • Moving off Heroku to AWS
  • Creating our CI/CD pipeline
  • Creating developer tools that enable our engineers to ship code quickly and reliably
Desired Qualifications
  • You have experience solving infrastructure problems by writing software
  • You have a big-picture perspective on systems and tools
  • You can collaborate with other Engineering teams to understand their systems and help to improve them