Senior Site Reliability Engineer
Posted on 4/1/2022
Atlanta, GA, USA
Desired Skills
  • Think about systems - edge cases, failure modes, behaviors, specific implementations
  • Have an understanding of large scale system design, monitoring, observability, and operational practices
  • Have strong programming skills - Go, Python, and/or Ruby
  • Have an enthusiastic, go-for-it attitude. When you see something broken, you can't help but fix it
  • Have experience with Weave Flux, Nginx, Kubernetes, Terraform, Prometheus, Loki, Cortex, Tempo, or similar technologies
  • Are compelled to keep a constant eye on the Observability space, identifying changes in practices and technologies as they arise and planning ahead for them
  • Contribute to our team's Telemetry Platform that consists of Prometheus, Cortex, Loki, Tempo, and Grafana deployed in EKS using Terraform and Weave Flux on AWS
  • Contribute to projects across the organization, addressing challenges that call for your skill set
  • Work with our dev teams to determine how to make their paging strategy more meaningful and less problematic
  • Develop ways to help our development teams instrument their services so that they collect the information needed to investigate our applications
  • Work to reduce the level of effort needed to use the instrumentation that the teams are creating
  • Provide valuable feedback and collaborate with the teams whose products we use as we iterate on our own infrastructure
  • The salary band for this position ranges from 140k - 200k, plus bonus and equity, commensurate with experience and performance
  • Full medical, dental, vision package to fit your needs
  • Flexible vacation policy; work hard and take time when you need it
  • Pet discount plans & retirement plan with company match (401K)
  • Determine what information is important enough to drive service levels for our services
  • Use service level information to determine reliability on our Telemetry Platform
  • Participate in an on-call rotation that responds to incidents concerning the Telemetry Platform
  • Contribute to solutions defined in GitLab projects and GitHub repositories
  • Maintain AWS EKS clusters using our Terraform modules
  • Automate complex business challenges that require your specific skill set
  • Contribute to core infrastructure pieces that allow Angi to scale to meet the needs of its clients
  • Use the Telemetry Platform to assist in investigations that happen across the organization
  • Plan and shape the growth of Angi's infrastructure as we iterate it over time

1,001-5,000 employees

Comprehensive solution for home needs
Company Mission
Angi invests its resources in growing its business and its people. Angi's mission remains the same: to help the best consumers find the best service providers and to promote happy transactions.
  • Competitive compensation
  • This position is eligible for a competitive year-end performance bonus and equity package
  • The rare opportunity to work with sharp, motivated teammates solving some of the most unique challenges and changing the world
Company Values
  • Start with the customer
  • All about talent
  • Strength in diversity
  • Create & build momentum
  • Be an owner
  • Disagree as individuals, deliver as a team
  • Drive growth
  • Better today, perfect tomorrow
  • Do more with less
  • Deliver results
  • Data beats opinion
  • Enjoy the journey