Manager – Site Reliability Engineering
Posted on 9/9/2022
United States
Desired Skills
  • You have 5+ years of software support, reliability, or operations engineering experience in a highly customer-focused SaaS environment
  • Experience migrating from a monolithic N-tier architecture to a distributed microservices architecture (event-driven vs. message-driven)
  • Experience in designing for the cloud and utilizing cloud native solutions
  • Experience with medium-scale to large-scale UNIX/Linux production environments, preferably as part of an online service provider
  • Strong sense of ownership of large projects and complex tasks
  • You have production experience with multiple cloud vendors
  • You endorse infrastructure as code
  • You have a proven track record of managing diverse and distributed teams, ensuring all members can bring their best
  • You possess strong leadership skills and the ability to motivate teams
  • You will bring a collaborative partnership mindset, focused on business impact
  • Ability to solve problems quickly while taking an automation-first approach
  • Hands-on experience with release, deployment, and environment lifecycle management
  • Experience with Open Source technologies
  • Experience with virtualization & container technologies
  • Hands-on experience with infrastructure-as-code tools and CI/CD concepts (preferably HashiCorp tools like Terraform/Consul/Packer/Nomad and management tools like Kubernetes/Salt)
  • Experience with advanced automated monitoring and log aggregation systems (NewRelic, DataDog, SumoLogic, Splunk, Logstash, etc.)
  • Experience with multi-geography and distributed systems
  • Working knowledge of web, application, database, and OS server systems (Nginx, Tomcat, MongoDB, ElasticSearch, ZooKeeper, RabbitMQ, Redis)
  • Ability to manage competing priorities in a complex environment
  • Able to pass a Federal drug screening
Responsibilities
  • Lead and manage Everbridge's high-performing site reliability team while remaining hands-on
  • Mentor, grow, and empower your team by giving them the skills, confidence, space, and motivation to make independent decisions that lead to their personal and professional success and enable them to become technical leaders. In other words, align the growth of the people around you with business impact
  • Participate in deep technical design discussions within your team and across engineering teams, ensuring that we're building the right systems and keeping quality high
  • Drive the design, architecture, operability, security, and scaling of the Everbridge platforms
  • Help develop and maintain processes, tools, and documentation in a multi-region cloud deployment
  • Facilitate the evaluation of automation and new software solutions
  • Collaborate with Architects, Developers, Data Reliability, and platform teams on designing scalable and highly available systems
  • Ensure proper security, monitoring, alerting and reporting for application platforms
  • Troubleshoot and resolve production issues
  • Help drive the capacity planning process
Desired Qualifications
  • Previous experience working in SaaS Site Reliability Engineering (Site Reliability Leader, Software Engineering Manager, Operations Manager, etc.)
  • Bachelor's degree or equivalent work experience

1,001-5,000 employees

Public warning platform
Company mission
Everbridge's mission is to keep people safe and businesses running. The company builds infrastructure to move resources after natural disasters.
Benefits
  • Dental Insurance
  • Vision Insurance
  • Health Insurance
  • Life Insurance
  • PTO/Vacation Policy
  • Paid Holidays
  • Maternity / Paternity Leave
  • 401K / Retirement Plan
  • Performance Bonus
  • Employee Stock Purchase Plan
  • Free Food
  • Work From Home Policy
  • Company Social Outings
  • Unique Office Space
Company Values
  • Obsessed with Customer Service
  • Fueled with Passion
  • Guided with Integrity
  • A Team of Leaders