Site Reliability Engineering Manager
Updated on 3/28/2024
Art of Problem Solving

501-1,000 employees

Online advanced learning platform for eager students
Company Overview
Art of Problem Solving (AoPS) is an educational platform that fosters intellectual curiosity and collaborative learning, and it has trained the majority of the US International Math Olympiad team. Its advanced curriculum covers math as well as language arts, science, and computer science. Its international online community of over 800,000 members gives students a place to learn from expert mentors and connect with like-minded peers.
Education

Company Stage

N/A

Total Funding

N/A

Founded

2003

Headquarters

San Diego, California

Growth & Insights
Headcount

6 month growth

5%

1 year growth

30%

2 year growth

38%
Locations
San Diego, CA, USA
Desired Skills
PHP
Node.js
Postgres
AWS
Terraform
Redis
Nginx
Development Operations (DevOps)
Categories
DevOps & Infrastructure
DevOps Engineering
Site Reliability Engineering
IT & Security
Cloud Engineering
Requirements
  • Expert-level experience in the AWS ecosystem
  • Experience with Terraform
  • Familiarity with Node.js and/or PHP
  • Familiarity with MariaDB, PostgreSQL, Redis, Apache, and nginx
  • Prior full-stack or backend software engineering experience
  • Prior people management experience in an SRE or DevOps role
Responsibilities
  • Managing a team of Site Reliability Engineers
  • Owning and maintaining company cloud infrastructure strategy and SRE team roadmap
  • Implementing and evaluating reliability metrics for products and services
  • Running, evaluating, and improving SRE processes and procedures
  • Providing technical expertise and allocating team resources
  • Driving continuous improvement in the SRE space and broader Engineering Department
  • Being accountable for overall risk management and reduction practices
  • Communicating across teams and performing all duties of a Site Reliability Engineer