Senior Site Reliability Engineer
Posted on 6/18/2022
INACTIVE
Locations
San Francisco, CA, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
Node.js
AWS
Bash
Docker
Google Cloud Platform
iOS/Swift
JavaScript
Jenkins
Git
Java
Linux/Unix
Management
Microsoft Azure
Operating Systems
Puppet
Ruby
Splunk
Terraform
Kubernetes
Python
Go
Ansible
Requirements
  • 6+ years of professional experience, with a proven track record of operating highly scalable, robust, large-scale distributed infrastructure
  • Experience scaling web applications and microservices using container orchestration systems such as Kubernetes
  • Experience implementing monitoring, reporting and alerting on large production systems with tools such as Grafana, Prometheus, and Splunk
  • Experience building and running infrastructure and services on AWS
  • Experience supporting live production systems, maintaining high availability and responding swiftly to issues as they appear
  • Experience with CI/CD practices, using Jenkins, GitHub Actions, Docker, or equivalent, and source control systems like Perforce and Git
  • Experience provisioning cloud infrastructure using CloudFormation, Terraform or Puppet
  • Expertise in Linux operating systems, with user-level experience in others
  • Ability to develop operational tools using Python, Ruby, Bash, and/or Node.js
  • Drive to proactively identify opportunities for improvement in our systems and propose solutions
  • Strong written and verbal communication skills
Responsibilities
  • Develop and automate highly scalable infrastructure in the cloud using modern infrastructure-as-code principles
  • Build in performance and operational monitoring to ensure scalability and allow swift diagnosis and resolution of service degradation or disruption
  • Diagnose and resolve technical issues reported by both internal and external customers
  • Develop tooling to automate and simplify common tasks such as building and deploying applications, and assist with integration into CI/CD pipelines
  • Document processes and procedures relating to the deployment, monitoring, and administration of D2C infrastructure and applications
  • Participate in a rotating on-call team to triage, diagnose, and resolve live service issues
  • Collaborate closely with fellow engineers and team members, and maintain a strong working relationship based on communication, respect, and trust
Desired Qualifications
  • Desire to automate everything possible
  • An obsession with performance and providing a phenomenal end-user experience
  • Experience with Azure, GCP, and other cloud providers
  • Experience administering databases at scale
  • Experience using enterprise third-party monitoring solutions such as Datadog or New Relic
  • Solid understanding of JavaScript, Go, and/or Java
  • Working knowledge of configuration management tools like Puppet, Chef, or Ansible
Take Two
Game publisher