Site Reliability Engineer
Posted on 11/26/2022
INACTIVE
Locations
Toronto, ON, Canada
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
AWS
Bash
BigQuery
Google Cloud Platform
JIRA
Git
Linux/Unix
Management
MySQL
Postgres
SQL
Terraform
Python
Looker
Ansible
Requirements
  • You are a learn-a-lot, not a know-it-all
  • Post-secondary diploma, certificate or degree in an IT-related discipline or the wherewithal to have obtained such things
  • Comfort with an outrageous number of acronyms
  • Experience with GCP architecture and related products, including Datastream, BigQuery, Looker, Cloud Storage, Billing, Cloud Functions, etc.
  • Experience with AWS architecture and related products, including EC2, RDS, ALB, VPC, VPC Peering, Multi-AZ, EFS, EBS, Beanstalk, Auto Scaling groups, Direct Connect, etc.
  • You have a good understanding of what the Well-Architected Framework is and how it relates to EC2, BigQuery, GCS/S3, RDS, VPCs, NATs, and more
  • 2-3 years of Linux/UNIX systems administration experience, preferably in a LAMPJ environment
  • Familiarity with CentOS/RHEL, MySQL, PostgreSQL, Reverse Proxies, Firewalls/NAT, HTTPS, SSL Certificates, SFTP, FTPS, DNS, SMTP
  • Experience in configuration, implementation, and maintenance of SaaS platforms
  • A fierce passion for availability, reliability, and short MTTR
  • Well-formed experience with at least one scripting language (Bash, Python, etc.). We code on this team.
  • You know SQL and can help debug MySQL and PostgreSQL databases (slow queries, traces, killing queries)
  • Experience with the practice of Infrastructure as Code, including hands-on infrastructure coding (Ansible and Terraform)
  • Familiarity with monitoring and metric collection systems
  • Familiarity with information security best practices and tools
  • Familiarity with backup strategies and tools
  • Experience with version control systems (Git)
  • A belief that companies should be socially responsible
Responsibilities
  • Day-to-day administration of Google Cloud Platform (GCP) data platforms, with support for our legacy Amazon Web Services (AWS) environments
  • Collaborate with our Data Engineers on remediating technical debt and implementing a new data platform on GCP, and help optimize existing AWS workflows and infrastructure in collaboration with the Site Reliability Engineering (SRE) team
  • Implement monitoring and logging solutions across all systems
  • Adhere to Site Reliability Engineering principles on incident management and service level objectives
  • Assist in implementation of security best practices and initiatives at all levels of the systems infrastructure
  • Serve as a steward of the service life cycle for the AWS environments, in collaboration with the SRE team, and for the GCP data platforms
  • Troubleshoot issues that arise, document defects in JIRA, and work with colleagues to resolve production issues
  • Assist with sourcing and testing infrastructure enhancements before deployment
  • Support workflow automation using configuration management and continuous deployment frameworks
  • Work in an on-call rotation
  • Be a great teammate!
Desired Qualifications
  • You'll get that competitive salary, flexible health benefits, mental health support, a generous program, stock options, a hybrid office/home work environment and so much more
Benevity
Donation & grant management platform