Senior Site Reliability Engineer – Cloud Platform
Posted on 5/28/2022
Development Operations (DevOps)
- Bachelor's Degree in Computer Science or related field
- Software engineering and task automation skills with Bash, Python, and/or Go are a must
- Familiarity with the Agile software development lifecycle
- Deep background in Linux systems and engineering
- Highly experienced with engineering and automating on Amazon Web Services (AWS)
- Experience supporting web applications running on Java / Apache / Tomcat in a live production environment
- Prior experience with IaC tools like Terraform
- Prior experience with DevOps tools (Git, Bitbucket, TeamCity) for gate promotions
- Background supporting production at scale in a heavily microservice-based environment
- Hands-on engineering and ops expertise in containerization (Docker, Kubernetes/EKS, CNI, and Ingress networking)
- Strong understanding of single sign-on, SAML, and OAuth (hands-on experience with Okta is a bonus)
- Seasoned expertise in X.509 certificate technology and the basic concepts of encryption
- Experience working with relational databases such as Aurora Postgres and/or Oracle RDS
- Advanced exposure to application development, web UI (design and development), JSON, and application architecture
- Strong experience using observability tools (logging/APM) such as Datadog, CloudWatch, and PagerDuty
- You greatly prefer CLIops to ClickOps
- You enjoy teaching and being a mentor to others
- Outstanding troubleshooting skills; ability to think critically and display an aptitude for problem-solving
- Strongly analytical mind with a penchant for process development and enhancement
- A highly positive, can-do attitude and a desire to be a team player
- Great communication skills and ability to explain complex technical concepts to a varied audience
- Strong follow-through and work ethic; you consistently keep and meet your commitments
- This position requires working on-location at our Mississauga, ON office approximately 2 to 3 days per week
- We provide 24x7 support to our customers, so we expect you to take turns with your teammates being on call for weekend production emergencies or providing rotating weekend operational support
- Ability to read, write, and speak English
- Travel: expect occasional travel (less than 5%) to other Guidewire offices for training and team meetings
- Take a purist SRE approach to shared multi-tenant infrastructure for resilient, microservice-based, containerized SaaS systems, in addition to the customer-centric application environments
- Oversee and automate the team's growing presence in AWS
- Contribute to core infrastructure systems development with features, bug fixes, reliability improvements, etc.
- Perform platform reliability engineering for a complex single sign-on SAML/OAuth-based central authentication platform
- Creatively build and develop tooling to aid in driving 24x7x365 follow-the-sun operations of critical production systems
- Automate deployment tasks for core product and infrastructure tools and maintain automation infrastructure
- Create system documentation and training materials to empower and educate our fellow team members
- Build and maintain observability tooling, metrics, and dashboarding for a global platform product infrastructure
- Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks and issues
- Enhance platform observability by helping create a self-healing approach to platform reliability
- Collaborate with engineering teams, providing product feedback and, where necessary, contributing code to the product
Recurring revenue software
Guidewire's mission is to be the platform insurers trust to engage, innovate, and grow efficiently. The company is building a data-backed insurance solution.
- Financial: Receive market-competitive pay and incentive programs—because you deserve it! To help future-proof your income, we offer generous support through retirement savings plans.
- Health & Wellness: Keep your physical and emotional health in tip-top shape with health insurance for you and your family, an employee assistance program, annual wellness reimbursement, and access to wellness resources.
- Flexible Working: Work in an environment where you’ll have the freedom and trust to make an impact, with time for your life outside of work.
- Downtime: Relax and kick back through our generous paid time-off programs. Make a difference in your community with three volunteer days each year. Take your own personal day of rest with My Day. We also offer ample paid leave for all new parents.
- Continual Development: We encourage self-directed learning, giving you every chance to become a better version of yourself, both professionally and personally. At Guidewire, lifelong learning is here for the taking.
- Career Mobility: Your career opportunities are only limited by your own imagination. Guidewire’s community is filled with chances to expand your horizons across any of our teams or worldwide locations.
- Integrity: We build and maintain honest, candid, and caring relationships with clients, potential customers, partners, investors, and of course, each other!
- Rationality: All we do is supported by factual evidence—whether it’s building awesome products, making decisions, or communicating with each other.
- Collegiality: We’re in it together—so we’re all equal. We work in professional harmony, with respect, without arrogance, and as part of a structure where responsibility is shared and owned by all of us.