What if you could use your technology skills to develop a product that impacts the way communities’ hospitals, homes, sports stadiums, and schools across the world are built? Construction impacts the lives of nearly everyone in the world, yet it’s also one of the world’s least digitized industries. That’s why we’re looking for an experienced Staff Site Reliability Engineer to join Procore’s journey to revolutionize a historically underserved industry.
We’re looking for a Staff Site Reliability Engineer to join Procore’s Internal Platform Group. In this role, you’ll help champion solutions to systemic issues affecting every team at Procore. Leveraging your software and systems architecture expertise, you’ll conduct consultative engagements with our service authors that improve our software’s reliability. If you have a passion for solving complex problems unique to running large, highly scalable, resilient systems, we would love for you to join us!
This position will report to the Manager of the Reliability Engineering team and will be based in our Austin, TX office. We’re looking for someone to join our team immediately.
Lead projects within a small team of Reliability Engineers to continually improve the reliability of Procore’s services through engineering and process improvement
Collaborate with your peers to envision, design, and develop solutions in your respective area with a bias toward reusability, toil reduction, and resiliency
Surface opportunities across the broader organization for solving systemic issues
Use a collaborative approach to make technical decisions that align with Procore’s architectural vision
Partner with internal customers, peers, and leadership in planning, prioritization, and roadmap development
Develop teammates by conducting code reviews, providing mentorship, pairing, and training opportunities
Serve as a subject matter expert on tools, processes, and procedures and help guide others to create and maintain a healthy codebase
Facilitate an “open source” mindset and culture both across teams internally and outside of Procore through active participation in and contributions to the greater community
BS or MS degree in Computer Science or a related discipline, or comparable work experience. Technical certifications are a plus
8+ years of combined experience as a Software, Resiliency, or Reliability Engineer
Experience architecting and designing services within distributed systems
Experience seeking and solving complex problems
Experience working with software, platforms, and infrastructure at scale (we run thousands of hosts and have millions of users)
Experience as a technical leader on projects with the ability to course-correct as needed
Experience with the following is preferred:
Public cloud (AWS, GCP)
Container orchestration (Kubernetes)
Cloud automation tooling (e.g., CloudFormation, Terraform, Ansible)
Continuous Integration Tooling (e.g., CircleCI, Jenkins, Travis)
Continuous Deployment Tooling (e.g., ArgoCD, Spinnaker)
Service Mesh / Discovery Tooling (e.g., C