Full-Time

Senior Site Reliability Engineer

Kognitos

Senior

San Jose, CA, USA

Category
DevOps & Infrastructure
Site Reliability Engineering
Required Skills
PowerShell
Bash
Kubernetes
Microsoft Azure
Python
DigitalOcean
Terraform
Requirements
  • A Bachelor's degree or higher in Computer Science/Engineering or a related field, or equivalent work experience.
  • 5-8 years of industry experience in Site Reliability Engineering or a related role.
  • Proficiency in managing Azure infrastructure and deploying resources using Terraform and ARM templates.
  • Experience with container orchestration tools such as Kubernetes and with microservices architecture on Azure.
  • Solid understanding of networking principles, security best practices, and system performance optimization in Azure.
  • Strong scripting and automation skills (e.g., PowerShell, Python, Bash) for infrastructure management.
  • Familiarity with Azure monitoring tools and practices for ensuring system health and performance.
  • Previous experience with on-call rotations and incident response.
Responsibilities
  • Design, implement, and manage our Azure infrastructure using Terraform, ARM templates, and other advanced technologies, ensuring scalability, reliability, and security.
  • Collaborate closely with development and operations teams to enhance deployment strategies, monitoring, and incident response procedures on Azure.
  • Drive automation efforts for infrastructure provisioning, configuration management, and application deployment, empowering our team to move faster and more efficiently.
  • Proactively monitor system performance and reliability, identifying and addressing potential issues in our Azure environment before they impact our users.
  • Conduct regular system reviews and optimizations, leveraging your expertise to enhance overall system performance and efficiency on Azure.
  • Champion infrastructure-as-code practices, enabling version-controlled and reproducible infrastructure across our Azure platform.
  • Participate in architectural discussions and contribute insights to enhance the reliability and scalability of our Azure systems.
  • Be part of an on-call rotation, responding to incidents promptly and ensuring the reliability of our systems 24/7.
