Line of Service
Advisory
Industry/Sector
Not Applicable
Specialism
Managed Services
Management Level
Senior Associate
Job Description & Summary
At PwC, our people in infrastructure focus on designing and implementing robust, secure IT systems that support business operations. They enable the smooth functioning of networks, servers, and data centres to optimise performance and minimise downtime.
Those in cloud operations at PwC will focus on managing and optimising cloud infrastructure and services to enable seamless operations and high availability for clients. You will be responsible for monitoring, troubleshooting, and implementing industry-leading practices for cloud-based systems.
Focused on relationships, you are building meaningful client connections and learning how to manage and inspire others. Navigating increasingly complex situations, you are growing your personal brand, deepening technical expertise and awareness of your strengths. You are expected to anticipate the needs of your teams and clients, and to deliver quality. Embracing increased ambiguity, you are comfortable when the path forward isn’t clear, you ask questions, and you use these moments as opportunities to grow.
Examples of the skills, knowledge, and experiences you need to lead and deliver value at this level include but are not limited to:
- Respond effectively to the diverse perspectives, needs, and feelings of others.
- Use a broad range of tools, methodologies and techniques to generate new ideas and solve problems.
- Use critical thinking to break down complex concepts.
- Understand the broader objectives of your project or role and how your work fits into the overall strategy.
- Develop a deeper understanding of the business context and how it is changing.
- Use reflection to develop self-awareness, enhance strengths and address development areas.
- Interpret data to inform insights and recommendations.
- Uphold and reinforce professional and technical standards (e.g. refer to specific PwC tax and audit guidance), the Firm's code of conduct, and independence requirements.
Site Reliability Engineer (SRE) - Azure Kubernetes Service (AKS) Specialist
We are looking for a skilled Site Reliability Engineer (SRE) specializing in Azure Kubernetes Service (AKS) to join our Cloud Operations team. The ideal candidate will specialize in application deployments, code promotions, and AKS resource troubleshooting and administration while also supporting a diverse IT stack including Azure Virtual Desktop/Intune, M365, Kafka on HDInsight, and MS SQL MI.
The candidate will help to ensure the uptime, reliability, and performance of our applications. This involves implementing comprehensive monitoring solutions and maintaining observability using tools like Azure Application Insights. To prepare applications for varying levels of demand, the candidate will design and deploy automated scaling solutions that dynamically adjust resources based on workload requirements. Additionally, they will leverage automation to streamline incident response, conduct thorough root cause analyses, and implement robust, long-term solutions to prevent recurring issues.
Essential Functions
- Manage and automate application deployments and code promotions across environments.
- Apply a working understanding of container orchestration, containerization technologies (Docker, Kubernetes), and infrastructure as code (IaC) tools such as Terraform or Bicep.
- Troubleshoot and administer Azure Kubernetes Service (AKS) resources to ensure high availability and performance.
- Collaborate with development teams to enhance CI/CD pipelines using tools like GitHub Actions and Helm.
- Monitor and optimize system reliability, utilizing Azure Monitor and other observability tools.
- Implement and manage network security and access control, using tools such as Palo Alto VM-Series VMSS.
- Support incident response and root cause analysis for production issues.
- Ensure compliance with security standards and best practices in cloud environments.
- Participate in on-call rotations to provide 24/7 support for critical systems.
Minimum Requirements
- Proven experience in a Site Reliability Engineer role with a focus on Azure AKS.
- Strong understanding of container orchestration and management with Kubernetes.
- Proficiency in CI/CD tools and practices, including GitHub Actions and Helm.
- Experience with monitoring and observability tools like Azure Monitor.
- Familiarity with network security and access management solutions.
- Knowledge of scripting and automation using PowerShell or similar languages.
- Excellent problem-solving skills and ability to work in a fast-paced environment.
- Strong communication skills for effective collaboration with cross-functional teams.
- Azure certifications (e.g., Azure Administrator, Azure DevOps Engineer) are a plus.
Education (if blank, degree and/or field of study not specified)
Degrees/Field of Study required:
Degrees/Field of Study preferred:
Certifications (if blank, certifications not specified)
Required Skills
Optional Skills
Accepting Feedback, Active Listening, Analytical Thinking, Cloud Administration, Cloud-Based Service Management, Cloud Engineering, Cloud Infrastructure, Cloud Infrastructure Architecture Design, Cloud Infrastructure Optimization, Cloud Migration, Cloud Operations (CloudOps), Cloud Performance Optimization, Cloud Service Delivery, Cloud Strategy, Communication, Creativity, Embracing Change, Emotional Regulation, Empathy, Inclusion, Infrastructure Management, Infrastructure Performance, Intellectual Curiosity, Learning Agility {+ 9 more}
Desired Languages (If blank, desired languages not specified)
Travel Requirements
Not Specified
Available for Work Visa Sponsorship?
No
Government Clearance Required?
No
Job Posting End Date