Full-Time

High Performance Computing Engineer

Updated on 6/24/2026

Microsoft

10,001+ employees

Develops software, OS, and cloud services

Compensation Overview

$119.8k - $304.2k/yr

Company Historically Provides H1B Sponsorship

Mountain View, CA, USA

In Person

In-office 4 days/week in Mountain View, CA; SF Bay Area local.

Category
DevOps & Infrastructure
Required Skills
Bash
Kubernetes
Microsoft Azure
Python
AWS
Google Cloud Platform
Requirements
  • Bachelor’s degree in Computer Science or related technical field
  • 4+ years technical engineering experience deploying or operating on-premise or cloud high-performance clusters
  • 4+ years experience working with high-scale training clusters (e.g., NVIDIA InfiniBand clusters, SLURM, Kubernetes, Ray, etc.)
  • 4+ years experience building scalable services on top of public cloud infrastructure such as Azure, AWS, or Google Cloud Platform (GCP)
  • Equivalent experience may be considered
Responsibilities
  • Design, operate, and maintain large-scale high-performance computing environments, drawing on hands-on engineering experience in production settings
  • Own the deployment, configuration, and day-to-day operation of HPC schedulers (e.g., SLURM, Kubernetes), ensuring reliable and efficient job scheduling at scale
  • Serve as a technical owner for at least one core HPC domain (GPU compute, high-performance storage, networking, or similar), including ongoing maintenance, performance tuning, and troubleshooting of massive clusters
  • Develop and maintain automation and tooling using Bash and/or Python to improve cluster reliability, observability, and operational efficiency
  • Partner closely with researchers and engineers to support their workloads, troubleshoot cluster usage issues, and triage failed or underperforming jobs to resolution
  • Drive work forward independently by navigating ambiguity and technical roadblocks, delivering incremental improvements that get capabilities into users’ hands quickly
  • Enjoy working in a fast-paced, design-driven product development environment, balancing stability with rapid iteration and experimentation
  • Embody our Culture and Values (content appears as a link in posting)
Desired Qualifications
  • Master’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with deploying or operating on-premise or cloud high-performance clusters
  • 6+ years experience working with high-scale training clusters (e.g., NVIDIA InfiniBand clusters, SLURM, Kubernetes, Ray, etc.)
  • 6+ years experience building scalable services on top of public cloud infrastructure such as Azure, AWS, or GCP
  • Experience with LLM training clusters
  • Experience working with AI platforms, frameworks, and APIs
  • Experience using machine learning frameworks, including deploying and scaling large language models, either personally or professionally
  • Experience working with large-scale HPC or GPU systems (e.g., NVIDIA H100/GB200 or equivalent)
  • Ability to identify, analyze, and resolve complex technical issues, ensuring optimal performance, scalability, and user experience
  • Dedication to writing clean, maintainable, and well-documented code with a focus on application quality, performance, and security
  • Demonstrated interpersonal skills and ability to work closely with cross-functional teams
  • Ability to clearly communicate complex technical concepts to both technical and non-technical stakeholders
  • Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies
  • Ability to work in a fast-paced environment, manage multiple priorities, and adapt to changing requirements and deadlines
  • Proven ability to collaborate and contribute to a positive, inclusive work environment, fostering knowledge sharing and growth within the team

Microsoft develops software, devices, and cloud services. Windows is an operating system that runs on personal computers, Office provides productivity apps, and Azure offers cloud computing and developer tools. The company differentiates itself with a large, integrated ecosystem of software, devices, and services, plus long-standing partnerships with PC makers and a broad enterprise footprint. Its goal is to put a computer on every desk and in every home, and to extend that reach through cloud services, professional networking (LinkedIn), and gaming.

Company Size

10,001+

Company Stage

IPO

Headquarters

Redmond, Washington

Founded

1975

Simplify's Take

What believers are saying

  • ByteDance may spend over $1 billion annually on Microsoft AI services.
  • Azure AI revenue in China tripled after 400% growth the prior year.
  • Usage-based Copilot pricing can improve margins by matching compute to consumption.

What critics are saying

  • US export controls or Chinese retaliation can disrupt Microsoft’s China AI revenue.
  • Copilot price increases can trigger enterprise churn and slower adoption.
  • Heavy AI capex can compress margins if Azure monetization slows.

What makes Microsoft unique

  • Microsoft combines Azure, OpenAI models, and enterprise distribution at global scale.
  • Its China AI business serves ByteDance, Ant Group, Meituan, and Tencent.
  • Nadella promotes distributed AI and multi-model orchestration over winner-take-all control.

Benefits

Health Insurance

Dental Insurance

Vision Insurance

401(k) Company Match

Professional Development Budget

Conference Attendance Budget

Flexible Work Hours

Remote Work Options

Growth & Insights and Company News

Headcount

6 month growth

0%

1 year growth

-1%

2 year growth

0%

Yahoo News Singapore
May 15th, 2026
Bill Ackman bets on Microsoft as AI winner, cites $200B OpenAI stake and Azure growth

Billionaire investor Bill Ackman has revealed new investments in Microsoft by both his hedge fund, Pershing Square Capital Management, and his closed-end fund Pershing Square USA. The positions were initiated in February after Microsoft's shares fell following second-quarter earnings. Ackman highlighted Microsoft's ownership of M365 and Azure, which is benefiting from surging AI inference demand. He noted the company traded at 21 times forward earnings in February, well below recent averages, and said its valuation doesn't reflect Microsoft's approximately 27% economic interest in OpenAI, worth roughly $200 billion. Microsoft shares have fallen about 15% this year. Ackman compared the investment to previous successful bets on Alphabet, Amazon and Meta, calling Microsoft's current valuation "highly compelling" for long-term value.

Tech in Asia
Apr 14th, 2026
Microsoft adds 30,000 Nvidia chips to Norway site after $6.2B commitment

Microsoft has secured a deal with neocloud provider Nscale to expand its Norway data centre site with 30,000 Nvidia chips. The agreement adds to Microsoft's earlier $6.2 billion commitment to the location, whilst OpenAI did not finalise a capacity agreement there. The move is part of Microsoft's roughly $60 billion spending wave on specialised neocloud providers that rent AI computing infrastructure. CEO Satya Nadella has identified power availability and data centre construction speed as the company's biggest bottleneck, rather than chip supply. The deal reflects how cheap electricity and clear regulations increasingly shape AI data centre locations. Nscale, a UK-based startup that emerged from crypto-mining firm Arkon Energy in 2024, raised $2 billion at a $14.6 billion valuation in March 2026.

Bloomberg L.P.
Apr 14th, 2026
Microsoft takes over $6.2B Stargate data centre from OpenAI in Norway

Microsoft has agreed to rent data centre capacity at a Norwegian site originally intended for OpenAI as part of its Stargate initiative. The company will rent 30,000 additional Nvidia Vera Rubin chips from neocloud provider Nscale at a campus inside the Arctic Circle in Narvik, Norway. The deal builds on Microsoft's prior $6.2 billion commitment at the same location. Nscale announced the agreement in a statement, marking a shift in the facility's intended purpose from OpenAI to Microsoft operations.

Yahoo Finance
Apr 14th, 2026
Microsoft stock down 23% despite Azure growing 39% and $625B revenue backlog

Microsoft shares have fallen 23.14% year-to-date to $370.87, despite strong Q2 FY2026 results showing non-GAAP EPS of $4.14, a 7.57% beat. Revenue reached $81.27 billion, up 16.72% year-over-year, with Azure growing 39%. The company's commercial remaining performance obligation surged 110% to $625 billion in contracted future revenue, providing multi-year visibility. Microsoft's OpenAI partnership includes a $250 billion incremental Azure services commitment, whilst the company holds a 27% stake valued at approximately $135 billion. Despite the decline, 95% of covering analysts remain bullish, with a consensus price target of $587.31. Analysts cite the year-to-date drop as creating an entry point for investors confident in Azure's AI growth trajectory.