Full-Time

Engineering Manager

Incident Analysis

Posted on 10/3/2025

PagerDuty

1,001-5,000 employees

Real-time incident management and on-call platform

Compensation Overview

CA$137k - CA$207k/yr

+ Bonus + Commission + Equity

Toronto, ON, Canada

Hybrid

Two days per week on-site in Toronto.

Category
Engineering Management
Requirements
  • 8+ years of overall experience in Engineering
  • 2+ years of Engineering Management in a SaaS organization
  • Proven experience building user-facing web applications or data-driven platforms
Responsibilities
  • Partner closely with Product Management and senior engineers to create a vision and roadmap for incident analysis capabilities.
  • Work with Engineering, Product, and UX stakeholders to deliver intuitive, actionable incident analysis/retrospective features.
  • Champion clarity and usability in the incident analysis experience by working hand-in-hand with UX to translate complex data into simple, valuable workflows.
  • Interact with internal engineering teams to resolve dependencies and integrate analytics across the PagerDuty platform.
  • Make pragmatic trade-off decisions on resourcing and priority.
  • Collaborate with other engineering leaders at PagerDuty on cross-team initiatives, both cultural and technical.
  • Measure impact by partnering with Product and Design on defining success metrics for analytics features and ensuring engineering delivery drives those outcomes.
  • Support your team’s on-call rotation or incident command rotation, being available for escalations and production issues as they arise.
  • Learn PagerDuty’s culture and values and be a strong proponent for them.
Desired Qualifications
  • Experience working with large-scale data systems, analytics platforms, or data pipelines, with knowledge of common pitfalls in under/over-engineering
  • Strong sense of urgency toward action, a customer-first philosophy, and ability to think and act strategically in close partnership with Product to shape the incident analysis roadmap
  • Ability to balance planning with execution
  • Demonstrated ability to hire and develop engineers at varying levels of experience to help them reach their full potential
  • Excellent communication skills across all levels of the organization, with an emphasis on providing context and empowering teams rather than prescribing solutions
  • Deep understanding of Agile principles and how to apply them effectively
  • Technical depth to engage in design/code discussions when needed
  • Clear perspective on what “good engineering” looks like, and when to empower teams versus step in on critical decisions

What PagerDuty does: PagerDuty provides an incident management platform that helps organizations detect and resolve IT issues quickly to minimize disruption.

How its product works: It integrates with monitoring tools and IT systems to detect incidents in real time, then sends alerts to the right people, enforces on-call rotations, and guides incident resolution through automated workflows and escalations.

How it differentiates from competitors: It focuses on on-call management and real-time incident response across many integrations, offering a scalable, subscription-based platform with configurable alerting, escalation policies, and professional services to support faster recovery.

What the company's goal is: To reduce downtime and maintain the reliability and performance of digital services for organizations across industries by providing reliable, scalable incident management.

Company Size

1,001-5,000

Company Stage

IPO

Headquarters

San Francisco, California

Founded

2009

Simplify Jobs

Simplify's Take

What believers are saying

  • Usage-based flex pricing secured 40+ enterprise deals worth $100K+ each in Q4.
  • AI ecosystem expansion with 30+ partners including Claude enables autonomous operations.
  • New CEO DiLullo's enterprise software expertise drives operational efficiency and profitability.

What critics are saying

  • Annualized revenue growth decelerated from 18.2% over five years to 6.9% over the past two; FY2027 guidance is flat.
  • Dollar-based net retention dropped to 98% with mid-size customer churn accelerating.
  • Datadog and AWS agents commoditize incident management; PagerDuty loses differentiation.

What makes PagerDuty unique

  • 16-year historical incident data builds proprietary AI models competitors cannot replicate.
  • SRE Agent integrates autonomously across Slack, Teams, and observability tools natively.
  • Context flywheel captures human decisions during crises for continuous learning loops.

Benefits

Health, AD&D, Disability, Vision, Life, and Dental Insurance

Paternity and Maternity Leave

Employee Assistance Program

PTO (Vacation / Personal Days)

Sick Time

Remote Work

Adoption Assistance

401(k)

Employee Stock Purchase Program

Flexible Spending Account

Student Loan Repayment Plan

Growth & Insights and Company News

Headcount

6 month growth

0%

1 year growth

0%

2 year growth

1%
Yahoo Finance
Apr 10th, 2026
TD Cowen cuts PagerDuty price target to $10 as shift to usage-based pricing begins

PagerDuty, a cloud-based incident management platform provider, faces muted growth prospects despite transitioning to a new pricing model. TD Cowen cut its price target to $10 from $20 whilst maintaining a Buy rating after the company forecast flat revenue growth for fiscal 2027, disappointing expectations of 4% growth. The company reported fiscal Q4 2026 adjusted earnings per share of $0.29, beating estimates of $0.24, on revenue of $124.8 million. Full-year fiscal 2026 revenue reached $492.5 million, up 5.4% year-on-year. PagerDuty is shifting from a seat-based to usage-based pricing model. TD Cowen believes faster adoption of this new model could improve the company's position, noting early positive signals from the transition. PagerDuty comprises 0.28% of George Soros' stock portfolio.

PagerDuty
Mar 24th, 2026
Meet your virtual responder: PagerDuty's SRE Agent for AI-driven reliability

Modern SRE teams face an overwhelming challenge: too many signals, too little time. Incidents are faster, systems are more complex, and reliability targets only get stricter. What if you had a teammate who could jump in instantly - context-aware, tireless, and armed with your runbooks, metrics, and alert data? Introducing PagerDuty's SRE Agent, the next evolution in AI-driven operations. The SRE Agent acts as your virtual responder, collaborating with your team to accelerate response, reduce toil, and continuously improve reliability.

From alert fatigue to autonomous action

Every responder knows the weight of alert fatigue: the constant triage, switching contexts, and hunting for data across tools. The SRE Agent changes that dynamic. The agent connects directly to PagerDuty's event intelligence, on-call data, and service context. When an incident triggers, it summarizes the situation, identifies potential root causes, and recommends next actions, all before a human joins the call. It doesn't just surface alerts; it turns them into structured, actionable insights. And because it operates as a virtual responder within your existing workflows, the SRE Agent participates right alongside you in Slack, Microsoft Teams, or the PagerDuty web interface, suggesting remediations from your runbooks, and even executing predefined actions when authorized.

Resolve incidents faster without burning out

The agent automates common tasks such as:

  • Context gathering: Pulls logs, metrics, changes, and incident history in seconds
  • Collaboration setup: Creates or joins incident channels automatically
  • Incident summarization: Maintains a rolling timeline of key events for stakeholders
  • Next-step recommendations: Suggests mitigation paths based on prior successful resolutions

With this automation, engineers can focus less on coordination and more on critical decision-making, shortening the path from alert to service restoration.

Continuous improvement

The SRE Agent doesn't stop when the incident ends. It feeds into a continuous improvement loop and will capture key insights for your post-incident reviews. By analyzing patterns across incidents, it helps identify recurring reliability risks and automation opportunities, making your systems - and your teams - stronger over time. For practitioners, this means fewer late-night alerts and more confidence that your reliability posture is improving with every incident handled.

How teams are using the SRE Agent today

Early adopters across industries are integrating the SRE Agent into their reliability workflows to:

  • Act as a first responder for low-severity incidents, reducing pager load
  • Automatically trigger diagnostic scripts or rollbacks
  • Provide knowledge continuity across shifts through AI-powered contextual summaries

This isn't about replacing engineers - it's about amplifying their expertise and ensuring that your operational excellence scales as your environments do.

The future of reliability is augmented

PagerDuty has always been about empowering human responders. The SRE Agent is the next logical step: an intelligent, always-on teammate embedded into every stage of your incident lifecycle. Whether you're managing hundreds of microservices or a global infrastructure, the SRE Agent helps your teams move faster, stay calmer, and keep customers happy.

Explore how the PagerDuty SRE Agent can transform your incident response and reliability practices - visit https://www.pagerduty.com/platform/ai-agents/sre/ to learn more.

Business Wire
Mar 24th, 2026
PagerDuty named Leader in IT incident response platforms for fourth consecutive year

PagerDuty has been named a Leader and Outperformer in the 2026 GigaOm Radar for IT Incident Response Platforms for the fourth consecutive year. The company achieved the highest average score across key feature evaluations and was placed in the Innovation/Platform Play quadrant. The report recognised PagerDuty's strengths in incident lifecycle orchestration, collaborative response and mobile incident operations. GigaOm highlighted the company's AI-powered Operations Cloud platform, which automates and orchestrates incident management from detection to resolution across distributed teams and systems. PagerDuty's Scribe Agent uses generative AI to transcribe incident calls and generate real-time timelines during high-severity incidents. The platform serves over 35,000 organisations worldwide, including nearly half of the Fortune 500, and integrates with more than 700 systems.

Yahoo Finance
Mar 13th, 2026
PagerDuty achieves first GAAP profitable year with $499M ARR and 700 basis point margin expansion

PagerDuty reported $125 million in revenue for Q4, a 3% year-over-year increase, and achieved its first GAAP profitable year. Annual recurring revenue reached $499 million, whilst the company expanded its non-GAAP operating margin by nearly 700 basis points. The company secured over 40 deals worth $100,000 or more in Q4, demonstrating strong enterprise demand. However, dollar-based net retention stood at 98%, and there was a modest decline in customers spending over $100,000 annually, reflecting churn in the mid-size segment. PagerDuty's flex-based pricing has been positively received by large enterprises, reducing friction and enabling access to new products. The company is focusing on re-accelerating growth by targeting large enterprises and AI-first companies whilst maintaining operational efficiency.

Yahoo Finance
Mar 12th, 2026
PagerDuty Q4 revenue beats estimates but weak guidance sends stock down 14.9%

PagerDuty reported Q4 2025 revenue of $124.8 million, up 2.7% year-on-year and beating analyst estimates by 1.3%. The digital operations platform also exceeded profit expectations with non-GAAP earnings of $0.29 per share, 16.5% above consensus. However, the company's Q1 2026 revenue guidance of $119 million came in 3.9% below analyst estimates, causing shares to drop 14.9%. Operating margin improved to 3.6% from negative 9.6% in the prior year period, whilst free cash flow margin increased to 18.1%. PagerDuty's customer count decreased slightly to 15,351 from 15,398 in the previous quarter. Whilst the company has grown revenue at 18.2% annually over five years, recent growth has decelerated to 6.9% over the past two years.

INACTIVE