Incident Commander-Remote
Posted on 3/29/2023

5,001-10,000 employees

Data management & visualization platform
Company Overview
Splunk's mission is to address the challenges and opportunities of managing massive streams of machine-generated big data. Splunk is the leading software platform for machine data that enables customers to gain real-time Operational Intelligence.
Madison, WI, USA • Austin, TX, USA • Detroit, MI, USA • Cleveland, OH...
Experience Level
  • This is an opportunity for candidates who have some incident management experience, are excited about the technology, and want to be part of a global team. You will progress your career in the Incident Management space with the support and tools to succeed
  • You have a bachelor's degree and 2+ years of major incident response and management experience, or equivalent work experience
  • You have a clear understanding of the ITIL Incident framework
  • You can think outside the box and work on multiple tasks simultaneously while dynamically prioritizing based on changing conditions
  • You can work cross-functionally and can influence and execute across groups
  • You enjoy problem solving and analyzing global-scale distributed systems
  • You have outstanding interpersonal and communication skills
  • You remain calm and collected in stressful situations, such as a major service outage
  • You are willing to work a 4x10 hour weekly shift model including weekends and holidays
  • Negotiation, mediation, and conflict management skills
  • Strong leadership skills
  • We value diversity at our company. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which the candidate is applying
  • For job positions in San Francisco, CA, and other locations where required, we will consider for employment qualified applicants with arrest and conviction records
  • Note: Splunk provides flexibility and choice in the working arrangement for most roles, including remote and/or in-office roles. We have a market-based pay structure which varies by location. Please note that the base pay range is a guideline and for candidates who receive an offer, the base pay will vary based on factors such as work location as set out below, as well as the knowledge, skills and experience of the candidate. In addition to base pay, this role is eligible for incentive compensation and benefits, and may be eligible for equity
  • Benefits are an important part of Splunk's Total Rewards package. This role is eligible for a competitive benefits package which includes medical, dental, vision, a 401(k) plan and match, paid time off, an ESPP and much more
  • Want to work in a dynamic environment with the latest cloud technologies? Want to learn Splunk from the inside and grow your career in exciting ways? Splunk is looking for self-starting individuals to be a part of the Splunk Incident Response Team (SIRT). The SIRT manages incidents that affect the availability and performance of the Splunk platform and products for our customers globally. The SIRT is an always-on, always-active team making sure that each of our customers has an outstanding experience. We're looking for an Incident Commander to join our team in supporting and supervising our ever-expanding product range
  • As a member of the Splunk Incident Response Team, you will be responsible for leading a cohesive response to high-profile, customer-impacting incidents. In this role, you will be part of a team of global incident commanders responsible for managing high-priority incidents from initial triage through to post-incident review forums. This is a senior role at Splunk requiring an individual who can take charge in high-stress situations and give direction to both customer personnel and Splunk engineers to drive expeditious resolution of incidents. We are looking for a natural leader with proven knowledge of incident management frameworks, a demonstrable understanding of distributed systems environments, and the ability to communicate clearly and effectively to technical and business audiences
  • Use the Splunk Incident Management Process to restore normal service operations as quickly as possible, thus reducing the Mean Time to Repair for business operations
  • Assemble and lead the response team using strong methodical troubleshooting techniques
  • Capture and document key events and milestones during the life cycle of the incident and communicate status to internal and external audiences as required
  • Set clear incident resolution objectives (exit criteria) and timings
  • Establish accurate expectations with response teams to ensure customer satisfaction throughout the process
  • Supervise and manage incidents fully to ensure accurate information is captured
  • Own Incident Commander responsibilities, contribute to post-incident reviews, and follow through with action plans assigned to you
  • Coordinate with global peers to hand off active incidents using the follow-the-sun principle
  • Maintain an eye for Continuous Service Improvement programs that drive more efficiency in People, Process, and Technology to improve the Customer Experience