Full-Time

Director – IT Incident and Problem Management

Confirmed live in the last 24 hours

Smarsh

1,001-5,000 employees

Provides archiving and compliance solutions

Enterprise Software
Financial Services

Compensation Overview

$200k - $250k Annually

+ Bonus Programs

Senior

Atlanta, GA, USA

Hybrid position based in Atlanta.

Category
IT Project Management
IT Support
System Administration
IT & Security
Required Skills
Microsoft Azure
AWS
Google Cloud Platform
Requirements
  • 10-15 years of experience in IT incident and problem management, ideally within SaaS and platform-based environments, with a minimum of 5 years in a senior leadership capacity.
  • Demonstrated expertise in using cutting-edge incident management tools (e.g., incident.io, FireHydrant) and AI-driven solutions to streamline processes, drive rapid response, and enhance service reliability.
  • Expertise in leading comprehensive root cause analysis and problem resolution efforts, incorporating Google SRE principles for preventive actions.
  • In-depth knowledge of Google SRE philosophies, including error budget management, service level indicators/objectives (SLIs/SLOs), and effective incident response strategies.
  • Strong understanding of platform-oriented operations within B2B SaaS, ideally with experience in supporting a pivot from product to platform. FinTech experience is advantageous but not required.
  • Proven record of building and leading high-performing teams, with an emphasis on holding teams accountable to clear standards and ensuring consistency in incident response and resolution.
  • Excellent ability to influence and collaborate with cross-functional teams and executive-level stakeholders. Skilled in delivering complex insights to both technical and non-technical audiences.
  • Ability to drive continuous improvement through innovative practices, data insights, and strategic thinking. An advocate for evolving incident/problem management to proactively support business goals.
  • Experience managing incident and problem resolution in cross-cloud environments, ideally with a focus on seamless integration of diverse platforms.
  • Bachelor’s degree in Computer Science, Information Systems, or a related field; a Master’s degree is preferred.
  • ITIL Expert certification and familiarity with Google SRE principles; advanced certifications in cloud platforms (AWS, GCP, Azure) or incident management tools are highly advantageous.
  • Familiarity with leveraging AI and machine learning within incident and problem management to predict incidents, automate responses, or identify root causes, showcasing an ability to bring innovative solutions to the role.
Responsibilities
  • Provide visionary leadership to evolve our incident and problem management practices, embedding modern approaches that use AI, automation, and predictive capabilities to reduce response times and anticipate potential issues before they impact service.
  • Foster a culture of accountability, holding engineering teams and incident responders to high standards for incident resolution. Ensure robust tracking and reporting of incident response metrics, creating transparency and setting clear performance expectations.
  • Drive alignment between incident/problem management and the organization's shift towards a unified platform model, ensuring that incident management processes are scalable, adaptable, and aligned with platform objectives.
  • Deploy and optimize advanced incident management platforms such as incident.io and FireHydrant, utilizing these tools to enhance visibility, speed, and effectiveness of response across our platform. Adapt methodologies beyond traditional ITIL to remain agile and customer-focused.
  • Lead comprehensive root cause analysis for major incidents, advocating a preventative stance through continuous improvement and resilience-focused practices. Apply SRE principles and drive actionable outcomes to prevent recurrence.
  • Utilize data-driven insights to inform incident response strategies. Present trends, risk factors, and improvement opportunities to senior executives and stakeholders, supporting business decisions with clear, actionable metrics.
  • Define and implement strategic roadmaps for incident and problem management, ensuring alignment with business objectives and platform goals. Regularly update practices to incorporate the latest in AI, automation, and predictive analytics.
  • Oversee major incident response efforts, ensuring fast, effective containment, resolution, and customer impact mitigation. Lead executive-level post-mortems and ensure comprehensive follow-ups.
  • Conduct and oversee in-depth root cause analyses for recurring or high-impact incidents, developing and deploying preventive measures across the platform to reduce recurrence.
  • Collaborate closely with IT operations, engineering, product, and support teams to ensure a unified approach to incident and problem resolution, with a focus on consistent customer experience.
  • Define, monitor, and optimize KPIs and performance metrics related to incident and problem management. Lead continuous improvement initiatives to ensure process agility and alignment with evolving business requirements.
  • Lead continuous improvement initiatives, including evaluating and refining AI algorithms and predictive models to align with evolving business needs and platform scalability.
  • Drive modular and scalable incident management practices, adaptable to the complexities of a multi-service platform architecture.
  • Develop and deliver reports on incident and problem management metrics for stakeholders, including executive leadership, product management, and customer success teams, to provide insights into trends, risks, and opportunities for improvement.

Smarsh provides archiving and compliance solutions specifically designed for financial services, government agencies, and other regulated industries. Their main product is a cloud-native archive that allows organizations to securely store, search, and manage their communications data, including emails, text messages, and social media interactions. This system helps businesses meet complex security, data privacy, and regulatory requirements. Smarsh differentiates itself from competitors by offering a scalable Software-as-a-Service (SaaS) model that caters to both large enterprises and smaller organizations, ensuring that clients can adapt to evolving regulations. Their goal is to help organizations efficiently manage their communication data, identify risks, and maintain compliance, particularly through tools like Connected Capture for Microsoft Teams, which supports remote workforces.

Company Stage

M&A

Total Funding

$42.4M

Headquarters

Portland, Oregon

Founded

2001

Growth & Insights
Headcount

6 month growth

10%

1 year growth

11%

2 year growth

1%

Simplify's Take

What believers are saying

  • Smarsh's strategic partnerships, such as with SOCi and Verizon, enhance its market reach and product capabilities.
  • The appointment of experienced leaders to the board and executive team positions Smarsh for robust governance and strategic growth.
  • Integration with popular tools like Microsoft Teams and OpenAI's ChatGPT ensures Smarsh remains relevant and valuable in the evolving digital communication landscape.

What critics are saying

  • The highly regulated nature of Smarsh's target industries means any compliance failures could have severe repercussions.
  • Dependence on strategic partnerships, such as with Verizon and SOCi, could pose risks if these relationships falter.

What makes Smarsh unique

  • Smarsh's focus on regulated industries like financial services and government sets it apart from competitors who target broader markets.
  • Their integration with OpenAI's ChatGPT Enterprise Compliance API showcases a commitment to leveraging cutting-edge AI for compliance solutions.
  • The partnership with Verizon's Bill-on-Behalf-of program simplifies procurement and deployment, making Smarsh's mobile capture solutions more accessible to Verizon's extensive customer base.
