Full-Time

Director of Site Reliability Engineering

Confirmed live in the last 24 hours

ThousandEyes

501-1,000 employees

Network performance monitoring and analytics platform

Data & Analytics
Enterprise Software

Senior, Expert

London, UK

Category
DevOps & Infrastructure
Site Reliability Engineering
Required Skills
Airflow
Data Science
Apache Spark
Requirements
  • You have a deep understanding of distributed systems design, cloud technologies, and the components, dependencies, and code that define infrastructure
  • You possess a deep understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts
  • Extensive hands-on experience building cloud, big data, and/or ML/AI infrastructure (e.g. EMR, Airflow, Comet ML, AWS SageMaker, Spark, etc.)
  • Extensive hands-on experience operating mission-critical services in production environments that require high availability and reliability
  • Proven ability to think strategically and align technical initiatives with business objectives
  • Can provide a strong technical vision for your teams and ensure consistent delivery of objectives
  • Have experience formulating a team's technical strategy and roadmap; you've collaborated and partnered effectively with several other teams to execute shared goals
  • Understand how to balance tactical needs with strategic growth and quality-based initiatives that can span multiple quarters
  • Proven site reliability engineering management experience leading multiple teams
Responsibilities
  • Lead and inspire a talented team of site reliability engineers, fostering a culture of innovation, collaboration, and excellence in the development and operation of infrastructure platforms
  • Drive the strategic vision for the development, implementation, and management of cloud, data, and ML/AI platforms
  • Collaborate closely with cross-functional teams, including development, product management, and security to define and implement reliable, secure, and scalable infrastructure platforms
  • Provide oversight and direction in the development and operation of cloud platforms, ensuring high-quality, scalable, and reliable solutions that meet customer needs
  • Drive operational excellence across operations and security processes
  • Mentor and develop engineering talent, fostering a culture of continuous learning and professional growth within the site reliability engineering group

ThousandEyes specializes in monitoring network infrastructure and analyzing internet performance. Its platform operates in the cloud, providing businesses with tools to understand and enhance their digital experiences. By offering visibility into the performance of networks and applications, ThousandEyes enables companies to monitor their online services, identify issues, and improve reliability. The platform maps the global topology of wide-area networks and measures performance metrics, ensuring that services run smoothly. Unlike many competitors, ThousandEyes focuses on comprehensive internet intelligence, serving a diverse range of clients from various industries. The company's goal is to help businesses maintain optimal digital performance through real-time monitoring and detailed analytics.

Company Stage

Acquired

Total Funding

$107.5M

Headquarters

San Francisco, California

Founded

2010

Growth & Insights
Headcount

6 month growth

0%

1 year growth

0%

2 year growth

1%

Simplify's Take

What believers are saying

  • AI-driven predictive analytics reduce downtime and improve user experience.
  • Digital Experience Assurance provides a competitive edge with advanced AI capabilities.
  • Custom Webhooks feature enhances flexibility and responsiveness of network monitoring.

What critics are saying

  • Increased EU regulatory scrutiny could impact operations and client relationships.
  • AI-powered capabilities may face adoption and integration challenges.
  • Competition with SolarWinds and Splunk could lead to pricing pressures.

What makes ThousandEyes unique

  • ThousandEyes offers unmatched visibility into global internet performance for superior digital experiences.
  • The platform's AI-driven analytics predict and mitigate internet outages effectively.
  • ThousandEyes' integration with Cisco enhances network management and visibility.

Benefits

Health Insurance

Dental Insurance

Vision Insurance

Disability Insurance

401(k) Retirement Plan

401(k) Company Match

Paid Holidays

Paid Vacation

Employee Stock Purchase Plan