Junior Site Reliability Engineer
Posted on 8/31/2023
Tickets.com
Live event experience ticket platform
Locations
New York, NY, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
Terraform
Kubernetes
Datadog
Categories
DevOps & Infrastructure
Software Engineering
Requirements
  • Put the needs of our customers first
  • Prioritize unblocking your teammates, collaboration and knowledge sharing
  • Dedicated to continuous improvement of yourself and our SRE capabilities
  • Passionate about the value of SRE, but accepting of our role as a patient influencer
  • Understand basic concepts of APM such as tracing, logs, real user monitoring
  • Have written code in a compiled language that runs in production somewhere
  • Have some experience with observability (o11y) tools, including Datadog and Grafana
  • Have some experience with Terraform
  • Agree that documentation is a product
Responsibilities
  • Incident response
  • Kubernetes operations
  • User experience optimization through SLIs and SLOs
  • Observability
  • Debugging running systems and providing tools to assist runtime debugging
  • Optimizations for cost control
  • High availability and disaster recovery planning
  • Be an evangelist for Site Reliability Engineering at MLB and TDC
  • Become an expert in Observability and Incident Response best practices and tools
  • Support efforts to maximize the value of our SaaS solutions, including Datadog, FireHydrant, and PagerDuty
  • Write code to complement / fill gaps left by SaaS solutions
  • Extensively utilize Terraform for infrastructure as code
  • Engage in incident response
  • Use and administer Grafana, including developing and maintaining plugins and dashboards
  • Help to migrate observability to go-forward tools, including Datadog
  • Use and administer Datadog
  • Educate teams on observability best practices
  • Support FinOps program by guiding teams to maximize value of SaaS solutions
  • Enhance observability tooling to make the above possible
  • Enhance alerting tooling to make the above actionable
  • Work with SREs to build 'batteries included' solutions that are extensible across the organization