Senior Site Reliability Engineer
Posted on 3/19/2024
Linus Health

51-200 employees

Mental health digital screening company
Company Overview
Linus Health is on a mission to make brain health accessible for everyone by transforming the way we measure, monitor, and maintain brain function. Linus Health's digital platform delivers a proven, practical means of enabling early detection; empowers providers with actionable clinical insights and recommendations; and supports individuals with personalized action plans.
AI & Machine Learning
B2B & B2C

Company Stage

Series B

Total Funding

$70.6M

Founded

2019

Headquarters

Boston, Massachusetts

Growth & Insights
Headcount

6 month growth

13%

1 year growth

41%

2 year growth

55%
Locations
Remote in USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
Python
Communications
TypeScript
AWS
Terraform
Data Analysis
Categories
DevOps & Infrastructure
DevOps Engineering
Site Reliability Engineering
IT & Security
Cloud Engineering
Requirements
  • Ability to program effectively in one or more high-level languages, such as TypeScript and Python.
  • Ability to use Terraform effectively to manage infrastructure in AWS or other public cloud environments.
  • Ability to evaluate and communicate the pros and cons of various application design decisions when running in a cloud-native environment, especially around fault tolerance and scalability.
  • Experience building and supporting serverless applications on AWS using services such as API Gateway, Lambda, Fargate, and Glue.
  • Experience with containerization and orchestration tools.
  • Outstanding communication and collaboration skills.
Responsibilities
  • Collaborate with Engineering teams to elicit application requirements, document designs, and implement AWS infrastructure for production cloud environments.
  • Promote a service-ownership model throughout the organization by advising application teams and establishing best practices, especially around observability, alerting, CI/CD systems, and system design.
  • Leverage infrastructure as code (Terraform) to build and maintain complex production and analytics workflows, including networking and auth setups.
  • Rapidly diagnose and resolve faults in system services as part of a 24/7 on-call rotation focused on actionable alerting and eliminating toil.
  • Coordinate with data engineers and architects to support management of systems with complex ETL pipelines.
  • Address system availability, security, compliance, and performance.
  • Estimate work, prioritize tasks, track dependencies, report progress, and highlight blockers.
  • Drive continuous improvement initiatives, advocate for SRE best practices, and stay current with emerging technologies and trends.