Staff Site Reliability Engineer @ Platform Science

Who We Are

At Platform Science, we’re working to connect everything that moves.

Founded in 2015, we are an open IoT platform that partners with innovative fleets, application developers, vehicle manufacturers, and equipment providers in the transportation industry to deliver revolutionary solutions to supply chain professionals across the globe.

Our employees are an engaging, diverse group of people who believe in the power of great ideas. We hire people with different experiences and perspectives to build a company culture that fuels growth through innovation.

We value thoughtful actions and empathy for others. We approach challenges with resiliency and creativity, while encouraging transparency because, no matter our backgrounds or responsibilities, we are one team.

About the Role

We are looking for a qualified Staff SRE to join our team in San Diego, CA (or remote). You will solve operational problems and support development teams for critical business applications in production. Our focus is to ensure the reliability of all production services and to enable dev teams to measure their own reliability so they can make informed decisions.

The SRE team has the unique opportunity to work with all aspects of our platform. We run entirely in the cloud (AWS, Azure, and GCP). Our applications and services are containerized and serverless. If you’re excited about learning and supporting new technologies and many different types of products (including mobile apps, hardware, websites, messaging queues, serverless pipelines, and more), and working with an incredibly talented team, then this is the position for you!

As a Staff SRE, you have a software development or systems background with strong coding skills. Ideal candidates want to deeply understand how our systems work, from the infrastructure level and dependencies on other systems to the customer experience, and how to mitigate risk. You are comfortable giving and taking technical direction. You are a great communicator and self-starter who strives to make the company and our technologies better.

Essential Responsibilities

Lead the development and enhancement of Continuous Integration/Continuous Deployment (CI/CD) pipelines, along with refining release management processes and associated toolsets
Architect and maintain Helm charts to streamline application deployment and management
Establish standardized observability solutions to empower development teams in efficiently managing their applications
Lead the effort in promoting and prioritizing reliability, driving achievement of uptime goals and mentoring colleagues in SRE best practices
Conduct comprehensive Production Readiness Reviews, working with teams to identify and establish Service Level Objectives (SLOs), and ensure high-quality and dependable services
Design and develop software that addresses operational challenges and improves system stability and reliability
Fulfill on-call duties, providing expert support to development teams for mission-critical applications in production environments
Improve the resiliency of applications and systems using chaos engineering

Experience

Possess 9+ years of hands-on experience in SRE or Platform Engineering roles
Demonstrated expertise (4+ years) with automation technologies like Jenkins, ArgoCD, or similar
Extensive (3+ years) experience with Kubernetes, Helm, and Docker within production environments
Proficiency with current software development lifecycle (SDLC) concepts and best practices, CI/CD pipelines, and test-driven development
Experience with AWS, encompassing proficiency in EKS, IAM, autoscaling, networking, and load balancing/request routing in a production environment
Proficient in Python, Bash, Node.js, and/or Go
Proficient with distributed tracing methodologies and observability tools such as Prometheus, ELK, or Datadog
Strong emphasis on documentation and fostering knowledge-sharing practices within the team and organization
Track record of successfully training and mentoring engineers
Proven expertise in optimizing performance and managing costs within cloud environments
Sound understanding of SLI/SLO concepts and adherence to SRE best practices

Platform Science Benefits Highlights

The company offers various benefits to regular, full-time employees including:

Medical, dental, and vision insurance
Short-term and long-term disability insurance
AD&D and life insurance
401k plan
Paid vacation, sick leave and holidays
Six weeks of paid parental leave

For more information, please see the Benefits Highlights brochure for regular, full-time employees (link for browser: https://www.platformscience.com/benefits).

This is an exempt role. Our job titles for each posting may span more than one job level. The estimated base salary for this role is between $145,292 and $227,680. The range displayed on each job posting reflects the minimum and maximum target range for new hire base salaries across all US locations. Compensation packages are based on many factors unique to each candidate, including but not limited to skill set, work experience, relevant training and certifications, business needs, market demands, and specific geographical location. The base pay range is subject to change and may be modified in the future. This role may also be eligible for bonus, equity, and benefits.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits.

Platform Science collects your personal information to support its business operations, including for human resources, employment, benefits administration, health and safety, and other business-related purposes, as well as for legal compliance. You can review further details of such collection and use in our Privacy Policy (link for browser: https://www.platformscience.com/privacy-notice).

At this time, we consider candidates only in the following states: AL, AR, AZ, CA, CO, FL, GA, ID, IL, KY, MA, MD, MI, MN, MO, NC, NH, NV, NY, OH, OK, OR, PA, SC, TN, TX, UT, VA, WA, and WI. We plan to add more states in the future.
