Senior Site Reliability Engineer
Posted on 3/18/2023
Palo Alto, CA, USA
- Availability to join the on-call rotation for Production issues
- Availability to work with a distributed team in different timezones
- Troubleshooting production performance/service degradation or outage issues at scale
- Experience with Infrastructure Troubleshooting in VMs and/or Bare Metal (ssh/Linux)
- Advanced Kubernetes knowledge
- Advanced Terraform knowledge
- Experience operating NoSQL Databases in Production
- Experience operating Relational Databases in Production
- Generic Configuration Management experience
- Focus on Production operations/matters and on-call
- Provision and scale multi-datacenter Kubernetes Infrastructure and Applications (EKS)
- Deploy Software in multiple Production Environments
- Own monitoring and alerting for production systems, including improvements and changes
- Contribute improvements to the current automation
- Contribute improvements to our on-call process and alerting
- 8+ Years of experience with Production Troubleshooting
- 2+ years of experience operating Kubernetes
- 2+ years of basic Terraform knowledge
- Experience both setting up and using monitoring and observability tools (e.g. New Relic, Nagios/Icinga, Grafana, Prometheus)
- 5+ years of programming/scripting experience in at least one language (e.g. Perl, Python, PHP, Go, Java)
- 10+ years of experience with modern Linux Operating systems (Enterprise Linux or Debian based)
- 8+ years of experience with modern cloud infrastructure, preferably AWS
- Bachelor's degree in related field or equivalent experience
SaaS experience platform for Communications Service Providers
Plume's ambition is to build on its diversity of lived experiences to create more equitable opportunities for underserved communities through community engagement and education.
- Lifetime Plume HomePass membership (availability based on geographic location).
- Allowance for home office set-up.
- Monthly data or phone allowance.
- Plume employee referral bonus program.
- Public Transportation reimbursement.
- Retirement savings contributions.
- Employee discounts.
- Patent submission support and incentive award program.
- Competitive and comprehensive health insurance coverage and extended benefits such as dental, vision, life, short/long-term disability insurance, and FSA/HSA (detailed coverage may vary by location).
- Employee assistance program.
- Generous vacation policy to ensure time to recharge (open paid time off or generous accrued paid time off based on location).
- Health and wellness reimbursement and a Premium Strava membership.
- Holistic health platform with Walking on Earth.
- Generous parental leave policy.
- Living our Plumian traits.
- Online and in-person team-building events.
- PlumeStrong - our corporate social responsibility program designed to apply our resources (time, brainpower, product, money) for good.
- Remote/flexible work arrangements.
- Professional development opportunities.
- Immigration support.
- Inclusive workplace, where everyone feels at home.
Company Core Values
- Action: We work in the NST 'Now Standard Time' zone
- Innovation: We challenge tradition, creating our own mold
- Do what’s right: We always ask what we should do, not just what we can do
- Focus: We charge towards productive goals with intention
- Detail: We do everything exceptionally well, or we don’t do it
- On-time, on-spec: We deliver what we promise on-time and on-spec
- Seek truth: We keep challenging our own decisions, and the status quo