Full-Time

Senior Site Reliability Engineer

Backblaze

201-500 employees

Cloud storage and data backup provider

Compensation Overview

$150k - $200k/yr

+ RSU Grants

Remote in USA

Remote

Category
DevOps & Infrastructure
Required Skills
Bash
Kubernetes
Python
Grafana
Docker
Go
Prometheus
Jenkins
Terraform
Ansible
Linux/Unix
Requirements
  • Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
  • 8+ years of progressive experience in site reliability, systems engineering, or operations.
  • Extensive experience designing, scaling, and operating large-scale, production-grade distributed systems.
  • Expert-level Linux systems administration and advanced troubleshooting skills.
  • Experience leading security-minded operations, with a focus on system-wide patching, hardening, and proactive vulnerability identification.
  • Deep mastery of service reliability concepts, including advanced monitoring, complex alerting strategy, leading incident response, and in-depth root cause analysis.
  • Advanced proficiency in at least one modern scripting/programming language (Python or Go strongly preferred).
  • Expert knowledge of incident response methodologies and operational best practices.
  • Proven experience designing and operating container orchestration platforms (Kubernetes, Docker), plus a solid grasp of microservices concepts.
  • Expert experience with HashiCorp products (Nomad, Vault, Terraform) in a production environment.
Responsibilities
  • Own and drive the availability, durability, and performance of critical services across all production environments.
  • Lead and champion complex projects from problem discovery through complete, cross-functional resolution, demonstrating high-level technical ownership.
  • Define, establish, and enforce service health standards, including working with engineering leadership to implement SLIs, SLOs, and error budget policies for multiple services.
  • Lead critical incident response and post-incident reviews, translating findings into strategic, long-term service improvements and architectural changes.
  • Mentor others and act as a subject matter expert in following and evolving established ITIL/OSS processes (incident, change, problem, and capacity management).
  • Design and architect scalable automation solutions to eliminate toil and improve the efficiency of operational tasks across the entire platform.
  • Drive the strategic direction of monitoring, logging, and alerting frameworks (e.g., Prometheus, Grafana, Catchpoint, ELK), and integrate them for comprehensive observability.
  • Build, maintain, and secure advanced CI/CD pipelines, configuration management, and complex infrastructure as code solutions (Terraform, Ansible, Jenkins).
  • Write production-grade code (Bash, Python, Go, etc.) to develop new reliability tools and enhance existing systems.
  • Act as a principal partner to engineering, product, and operations teams, consulting on resilient system design, architecture, and operation.
  • Lead and formalize the Production Readiness Review (PRR) process, ensuring robust operational handoff for all new services and features.
  • Lead capacity planning and disaster recovery strategy across critical infrastructure components.
  • Manage the relationship with vendors and service providers to troubleshoot systemic issues and ensure strict adherence to SLA performance.
  • Drive the creation of high-quality documentation, proactively share advanced learnings, and cultivate a reliability-first engineering culture across teams.
  • Own the creation, maintenance, and dissemination of operational playbooks, runbooks, and detailed system documentation.
  • Proactively identify systemic, recurring issues and architect and drive the implementation of long-term improvements and strategic design action plans.
  • Be a leading voice in promoting and embedding reliability-focused practices within development and operations teams.
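
The SLI, SLO, and error budget work named in the responsibilities above comes down to simple arithmetic. The sketch below is illustrative only and is not taken from the posting: the function names, the 99.9% target, and the 30-day window are all assumptions chosen for the example.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO.

    slo is a fraction (e.g. 0.999); the 30-day window is an assumed default.
    """
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining(slo: float, good_events: int, total_events: int) -> float:
    """Fraction of a request-based error budget still unspent.

    The SLI here is good_events / total_events. A negative result
    means the budget for the window has been exceeded.
    """
    allowed_bad = (1.0 - slo) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # A 99.9% SLO over 30 days leaves roughly 43 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # 500 failed requests out of 1,000,000 spends about half the budget.
    print(round(budget_remaining(0.999, 999_500, 1_000_000), 3))
```

An error budget policy then states what happens as the remaining fraction approaches zero, for example pausing feature releases in favor of reliability work.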
Desired Qualifications
  • Significant experience in a SaaS, service provider, or hyper-scale distributed systems environment.
  • Deep familiarity with ITIL/OSS practices and experience defining and enforcing SLOs and SLAs.
  • Exceptional problem-solving skills and a strong drive to learn and apply new, complex technologies.
  • Advanced experience with cloud platforms (AWS, GCP, or Azure) in a production setting.

Backblaze provides cloud storage and data backup services for individuals, businesses, and developers. Its products include Backblaze Computer Backup, which automatically backs up computers for a fixed monthly fee, and Backblaze B2 Cloud Storage, an affordable IaaS with S3-compatible APIs for developers and enterprises. The company runs on purpose-built infrastructure made from commodity hardware and publishes detailed drive-performance reports for transparency. Its products offer features such as cloud replication and flexible version history. Its goal is to offer simple, affordable, and reliable data protection and cloud storage that scales from individuals to enterprises.

Company Size

201-500

Company Stage

IPO

Headquarters

San Mateo, California

Founded

2007

Simplify Jobs

Simplify's Take

What believers are saying

  • B2 Cloud Storage grows 24% YoY in Q1 2026, driven by 76% AI customer expansion.
  • Anuj Kumar joins as CRO from NetApp to scale enterprise AI storage deals.
  • Backblaze raises 2026 revenue guidance by $5M, targeting a $14B neocloud storage market by 2030.

What critics are saying

  • Wasabi undercuts B2 at $5.99/TB/month versus $6/TB plus fees, driving churn.
  • Cloudflare R2 zero egress fees force AI developers to migrate from B2 within 12 months.
  • Ongoing losses since 2021 IPO deplete cash amid 30% equipment cost hikes by 2027.

What makes Backblaze unique

  • Backblaze uses commodity hardware Storage Pods for low-cost S3-compatible cloud storage.
  • Backblaze publishes detailed hard drive failure reports for unmatched storage transparency.
  • Backblaze bootstrapped until 2021 IPO, preserving independent culture without hyperscaler lock-in.

Benefits

Health Insurance

Dental Insurance

Vision Insurance

401(k) Retirement Plan

401(k) Company Match

Flexible Vacation Policy

Maternity & paternity leave

Commuter Benefits

Fertility Treatment Support

Learning & development program

Stock Options

RSUs

ESPP program

Hybrid Work Options

Remote Work Options

Paid Vacation

Paid Holidays

Paid Sick Leave

Wellness Program

Mental Health Support

Gym Membership

Phone/Internet Stipend

Home Office Stipend

Childcare Support

Adoption Assistance

Parental Leave

Family Planning Benefits

Tuition Reimbursement

Professional Development Budget

Conference Attendance Budget

Employee Discounts

Legal Services

Meal Benefits

Relocation Assistance

Performance Bonus

Profit Sharing

Employee Stock Purchase Plan

Growth & Insights and Company News

Headcount

6 month growth

-1%

1 year growth

-2%

2 year growth

0%
The Associated Press
Apr 13th, 2026
Backblaze appoints Anuj Kumar as CRO to drive AI-era cloud storage expansion

Backblaze has appointed Anuj Kumar as Chief Revenue Officer. Kumar brings over 20 years of experience scaling cloud revenue organisations at enterprise infrastructure companies, most notably driving NetApp's worldwide cloud business during significant growth. The appointment comes as Backblaze's B2 Cloud Storage business grew 26% year over year in 2025 and closed its first eight-figure total contract value deal. The company is targeting neoclouds and AI-native developers, with the neocloud GPU provider market representing an estimated $14 billion storage opportunity by 2030. Kumar previously served as SVP and General Manager for North America at SUSE and Chief Revenue Officer at HUMAN Security. He also held senior roles at NetApp, VMware, Rackspace, Verisign and Red Hat.

Business Wire
Mar 30th, 2026
Backblaze grants 276,890 RSUs to new SVP of Product and VP of Revenue Operations

Backblaze has granted equity inducement awards to two senior executives as part of their employment agreements. Rhett Dillingham, Senior Vice President of Product, received 194,240 restricted stock units, whilst Joey Myers, Vice President of Revenue Operations, received 82,650 RSUs. The awards will vest over four years, with 25% vesting after one year and the remainder in equal quarterly instalments over the following three years, contingent on continued employment. The grants were made on 24 March 2026 under Nasdaq Listing Rule 5635(c)(4). Backblaze provides cloud object storage services to over 500,000 customers globally, supporting AI workflows, data-heavy applications and media management. The company trades on Nasdaq under the ticker BLZE.

Backblaze
Mar 13th, 2026
Backblaze Now Serving 314 Trillion Digits of Pi

Lots of us were taught that pi equals 3.14. Maybe 3.14159 if your teacher was ambitious. Akira Haraguchi, who holds the Guinness Book of World Records title for reciting the most digits of pi in a single run, got up to 100,000 digits in 16 hours. That's still only a fraction of the record digits of pi that have been calculated - 3.18471338 x 10^-8 % to be exact. So why do we need that much pi?

A pi record isn't a burst workload. It's a system that runs at sustained pressure for months, writing checkpoints, flushing buffers, and proving that nothing quietly breaks. Last December, StorageReview set a new record, calculating 314 trillion digits on a Dell PowerEdge R7725. In honor of Pi Day, Backblaze B2 Cloud Storage has teamed up with StorageReview to host that dataset, which totals over 130TB. The pi dataset is generally available, publicly accessible, and structured for large-scale retrieval and analysis.

Why pi remains a compute benchmark

Pi has long served as a proving ground for computational systems because it offers a deterministic workload with clear correctness criteria and sustained compute and input/output (I/O) demands. Records in pi computation trace back decades and reflect both mathematical and computational advances. In 1949, ENIAC - the first programmable, electronic, general-purpose, digital computer - computed 2,037 digits of pi in about 70 hours, an early demonstration of electronic computing capability that was eventually published in the paper "The ENIAC's 1949 Determination of π." Algorithms have evolved significantly since then. The Chudnovsky algorithm, developed in 1988, is one of the fastest converging methods for high-precision pi calculation and has been used in many modern record attempts because of its efficiency at large digit counts.
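
To give a sense of how the Chudnovsky series mentioned above converges, here is a minimal Python sketch using the standard `decimal` module. It is a textbook form of the series, not the software used for the record run; the function name `chudnovsky_pi` and the guard-digit choice are assumptions made for illustration. Each term of the series contributes roughly 14 decimal digits.

```python
from decimal import Decimal, getcontext


def chudnovsky_pi(digits: int) -> str:
    """Return pi truncated to `digits` decimal places, as a string."""
    # Work with a few guard digits so rounding does not touch the result.
    getcontext().prec = digits + 10

    c = 426880 * Decimal(10005).sqrt()
    m = 1            # (6k)! / ((3k)! * (k!)^3), kept as an exact integer
    l = 13591409     # linear term: 13591409 + 545140134 * k
    x = 1            # (-640320^3)^k
    k6 = 6           # helper that advances by 12 per term
    s = Decimal(l)   # running sum, starting with the k = 0 term

    # Roughly 14 digits per term, plus one extra term for safety.
    for k in range(1, digits // 14 + 2):
        m = m * (k6 ** 3 - 16 * k6) // (k ** 3)
        l += 545140134
        x *= -262537412640768000
        s += Decimal(m * l) / x
        k6 += 12

    pi = c / s
    return str(pi)[: digits + 2]  # "3." plus the requested digits


if __name__ == "__main__":
    print(chudnovsky_pi(50))
```

Record attempts rely on far more elaborate techniques (binary splitting, disk-backed arithmetic, checkpointing) than this in-memory sketch, which is only practical for small digit counts.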
Pi calculations do not mirror typical enterprise workloads such as databases or machine learning training, but their determinism and large scale make them useful for evaluating sustained performance of CPU, memory, and storage subsystems under continuous load. Pi is also used in various security functions including random number generation (because computers can't be truly random), cryptographic algorithms, hash functions, digital signatures, and secure communications protocols like SSL/TLS.

What the 314 trillion digit run represents

In December 2025, StorageReview reported a new record by calculating pi to 314 trillion digits on a single server that ran continuously for approximately 110 days before completion. The achievement emphasizes not only the scale of the computation but also the role of storage architecture, non-uniform memory access (NUMA) tuning, and system stability in sustaining such a workload. The raw output of the run, including checkpoints, extended beyond 2PB of data. The finalized dataset hosted in Backblaze B2 exceeds 130TB and is divided into 200GB objects suitable for staged retrieval. Engineers, researchers, and pi enthusiasts can freely retrieve their own slice of pi (or the whole thing) for analysis, performance characterization, and tool validation. Structuring the dataset into manageable objects enables selective download for analysis, parallelized workflow testing, and evaluation of sustained object retrieval performance.

How to access the dataset

The 314 trillion-digit dataset is available today via Backblaze B2 Cloud Storage. To request access:

  • Visit the pi landing page.
  • Submit the required information to receive credentials.
  • Use the provided instructions to download via rclone, an open-source cloud storage management tool.

The object layout supports both partial and full dataset retrieval strategies.

Enjoy your pi!

With all the ways you can use the pi dataset, we can't wait to hear what you all are working on.
Feel free to let us know what you're working on in the comments section below, on socials, or by email. Happy experimenting!
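
For anyone planning a staged retrieval, the figures quoted above (a dataset of over 130TB split into 200GB objects) allow a rough back-of-the-envelope estimate. The sketch below assumes decimal units (1TB = 10^12 bytes), treats 130TB as a lower bound, and uses a hypothetical 10 Gbit/s sustained link; none of these assumptions come from the article, and the helper names are made up for illustration.

```python
def object_count(dataset_bytes: int, object_bytes: int) -> int:
    """Number of fixed-size objects needed to hold the dataset (ceiling)."""
    return -(-dataset_bytes // object_bytes)


def transfer_hours(dataset_bytes: int, gbit_per_s: float) -> float:
    """Hours to move the dataset at a sustained line rate, ignoring overhead."""
    seconds = dataset_bytes * 8 / (gbit_per_s * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    dataset = 130 * 10**12   # 130TB, decimal units (assumed lower bound)
    obj = 200 * 10**9        # 200GB per object

    print(object_count(dataset, obj))                # number of objects
    print(round(transfer_hours(dataset, 10.0), 1))   # hours at 10 Gbit/s
```

Because the objects are independent, a partial retrieval scales linearly: fetching a tenth of the objects takes roughly a tenth of the time under the same assumptions.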

Yahoo Finance
Mar 10th, 2026
Backblaze posts 27.6% EBITDA margin in Q4 despite mixed revenue trends

Backblaze Inc., a cloud storage platform, has received mixed analyst reactions following its fourth-quarter results. The company reported a record adjusted EBITDA margin of 27.6% and made progress on operational and free cash flow fronts, though revenue trends varied across its B2 and Computer Backup segments. On 24 February, Oppenheimer reduced its price target on Backblaze from $9.50 to $8.50 whilst maintaining an Outperform rating, suggesting upside potential exceeding 124%. The same day, Craig-Hallum downgraded the stock from Buy to Hold, setting a target price of $4.50 with almost 19% upside potential. Some B2 deals closed later in the quarter, contributing to uneven revenue performance despite overall operational progress. The company provides cloud storage services for public, hybrid and multi-cloud data storage.

Business Wire
Feb 26th, 2026
Backblaze launches Advanced Installer and CLI to give IT teams greater endpoint backup control

Backblaze has introduced two new tools for its Computer Backup product: the Advanced Installer and the Backblaze Command Line Interface (bzcli). The tools give IT teams greater control over endpoint backup deployments across large or distributed environments. The Advanced Installer allows administrators to preconfigure and lock client settings before deployment, including backup schedules, file exclusions and security preferences. It integrates with deployment platforms such as Jamf, Kandji and Addigy. The bzcli tool enables remote configuration and reporting after installation through structured JSON input files. Administrators can modify settings, update schedules and retrieve reporting information through automation tools. The tools are designed to help organisations standardise backup policies and reduce configuration variability across growing teams without requiring manual user intervention.