Senior Site Reliability Engineer
Remote • United States
- Excellent communication skills and a sense of ownership
- 4+ years of relevant professional experience. You have a software engineering and/or operations background and have worked as an SRE or in a related role before
- Experience architecting, developing, and troubleshooting distributed systems
- Fluency in design patterns for building performant, resilient, and highly available systems
- Proficiency as a software developer: you can not only read and write code, but also identify opportunities and implement sound solutions to automate routine tasks and eliminate toil
- Experience with system architecture. You can create a design document for a performant and highly available application, involving multiple types of storage, cross-region load-balancing, caching layers and messaging infrastructure
- Excitement for blockchain and Web 3.0
- Willingness to go on-call. Reliability is our most important feature, and because on-call is an essential component of a reliable system, we take it very seriously
- Some of the tools and services we use daily or almost daily are:
- AWS; Terraform/Terragrunt; Kubernetes, Calico and ArgoCD; Prometheus and Grafana; GitHub Actions; Packer
- We expect you to be comfortable with most of those tools and very proficient in several of them
- Maintain all on-chain and job orchestration configurations
- Automate and reduce complexities around product operations
- Evangelize and enact best practices as experts to guide high-quality Site Reliability Engineering
- Make tooling user-friendly and accessible to create self-sufficient operational experts across the company and our network of Node Operators
- Continue delivering operational tasks within agreed SLAs to expand scalability and reliability
- Deliver high product velocity while protecting reliability and operability
- Support production systems by being on-call
- Deploy and maintain various externally-facing services
- Improve the reliability and observability of Chainlink services
- Provide our engineers with reliable automations and empower them to deploy and maintain Chainlink services in a repeatable and stable manner
- Support monitoring services that watch over the entire Chainlink network
- Support Incident Response by shortening the duration of incidents while keeping an active feedback loop that ensures the operations and reliability of our systems improve over time
- Support services before they go live through activities such as system design consulting, capacity planning and launch reviews
- Engage in and improve the whole lifecycle of services, from inception and design through to deployment, operation, and refinement
- Manage execution of project priorities, deadlines, and deliverables
- Provide technical leadership for the local team and work closely with partner team technical leads
- Professional experience with Golang, TypeScript, or both
- Experience running a blockchain full node as an operator is a big plus
- Experience with Chainlink as a developer or a node operator is a big plus
- Comfort working with network protocols, proxies, and load balancers
- Experience with CI/CD pipelines. You've worked on both software delivery and cloud-based services deployment
- Experience with information security and DevSecOps
- Experience working remotely in a distributed team
- Experience with container orchestration
Open-source, data-enabled blockchain solutions
Chainlink provides reliable, tamper-proof inputs and outputs for complex smart contracts on any blockchain. Chainlinks provide a reliable connection to external data that is provably secure end-to-end.
Company Core Values
- Being part of an idea meritocracy: We take ownership in the work we do and value real impact. There are no limits to what you can get involved in.
- An open-source software ethos: We’re driven by creating a public good. What we build is by everyone, for everyone.
- Being immersed in excellence: We hold ourselves to the highest standards—for our teams, our product, and our position in the ecosystem.
- Building a world-changing product: Smart contracts are redefining major industries, and we’re at the forefront of this global shift.
- Being part of a high-growth startup: Experience rapid personal and professional growth by accessing a broad scope of work and exceptional mentorship.
- Autonomous, remote & flexible work: We’re a remote-first workplace that values autonomy, flexibility, and results.