Full-Time

Platform Operations Lead

NIH-NCBI

Compensation Overview

$135k - $165k/yr

Bethesda, MD, USA

Hybrid

Category
Operations & Logistics
Required Skills
TCP/IP
Microservices
Linux/Unix
Requirements
  • BS degree in science, technology, engineering, or mathematics or equivalent experience
  • Customer-focused, team-oriented disposition
  • Good systems debugging skills
  • Comfortable with the Linux environment or UNIX command line interface
  • Experience with some programming or scripting language
  • Experience creating processes, procedures and standard operating procedure documentation
  • General understanding of TCP/IP, HTTP, and related protocols
  • Initiative to take ownership of tasks and drive them to completion
  • Comfortable dealing with users with varying levels of IT knowledge
  • Eager to learn new technologies
  • Strong communication and soft skills to interface with customers, peers and management
  • Good judgement, sense of integrity, and responsibility
Responsibilities
  • Identify and resolve operational problems in a microservice environment
  • Work with developers to resolve deployment and runtime problems
  • Perform analysis and debugging work across multiple technologies
  • Prioritize issues to keep applications within their error budgets and service level objectives
  • Provide technical solutions to a wide range of problems and user requests
  • Document processes, procedures, and standard operating procedures, soliciting feedback and suggestions from team members
  • Compile postmortems and action items to minimize future outages
  • Interview candidates for team roles and recommend which to hire
  • Train new team members, and assist them with issues
  • Provide on-call support to NCBI's internal developers and other staff
Desired Qualifications
  • Kubernetes, OpenShift, Cloud or Linux experience
  • Experience with Site Reliability Engineering in any capacity
  • Linux systems administration
  • Automated CI servers, especially TeamCity and/or GitLab
  • Automation programming/scripting in any of: bash, Ruby, Python, Go, Java, Scala, Rust, C++, Perl
  • Automated configuration management, such as Puppet, Ansible, Chef, bcfg2, or cfengine (Puppet preferred)
  • Version control systems, especially git
  • Service Mesh technologies (e.g., Linkerd, Istio)
  • Configuring or using monitoring and alerting technologies (TIGK stack, Grafana, Prometheus, OpsGenie)
  • Confluence, Jira, and Microsoft Office suite
  • GitOps tools, especially ArgoCD
  • Google Anthos
  • Understanding of Linux internals (system calls, file systems, processes, etc.)
  • Understanding of Linux network configuration
  • Understanding of Linux application containerization, especially Docker
  • Understanding of attached network storage technologies
  • Understanding of cloud computing environments such as AWS, GCP or Azure
  • Understanding of automated CI/CD pipelines
  • Understanding of distributed systems design principles
