Full-Time

Sr. Site Reliability Engineer

CDK Global

5,001-10,000 employees

Integrated software solutions for automotive retail

Automotive & Transportation
Enterprise Software

Compensation Overview

$110k - $140k Annually

+ Bonus

Senior, Expert

H1B Sponsorship Available

Hoffman Estates, IL, USA

Category
DevOps & Infrastructure
Site Reliability Engineering
Required Skills
RabbitMQ
React.js
Apache Kafka
Java
Postgres
CloudFormation
AWS
Prometheus
Terraform
Nginx
MongoDB
Oracle
Linux/Unix
AngularJS
Requirements
  • Bachelor’s degree, or equivalent experience, in Computer Science, Engineering, or related field, with 8+ years of relevant experience with large-scale enterprise-grade solutions.
  • A strong background in architecture / design, and currently working in a similar role in a forward-thinking, fast-paced business.
  • 4+ years of professional SRE experience relevant to the responsibilities listed below, including event-driven architectures and cloud-native, distributed / SaaS solutions
  • 4+ years of experience with CI/CD pipelines, infrastructure as code, proactive monitoring, smart alerting, ensuring performance / scalability and proactive capacity management of enterprise-grade solutions.
  • Expertise troubleshooting across the entire stack: network, server, operating system, and application
  • Expertise with monitoring and alerting tools (e.g., New Relic, Prometheus, Grafana)
  • Strong analytical and problem-solving skills, with a keen attention to detail
  • Experience with microservices, Java, Node, Kafka / RabbitMQ, Oracle / PostgreSQL, MongoDB / DynamoDB, React / Angular, Istio, NGINX, F5, AWS API Gateway, ECS, CloudFormation, Terraform
  • Experience deploying, maintaining and troubleshooting containerized applications
  • A level of comfort with Linux
  • Solid communication and collaboration skills
  • Certification in AWS or related cloud technologies
  • Automotive retail experience
Responsibilities
  • Engage in and improve the whole lifecycle of solutions, from inception and design, through to build/test, deployment, operation and refinement
  • Ensure our solutions are reliable, fault-tolerant, secure, efficiently scalable, available, reachable and cost-effective
  • Measure, monitor and proactively alert on resource consumption, error rates, traffic anomalies, availability, performance, reachability and overall system health
  • Prevent disruptions to users. If a disruption does occur, respond quickly and resolve the incident efficiently
  • Expertly troubleshoot issues with distributed systems, interactions between cloud technology layers and components, and common dependencies at scale
  • Practice sustainable incident response, blameless postmortems and prompt implementation of recommended changes to prevent recurrence
  • Contribute to the development and implementation of routine maintenance automation and alerting
  • Recommend optimal configurations of cloud technology solutions and modify the code base that defines systems or cloud technologies to improve the reliability, availability, efficiency, observability, performance and operability of supported products
  • Collaborate well with cross-functional teams across product, architecture, engineering, infrastructure, and security to ensure that reliability standards are integrated into the development and deployment of all solutions
  • Maintain up-to-date documentation on system configurations, incident response protocols, and operational best practices
  • Earnestly participate in code/design reviews, and regular meetings with the engineering teams that develop and/or manage the products in question
  • Research and maintain an awareness of industry trends, advances in distributed systems and cloud technologies, tools, and/or processes for maintaining and improving product availability, reliability, efficiency, observability, and/or performance
  • Contribute to the implementation of new solutions within the team by identifying ways they can be applied to solve persistent problems
  • Ensure that uniform enterprise-wide architecture and design standards are adhered to for high availability of products, services and databases.

CDK Global provides integrated software solutions specifically designed for the automotive retail industry. Their products help auto dealerships manage various operations such as billing, customer relationship management (CRM), inventory management, and service scheduling. By using these tools, dealerships can streamline their processes, improve customer experiences, and increase sales. Unlike many competitors, CDK Global focuses on the unique challenges of the automotive market, including the transition to electric vehicles (EVs), and tailors its solutions accordingly. The company's goal is to enhance the efficiency and productivity of its clients through advanced technology, ultimately driving the automotive retail industry forward.

Company Stage

IPO

Total Funding

N/A

Headquarters

Hoffman Estates, Illinois

Founded

1972

Simplify Jobs

Simplify's Take

What believers are saying

  • Increased focus on cybersecurity could enhance CDK Global's reputation and client trust.
  • Digital transformation in automotive industry presents expansion opportunities for CDK Global.
  • Rising EV adoption allows CDK Global to develop specialized software for EV dealerships.

What critics are saying

  • Recent cybersecurity breach led to significant operational disruptions for CDK Global's clients.
  • Antitrust lawsuit resulted in a $100 million payout, indicating potential financial instability.

What makes CDK Global unique

  • CDK Global offers integrated software solutions tailored for the automotive retail industry.
  • The company focuses on enhancing dealership efficiency through advanced technology solutions.
  • CDK Global's subscription-based model aligns with industry trends, ensuring revenue stability.
