Data Engineer II
Posted on 5/15/2023
INACTIVE
GoodRx

501-1,000 employees

Prescription drug price tracking platform
Company Overview
GoodRx's mission is to build better ways for people to find the right care at the best price. GoodRx's healthcare marketplace platform offers solutions for consumers, employers, health plans, and anyone else who shares our desire to provide affordable prescriptions to all Americans.
Locations
San Francisco, CA, USA • New York, NY, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
AWS
Apache Kafka
Data Analysis
Docker
Google Cloud Platform
JIRA
Git
Airflow
Microsoft Azure
Redshift
SQL
Terraform
Kubernetes
Python
Categories
Data & Analytics
DevOps & Infrastructure
Requirements
  • Bachelor's degree in analytics, statistics, engineering, math, economics, science, or a related discipline
  • 3+ years of professional experience with at least one cloud data platform, such as AWS, Azure, or GCP
  • 3+ years of experience engineering data pipelines over large-scale data sets using big data technologies (Python, PySpark, and real-time data platforms such as ActiveMQ, Kafka, or Kinesis)
  • Strong experience writing complex SQL and developing ETL workflows that process extremely large data sets
  • Demonstrated ability to analyze large data sets to identify gaps and inconsistencies, provide data insights, and advance effective product solutions
  • Familiarity with AWS services (S3, EventBridge, Glue, EMR, Redshift, Lambda)
  • Ability to quickly learn complex domains and new technologies
  • Innately curious and organized, with the drive to analyze data to identify deliverables, anomalies, and gaps, and to propose solutions that address these findings
  • Thrives in fast-paced startup environments
  • Experience using Jira, GitHub, Docker, CodeFresh, Terraform
  • Experience contributing to full lifecycle deployments with a focus on testing and quality
  • Experience with data quality processes: data quality checks, validations, and the definition and measurement of data quality metrics (see the sketch after this list)
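For a concrete picture of the data-quality work this role involves, below is a minimal illustrative sketch in PySpark. The claims table, column names, and thresholds are hypothetical examples, not details of GoodRx's actual stack.

```python
# Minimal, illustrative data-quality check in PySpark.
# Table name, columns, and thresholds are hypothetical examples.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq_check_example").getOrCreate()

# In a real pipeline this would read from S3/Glue/Redshift; here we
# build a tiny in-memory frame so the script runs anywhere.
claims = spark.createDataFrame(
    [("rx-1", "2023-05-01", 12.50),
     ("rx-2", "2023-05-01", None),
     ("rx-2", "2023-05-01", 8.75)],
    ["claim_id", "fill_date", "price_usd"],
)

total = claims.count()

# Check 1: null rate on a required column must stay under 1%.
null_rate = claims.filter(F.col("price_usd").isNull()).count() / total

# Check 2: the business key must be unique.
dupes = (claims.groupBy("claim_id").count()
               .filter(F.col("count") > 1).count())

failures = []
if null_rate > 0.01:
    failures.append(f"price_usd null rate {null_rate:.1%} exceeds 1%")
if dupes > 0:
    failures.append(f"{dupes} duplicate claim_id values found")

# Failing loudly lets the orchestrator (e.g. Airflow) mark the run red.
if failures:
    raise ValueError("; ".join(failures))
print("All data-quality checks passed")
```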
Responsibilities
  • Collaborate with product managers, data scientists, data analysts and engineers to define requirements and data specifications
  • Develop, deploy, and maintain data processing pipelines using cloud technology such as AWS, Kubernetes, Airflow, Redshift, and EMR (a minimal Airflow sketch follows this list)
  • Develop, deploy, and maintain serverless data pipelines using EventBridge, Kinesis, AWS Lambda, S3, and Glue
  • Define and manage the overall schedule and availability for a variety of data sets
  • Work closely with other engineers to enhance infrastructure, improve reliability and efficiency
  • Make smart engineering and product decisions based on data analysis and collaboration
  • Act as an in-house data expert and make recommendations regarding standards for code quality and timeliness
  • Architect cloud-based data infrastructure solutions to meet stakeholder needs
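To ground the pipeline responsibilities above, here is a minimal Airflow DAG sketch for a daily extract-transform-load run. The DAG id, schedule, and task bodies are hypothetical placeholders; in a real deployment the transform step might submit a PySpark job to EMR or Glue and the load step might COPY results into Redshift.

```python
# Minimal, illustrative Airflow DAG for a daily batch pipeline.
# DAG id, schedule, and task logic are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # In practice this might pull from S3, Kinesis, or an upstream API.
    print("extracting source data")


def transform():
    # In practice this might submit a PySpark job to EMR or Glue.
    print("transforming data")


def load():
    # In practice this might COPY the results into Redshift.
    print("loading the data warehouse")


with DAG(
    dag_id="example_daily_pipeline",
    start_date=datetime(2023, 5, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Define the run order: extract -> transform -> load.
    extract_task >> transform_task >> load_task
```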