Data Pipeline Engineer
CarOffer
Posted on 3/21/2024
CarGurus

1,001-5,000 employees

Company Overview
CarGurus provides an online marketplace for buying and selling cars, with a focus on personalized finance options and data analytics. Its core technologies include online marketplace tooling, personalized finance algorithms, and data analytics for car sales.
Consumer Goods
Data & Analytics

Company Stage: Series A
Total Funding: $1.8M
Founded: 2006
Headquarters: Cambridge, Massachusetts

Growth & Insights (Headcount)
6 month growth: 3%
1 year growth: 12%
2 year growth: 19%
Locations
Dallas, TX, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
Microsoft Azure
Redshift
Python
BigQuery
Apache Kafka
Java
AWS
REST APIs
Data Analysis
Snowflake
Google Cloud Platform
Categories
Data Engineering
Data & Analytics
Requirements
  • Bachelor's degree in Computer Science, Engineering, or a related field; Master's degree preferred.
  • 1-3 years of experience in data engineering, software development, or a related field, with a focus on building and maintaining data pipelines.
  • Hands-on experience with data pipelining tools and technologies such as Apache Kafka, Apache NiFi, or AWS Glue is preferred (a minimal Kafka sketch follows this list).
  • Proficiency in programming languages such as Python, Java, or Scala, with experience in scripting and automation.
  • Familiarity with cloud platforms and services, such as AWS, Azure, or Google Cloud Platform, and experience with cloud-based data services (e.g., S3, Redshift, BigQuery).
  • Strong understanding of data formats and protocols, including JSON, XML, REST APIs, and FTP/SFTP.
  • Excellent problem-solving and analytical skills, with the ability to troubleshoot and resolve complex data pipeline issues.
  • Strong communication and collaboration skills, with the ability to work effectively in a team environment and interact with stakeholders at all levels.
  • Self-motivated and proactive, with a passion for learning and staying abreast of new technologies and industry trends.
  • Ability to prioritize tasks, manage time efficiently, and meet project deadlines in a fast-paced, dynamic environment.
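For illustration only, here is a minimal Python sketch of the kind of work these requirements describe: consuming JSON records from an Apache Kafka topic and validating required fields before they move downstream. The topic name, broker address, and schema fields are hypothetical placeholders, not details taken from this posting.

import json

from kafka import KafkaConsumer  # pip install kafka-python

REQUIRED_FIELDS = {"listing_id", "price", "updated_at"}  # hypothetical schema

consumer = KafkaConsumer(
    "vehicle-listings",                  # hypothetical topic name
    bootstrap_servers="localhost:9092",  # assumed local broker
    group_id="pipeline-demo",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:
    record = message.value
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        # A production pipeline would route these to a dead-letter topic or error sink.
        print(f"dropping record, missing fields: {missing}")
        continue
    print(f"valid record for listing {record['listing_id']}")

Note that malformed JSON would raise inside the deserializer; in practice you would catch that and route the raw bytes to an error sink rather than crash the consumer.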
Responsibilities
  • Collaborate with stakeholders to understand data requirements and source systems, identifying data sources and defining data ingestion strategies.
  • Design and implement robust data pipelines to ingest, process, and stream data into our Snowflake data warehouse from various sources, ensuring data quality and reliability (a minimal loading sketch follows this list).
  • Develop connectors and integrations with web analytics platforms, third-party APIs, and FTP servers to automate data extraction and ingestion processes.
  • Implement industry-standard practices for data pipelining, including error handling, data validation, and monitoring, to ensure data integrity and availability (see the retry sketch after this list).
  • Optimize data pipeline performance and resource utilization, identifying bottlenecks and opportunities for improvement in data processing and streaming.
  • Collaborate with data engineers, software engineers, and infrastructure teams to integrate data pipelines into our data ecosystem and ensure seamless data flow.
  • Document data pipeline architecture, configurations, and dependencies, ensuring clear and comprehensive documentation for future reference and troubleshooting.
  • Monitor data pipeline health and performance metrics, proactively identifying and addressing issues to minimize downtime and ensure data availability.
  • Stay current with emerging technologies and best practices in data engineering and streaming technologies, continuously evaluating and adopting new tools and techniques.
  • Participate in team meetings, code reviews, and knowledge-sharing sessions, contributing to a collaborative and supportive team culture.
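As a companion to the responsibilities above, here is a minimal sketch of pulling records from a third-party REST API and landing them in a Snowflake raw table, using the requests library and the official Snowflake Python connector. The endpoint URL, table name, and connection parameters are hypothetical placeholders, not details from this posting.

import requests
import snowflake.connector  # pip install snowflake-connector-python

# Hypothetical endpoint; a real pipeline would read this from configuration.
resp = requests.get("https://api.example.com/v1/listings", timeout=30)
resp.raise_for_status()
rows = [(r["listing_id"], r["price"]) for r in resp.json()]

conn = snowflake.connector.connect(
    account="my_account",  # placeholder connection parameters
    user="etl_user",
    password="...",
    warehouse="ETL_WH",
    database="RAW",
    schema="LISTINGS",
)
try:
    cur = conn.cursor()
    # Land data in a raw table; dedup/merge logic belongs in a later step.
    cur.executemany(
        "INSERT INTO raw_listings (listing_id, price) VALUES (%s, %s)",
        rows,
    )
    conn.commit()
finally:
    conn.close()

At larger volumes the idiomatic Snowflake path is to stage files (e.g., in S3) and load them with COPY INTO rather than row-by-row inserts.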
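The error-handling and monitoring responsibility could look something like the following sketch: a retry wrapper with exponential backoff that logs each failure so a scheduler or alerting system can pick it up. The function name and retry parameters are illustrative assumptions, not a prescribed approach.

import logging
import time

import requests

log = logging.getLogger("pipeline")

def fetch_with_retry(url: str, attempts: int = 3, backoff: float = 2.0) -> dict:
    """Fetch JSON over HTTP, retrying transient failures with backoff."""
    for attempt in range(1, attempts + 1):
        try:
            resp = requests.get(url, timeout=30)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as exc:
            log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # surface the failure to the scheduler's alerting
            time.sleep(backoff ** attempt)  # exponential backoff: 2s, 4s, ...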