Data Engineer
Posted on 9/1/2022
INACTIVE
Cloud computing services for pharmaceutical companies.
Company Overview
Veep's mission is to help R&D, quality, and regulatory teams eliminate inefficiencies and bring safe, sustainable products to market without compromising quality. The company builds cloud-based tools for pharmaceutical research.
Company Stage
N/A
Total Funding
$224M
Founded
2007
Headquarters
Pleasanton, California
Growth & Insights
Headcount
6 month growth
↑ 3%
1 year growth
↑ 16%
2 year growth
↑ 37%
Locations
Toronto, ON, Canada
Desired Skills
Agile
Redshift
Python
Apache Spark
SQL
Java
AWS
Scala
Data Analysis
Categories
AI & Machine Learning
Data & Analytics
Requirements
- Bachelor's degree in Computer Science, Engineering or a related discipline
- 3+ years of experience working on Apache Spark applications using Python (PySpark) or Scala
- Experience creating Spark jobs that process at least 1 billion records
- Intermediate or greater SQL knowledge
- Experience creating data pipelines in a production system
- Experience working on AWS environments (EMR, S3, Glue, Redshift, Athena)
- Strong mentoring skills, with a proven record of making your team better
Responsibilities
- Design and build applications that perform data analysis, transformations, aggregations, and other augmentations on large sets of data in a Spark-based AWS environment (EMR, S3, Glue, Redshift, Athena)
- Evaluate pipeline models, tools, and environments, and implement them to move data from our sources, through your transformations, and on to our customers
- Work with product management and data research teams to prototype and test new ideas, then take them to production
- Work in a fast-paced, test-driven environment
Desired Qualifications
- Experience working with Data Quality techniques
- Java development experience
- Experience working with Machine Learning/AI models
- Experience with agile methodologies