Data Engineer @ Veeva Systems

INACTIVE

Full-Time

Data Engineer

Posted on 4/3/2024

Veeva Systems

5,001-10,000 employees

Cloud solutions for life sciences sector

Mid

Remote in USA

Required Skills

Agile

Python

Data Structures & Algorithms

Apache Spark

SQL

AWS

Data Analysis

Requirements

3+ years of experience developing data pipelines using cloud-managed Spark clusters (e.g. AWS EMR, Databricks)
Fluent in Python programming language and PySpark (3+ years of experience)
Previous experience building tools and libraries to automate and streamline data processing workflows
Proficient with SQL / SparkSQL
Hands-on experience working with a Data Lakehouse
Good verbal and written communication and proven experience of working and delivering in an Agile environment
Applicants must have the unrestricted right to work in the United States. Veeva will not provide sponsorship at this time
Strong mentors with a proven record of making their teams better

Responsibilities

Build and maintain data processing pipelines and tools using state-of-the-art technologies
Work with Python and SQL on Spark-based data pipelines
Develop algorithms to build complex data relationships
Build analytical data structures to support reporting
Build and maintain Data Quality processes
Collaborate with Product team to adapt reference data to changing demands in the market

The Role

Veeva OpenData supports the industry by providing real-time reference data across the complete healthcare ecosystem, to support commercial sales execution, compliance, and business analytics. We drive value to our customers through constant innovation, using cloud-based solutions and state-of-the-art technologies to deliver product excellence and customer success.

As a Data Engineer in OpenData, you will take responsibility for the OpenData processing workflows in the US. You will build and maintain data processing tools, pipelines, and reports, ensuring the quality of our reference data. We value end-to-end ownership, which gives you the freedom to determine the correct course of action, do all due diligence, and execute solutions in your own creative way.

Nice to Have

Experience running data workflows through DevOps pipelines
Experience developing data pipelines with orchestration tools (e.g. Airflow)
Experience with AWS data processing services such as EMR and MWAA
Previous experience in the Life Sciences sector

Perks & Benefits

Medical, dental, vision, and basic life insurance
Flexible PTO and company paid holidays
Retirement programs
1% charitable giving program

Compensation

Base pay: $75,000 - $130,000
The salary range listed here is provided to comply with local regulations and represents a potential base salary range for this role. Actual salaries may fall within, above, or below this range, depending on experience and location. We look at compensation for each individual and base our offer on your unique qualifications, experience, and expected contributions. This position may also be eligible for other types of compensation in addition to base salary, such as a variable bonus and/or stock bonus.

Veeva Systems

Veeva Systems offers industry cloud solutions for the life sciences sector, providing technologies such as Vault Clinical Data Management, Vault EDC, Vault Coder, Vault Clinical Operations, Vault RIM Suite, Vault Quality Suite, Vault Safety Suite, Veeva Medical Suite, Veeva Data Cloud, and Veeva Commercial Cloud to support critical functions from R&D through commercialization. These technologies aim to streamline quality processes, manage clinical data, and improve regulatory compliance for life sciences companies.

Company Stage

IPO

Total Funding

$224M

Headquarters

Pleasanton, California

Founded

2007

Growth & Insights

Headcount

6 month growth

↑ 3%

1 year growth

↑ 10%

2 year growth

↑ 37%

Benefits

Parental leave

PTO

Free food

Health, dental, & vision insurance

Gym membership reimbursement
