Data Engineer
Posted on 3/29/2023
INACTIVE
Caliva

201-500 employees

Caliva is San Jose's premier cannabis dispensary and cultivation facility.
Locations
San Jose, CA, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
AWS
BigQuery
Data Structures & Algorithms
Google Cloud Platform
Git
Airflow
Pandas
REST APIs
SQL
Python
Categories
Data & Analytics
Requirements
  • Bachelor's degree or higher in an engineering or technical field such as Computer Science, Physics, Mathematics, Statistics, Engineering, Business Administration, or a similar field, or an equivalent combination of education and experience
  • 4+ years of experience in a data engineering role supporting production systems
  • 1+ years of experience extracting data from REST APIs
  • 1+ years of experience managing a codebase in GitHub
  • Previous experience developing ETL pipelines using technologies such as Airflow (preferred), Luigi, Oozie, Azkaban, etc.
  • Previous experience developing data models to support a data warehouse
  • Experience manipulating and de-normalizing data in JSON format for storage in relational databases
  • Experience with Google Cloud Platform or AWS cloud services
  • (Preferred) Knowledge and experience with Kubernetes and/or Docker
  • (Preferred) Advanced knowledge of SQL and experience working with relational databases. BigQuery experience is a plus
  • Work revolves around objectives, projects and priorities, not hours; must be able to work weekends, holidays, and occasional overtime as needed
  • Must be able to stand, walk, lift, sit, and bend for a majority of their work schedule
  • Must be able to travel to other office locations
  • Ability to use a computer and calculator for 8 hours or more
  • Must be 21 years of age or older
  • Must comply with all legal or company regulations for working in the industry
  • Selected candidate will be required to complete a post-offer, pre-employment background check with local law enforcement or the San Jose Police Department
Responsibilities
  • Assist with the implementation of new systems and updates to existing systems by leading the data strategy for each, assuring data integrity, value and access
  • Establish best practices for our data engineering function and strategy
  • Develop appropriate data schemas and structures for use in downstream models/reports
  • Develop a data management and oversight program spanning dozens of source systems across all departments, creating new ETL pipelines and maintaining existing ones, ensuring data richness and quality
  • Engineer for capacity and performance, provide forecasting and future planning, and review and consider technology trends
  • Recommend and develop changes to source data structures/systems based on observations of data within the context of operational use
  • Assemble large, complex data models to meet the needs of operational and strategic stakeholders
  • Work closely with our in-house analysts to integrate SQL data models into a dependency tree
  • Document and maintain our data lineage and data dictionary
  • Other duties and responsibilities as assigned by management
Desired Qualifications
  • 1+ years of experience manipulating data using Python (experience with Pandas is a plus)