Role Summary
We are seeking a skilled GCP Data Engineer to design, build, and maintain scalable data pipelines on Google Cloud Platform (GCP). The role supports enterprise analytics, reporting, and real‑time data processing, working closely with global stakeholders, architects, and analytics teams.
DevOps, Security & Governance
- CI/CD pipelines (Cloud Build, Jenkins, GitLab)
- Infrastructure as Code (Terraform preferred)
- GCP IAM, service accounts, and role‑based access control
- Encryption, PII handling, and compliance awareness
- Logging and monitoring using GCP‑native tools (Cloud Logging, Cloud Monitoring)
Cloud & Architecture Skills
- GCP project setup and environment management
- Basic networking concepts (VPC, subnets, firewall rules)
- Scalable, fault‑tolerant cloud architecture
- Cost optimization and quota management
Preferred Qualifications
- Google Cloud Professional Data Engineer certification
- Experience migrating data platforms from AWS or Azure to GCP
- Exposure to BI tools such as Looker
- Agile/Scrum methodology experience
Offshore Delivery Expectations
- Strong communication skills (written and verbal)
- Ability to work hours that partially overlap with other time zones
- Self‑driven and delivery‑focused mindset
Key Responsibilities
- Design, develop, and maintain batch and streaming data pipelines on GCP
- Build and optimize datasets using BigQuery for analytical and reporting use cases
- Develop data ingestion pipelines using Dataflow (Apache Beam), Pub/Sub, and Cloud Storage
- Orchestrate workflows using Cloud Composer (Airflow)
- Implement data transformations, validations, and quality checks
- Optimize pipeline performance, scalability, and cost efficiency
- Collaborate with BI, analytics, and application teams to deliver trusted data
- Implement CI/CD, version control, and automated deployments
- Ensure data security, access control, and compliance standards are met
- Provide monitoring, troubleshooting, and production support
Mandatory Technical Skills
GCP & Data Services
- BigQuery (advanced SQL, partitioning, clustering, optimization)
- Cloud Storage (GCS)
- Dataflow (Apache Beam – batch & streaming)
- Pub/Sub
- Cloud Composer (Airflow)
- Dataproc (Spark/Hive – preferred)
Programming & Development
- Python (primary language)
- SQL (advanced analytics)
- Java or Scala (good to have)
Data Engineering Concepts
- ETL / ELT design patterns
- Data modeling (star/snowflake schemas)
- Handling structured and semi‑structured data
- Metadata management and schema evolution
- Data quality, reconciliation, and validation