Job Description
Publicis Sapient is looking for a Manager, Data Engineering (Databricks Platform Architecture): a top-notch technologist who will lead the build-out of global Databricks platforms that enable real business outcomes for our enterprise clients. You will create impact for some of the world’s biggest brands by translating their needs into transformative solutions that provide valuable insight. Working with the latest data technologies in the industry, you will be instrumental in helping our clients evolve for a more digital future.
Your Impact:
- Act as a Databricks global platform lead as part of large digital transformation journeys
- Advance the use of the Databricks data platform as a core building block for true business transformation
- Lead data migration and data modernization projects onto Databricks
- Build complex data ingestion, processing, and consumption pipelines and storage layers on Databricks
- Work closely with our clients in understanding their needs and translating them to technology solutions
- Provide technical expertise to solve complex business problems through data integration and Databricks system designs
- Shape opportunities and create execution approaches throughout the lifecycle of client engagements
- Ensure all deliverables are of high quality by setting development standards, adhering to them, and participating in code reviews
- Mentor, support, and manage team members
Qualifications
Your Skills and Experience:
- Databricks Architecture: Deep understanding of Databricks architecture, including clusters, notebooks, jobs, and the underlying compute and storage layers. Architect level, with hands-on Databricks work over the past 5+ years
- Databricks Platform: Experience building Databricks as a global platform (across multiple regions) that supports an organization’s multiple Lines of Business (LOBs)
- Apache Spark: Proficiency in Apache Spark, including its core components (Spark SQL, Structured Streaming, and MLlib); see the streaming sketch after this list
- Delta Lake: Knowledge of Delta Lake, its features (ACID transactions, time travel, etc.), and its role in data lakes and data warehouses; a short example follows this list
- Databricks SQL: Experience using Databricks SQL for data querying, analysis, and visualization
- Databricks Jobs: Ability to create and manage complex data pipelines and workflows using Databricks Jobs; an illustrative Jobs API sketch appears after this list
- Cluster Management: Understanding of cluster configurations, autoscaling, and performance optimization.
- Unity Catalog: Hands-on Unity Catalog experience (catalogs, schemas, and fine-grained access control) required
- Infrastructure & Security: Deep understanding of AWS or Azure cloud essentials (storage, networking, and identity and access management) and, above all, network and data security, including handling sensitive data and ensuring compliance with regulations such as GDPR and CCPA
- Networking: Understanding of network configurations, VPCs, and security groups for Databricks deployments.
- Cost Optimization: Ability to analyze and optimize Databricks costs by leveraging features like spot instances, cluster policies, and auto-termination; see the cluster-policy sketch after this list
- Infrastructure & Terraform: Infrastructure experience required, along with familiarity with Terraform scripts for provisioning Databricks and cloud resources
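
To give a flavor of the Spark proficiency described above, here is a minimal PySpark Structured Streaming sketch. The rate source, console sink, and runtime are illustrative choices for the example, not part of the role description.

```python
from pyspark.sql import SparkSession

# Assumes a Spark environment (e.g., a Databricks cluster or a local pyspark install).
spark = SparkSession.builder.getOrCreate()

# The built-in rate source generates test rows; real pipelines would typically
# read from Kafka, cloud storage, or similar sources instead.
events = (
    spark.readStream.format("rate")
    .option("rowsPerSecond", 10)
    .load()
)

# Write the stream to the console sink and run briefly (illustrative only).
query = events.writeStream.format("console").outputMode("append").start()
query.awaitTermination(10)  # run for ~10 seconds, then return
```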
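Similarly, a minimal sketch of the Delta Lake features named above (ACID appends and time travel), assuming a Delta-enabled Spark session; the table path and column name are hypothetical.

```python
from pyspark.sql import SparkSession

# Assumes Delta Lake is available (default on Databricks; via delta-spark locally).
spark = SparkSession.builder.getOrCreate()

path = "/tmp/delta/orders"  # hypothetical table location

# Each append is an ACID commit recorded in the Delta transaction log.
spark.range(100).withColumnRenamed("id", "order_id") \
    .write.format("delta").mode("append").save(path)

# Time travel: read the table as it existed at an earlier version.
v0 = spark.read.format("delta").option("versionAsOf", 0).load(path)
print(v0.count())
```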
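For the Databricks Jobs bullet, one common way to express a multi-task workflow is the Jobs REST API (version 2.1); this sketch posts a two-task pipeline sharing one job cluster. The workspace URL, token handling, notebook paths, and cluster settings are all illustrative.

```python
import requests

HOST = "https://example.cloud.databricks.com"  # hypothetical workspace URL
TOKEN = "dapi-..."  # placeholder; in practice, load from a secret store

# A two-task workflow: ingest, then transform (names and paths are illustrative).
job_spec = {
    "name": "orders-pipeline",
    "job_clusters": [
        {
            "job_cluster_key": "shared",
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
        }
    ],
    "tasks": [
        {
            "task_key": "ingest",
            "job_cluster_key": "shared",
            "notebook_task": {"notebook_path": "/Repos/data/ingest"},
        },
        {
            "task_key": "transform",
            "depends_on": [{"task_key": "ingest"}],
            "job_cluster_key": "shared",
            "notebook_task": {"notebook_path": "/Repos/data/transform"},
        },
    ],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print("created job", resp.json()["job_id"])
```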
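And for the cost-optimization bullet, cluster policies are one common guardrail: this sketch creates a policy that enforces auto-termination and spot capacity with on-demand fallback via the Cluster Policies API. The policy name and values are illustrative.

```python
import json
import requests

HOST = "https://example.cloud.databricks.com"  # hypothetical workspace URL
TOKEN = "dapi-..."  # placeholder; in practice, load from a secret store

# Policy rules: cap idle time and prefer spot capacity with on-demand fallback.
definition = {
    "autotermination_minutes": {"type": "fixed", "value": 30},
    "aws_attributes.availability": {"type": "fixed", "value": "SPOT_WITH_FALLBACK"},
}

resp = requests.post(
    f"{HOST}/api/2.0/policies/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"name": "cost-guardrails", "definition": json.dumps(definition)},
)
resp.raise_for_status()
print("created policy", resp.json()["policy_id"])
```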
Set Yourself Apart With:
- Certifications in any of the major cloud platforms, such as Azure, AWS, or GCP
- Certifications in machine learning or advanced analytics courses
- Experience working with code repositories and continuous integration pipelines using AWS CodeBuild/CodePipeline or similar tools/technologies
- Experience in data governance and lineage implementation
- Multi-geo and distributed delivery experience in large programs