AI Data Engineer Lead – Principal
Grindr

201-500 employees

Global location-based social networking app for LGBTQ+ community
Company Overview
Grindr, the world's largest social networking app for the LGBTQ+ community, connects queer individuals globally through its location-based technology, reaching millions of users in nearly every country. The company's culture fosters a safe and inclusive environment that reflects the diverse community it serves, and its industry leadership shows in its continuous expansion into new platforms and content. Its competitive advantage lies in accessibility: XTRA and Unlimited subscribers can use its services on any device without a download.

Company Stage

N/A

Total Funding

$643M

Founded

2009

Headquarters

West Hollywood, California

Growth & Insights
Headcount

6 month growth

-14%

1 year growth

-19%

2 year growth

9%
Locations
San Francisco, CA, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
Bash
Kubernetes
Microsoft Azure
Agile
Python
Airflow
NoSQL
R
Git
Apache Spark
SQL
Apache Kafka
Java
Docker
AWS
Pandas
Natural Language Processing (NLP)
Data Analysis
Snowflake
Google Cloud Platform
Categories
AI & Machine Learning
Data & Analytics
Requirements
  • Bachelor's degree in Computer Science, Mathematics, Physics, or a related field
  • 5+ years of experience as a data engineer building production-level pre/post-processing data pipelines for ML/DL models, including 2+ years of technical leadership experience
  • Experience in statistical analysis & visualization on datasets using Pandas or R
  • Experience designing and building highly available, distributed systems of data extraction, ingestion, normalization and processing of large data sets in real time as well as batch
  • Demonstrated experience creating data pipelines for text data sets used in NLP and large language models
  • Excellent coding skills in Python, Java, Bash, and SQL, and expertise with Git version control
  • Experience using big data technologies (Snowflake, Airflow, Kubernetes, Docker, Helm, Spark, PySpark)
  • Experience with a public cloud environment (AWS, GCP, or Azure)
  • Significant experience with relational databases and query authoring (SQL), as well as NoSQL databases such as DynamoDB
  • Experience building and maintaining high-quality, reliable ETL pipelines
Responsibilities
  • Establish and execute the strategy for the organization's ML Data Engine, with an initial focus on agile ML DataOps
  • Identify the infrastructure components and data stack to use, and design and implement pipelines between data systems and teams, automation workflows, data enrichment, and monitoring tools, all in support of AI models
  • Design and build data platforms & frameworks for processing high volumes of data, in real time as well as batch, that will be used across engineering teams
  • Build data processing streams for cleaning and modeling text data for LLMs
  • Research and evaluate new technologies in the big data space to guide continuous improvement
  • Collaborate with multi-functional teams to help tune the performance of large data applications
  • Work with Privacy and Security team on data governance, risk and compliance initiatives
  • Work on initiatives to ensure stability, performance and reliability of data infrastructure
Desired Qualifications
  • 2+ years of technical leadership experience building data engineering pipelines for AI
  • Previous experience building data pipelines for conversational AI APIs and recommender systems
  • Experience with distributed systems and microservices
  • Experience with Kubernetes and building Docker images
  • Experience building stream-processing systems using solutions such as Kafka, Storm, or Spark Streaming
  • Strong understanding of applied machine learning topics
  • Familiarity with legal compliance (using data management tools), data classification, and data retention
  • Consistent track record of managing and implementing complex data projects