Full-Time

Member of Technical Staff: Data Acquisition

Crawler, Engineer

Essential AI

11-50 employees

AI and machine learning model development

AI & Machine Learning

Senior

San Francisco, CA, USA

Required Skills
Kubernetes
Rust
Docker
AWS
Google Cloud Platform
Requirements
  • Previous large-scale web crawling experience
  • Minimum of 5 years of experience in data-intensive applications and distributed systems
  • Proficiency in a high-performance programming language such as Go, Rust, or C++
  • Strong understanding of containerization and orchestration frameworks such as Docker and Kubernetes
  • Experience building on GCP or AWS services
  • Bonus: Deep expertise working with headless browsers and Chrome DevTools Protocol
  • Bonus: Curiosity to learn and develop an understanding of how data sources and data quality affect LLM capabilities
Responsibilities
  • Architect and build a large-scale distributed web crawler system
  • Design and implement web crawlers and scrapers to automatically extract data from websites
  • Develop data acquisition pipelines to ingest, transform, and store large volumes of data
  • Develop a highly scalable system and optimize crawler performance
  • Monitor and troubleshoot crawler activities to detect and resolve issues promptly
  • Work closely with the data infrastructure team and data researchers to improve the quality of the data

This company excels in developing machine learning and artificial intelligence models tailored for everyday applications, making critical technology accessible in multiple languages. It stands out for its commitment to enhancing language translation capabilities, thereby promoting inclusiveness and accessibility in global communications. Working here offers a chance to be at the forefront of AI technology while contributing to solutions that bridge language barriers worldwide.

Company Stage

Series A

Total Funding

$64.5M

Headquarters

San Francisco, California

Founded

2023

Growth & Insights
Headcount

6 month growth

70%

1 year growth

70%

2 year growth

70%