Full-Time
Posted on 2/7/2025
Automated data curation for GenAI training
$180k - $250k/yr
Company Does Not Provide H1B Sponsorship
San Carlos, CA, USA
In Person
In-office 4 days a week; relocation assistance for employees moving to the Bay Area.
DatologyAI offers automated data curation tools that optimize GenAI training by selecting high-quality, relevant data and removing noisy or harmful data. Its core technology analyzes datasets and plugs into existing training pipelines with minimal code changes, and it scales from small datasets to petabyte-scale data under usage-based pricing. The company differentiates itself through end-to-end automated curation at scale and easy integration, supported by recognized research work, contributions to ImageNet, a team with CMU PhD expertise, and immigrant-founder VC backing. Its goal is to help organizations train better AI models more efficiently and cost-effectively by ensuring high-quality data throughout the training lifecycle.
Company Size
11-50
Company Stage
Series A
Total Funding
$57.7M
Headquarters
Redwood City, California
Founded
2023
Health Insurance
Dental Insurance
Vision Insurance
401(k) Company Match
Unlimited Paid Time Off
Annual Wellness Stipend
Annual Learning and Development Stipend
Relocation Assistance
DatologyAI raises $46M to streamline AI model training data diets - SiliconANGLE
Models are what they eat. AI models trained on large-scale datasets have demonstrated jaw-dropping abilities and have the power to transform every aspect of our daily lives, from work to play. This massive leap in capabilities has largely been driven by corresponding increases in the amount of data we train models on, shifting from millions of data points several years ago to billions or trillions of data points today. As a result, these models are a reflection of the data on which they’re train
DatologyAI raises $11.65M to automate data curation for more efficient AI training.
A new startup, DatologyAI, claims to be able to automatically curate the massive data sets on which AI models train.