Industries
Enterprise Software
AI & Machine Learning
Company Size
201-500
Company Stage
Series D
Total Funding
$384.9M
Headquarters
New York City, New York
Founded
2016
Hugging Face develops tools and hosts machine learning models that can understand and generate human-like text, focusing on artificial intelligence and natural language processing. Its platform provides access to advanced models such as GPT-2 and XLNet, which can perform tasks like text completion, translation, and summarization. Users can access these models through a web application and a model repository, making it easy to integrate AI into different applications. Unlike many competitors, Hugging Face offers a freemium model: basic features are free, while subscription plans unlock advanced functionality. The company also tailors solutions for large organizations, including custom model training. Hugging Face's goal is to empower researchers, developers, and enterprises to use machine learning effectively for text-related tasks.
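To illustrate the kind of integration the description refers to, here is a minimal sketch of how a developer might load one of the hosted models (GPT-2 in this example) for text completion with the open-source transformers library; the prompt and generation settings are illustrative only.

```python
# Minimal sketch: pulling a hosted model from the Hugging Face Hub for text
# completion. Assumes the `transformers` package is installed.
from transformers import pipeline

# Download the GPT-2 weights from the model repository and build a
# ready-to-use text-generation pipeline.
generator = pipeline("text-generation", model="gpt2")

# Complete a prompt; max_new_tokens limits how much text is generated.
result = generator("Machine learning lets us", max_new_tokens=30)
print(result[0]["generated_text"])
```

The same pattern works for other hosted models by swapping the model identifier.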
Total Funding
$384.9M (above industry average), raised over 5 rounds
Benefits
Flexible Work Environment
Health Insurance
Unlimited PTO
Equity
Growth, Training, & Conferences
Generous Parental Leave
AI reasoning models — those that produce “chains-of-thought” in text and reflect on their own analysis to try to catch errors midstream before outputting a response to a user — are all the rage now thanks to the likes of DeepSeek and OpenAI’s “o” series. Still, it’s pretty incredible to me how quickly the reasoning-model approach has spread across the AI industry, with this week’s announcement of yet another new model to try, this one from the mysterious yet laudably principled Nous Research collective of engineers, whose entire mission since launching in New York City in 2023 has been to make “personalized, unrestricted” AI models, often by fine-tuning or retraining open-source models such as Meta’s Llama series and those from French startup Mistral.
A team of international researchers from leading academic institutions and tech companies upended the AI reasoning landscape on Wednesday with a new model that matched, and occasionally surpassed, one of China's most sophisticated AI systems: DeepSeek.

OpenThinker-32B, developed by the Open Thoughts consortium, achieved a 90.6% accuracy score on the MATH500 benchmark, edging past DeepSeek's 89.4%. The model also outperformed DeepSeek on general problem-solving tasks, scoring 61.6 on the GPQA-Diamond benchmark compared to DeepSeek's 57.6. On the LCBv2 benchmark, it hit a solid 68.9, showing strong performance across diverse testing scenarios.

In other words, it is better than a similarly sized version of DeepSeek R1 at general scientific knowledge (GPQA-Diamond). It also beat DeepSeek on MATH500 while losing on the AIME benchmarks, both of which try to measure math proficiency. It is a bit worse than DeepSeek at coding, scoring 68.9 points versus 71.2, but since the model is open source, all of these scores could improve substantially once people start building on it.

What set this achievement apart was its efficiency: OpenThinker required only 114,000 training examples to reach these results, while DeepSeek used 800,000. The OpenThoughts-114k dataset came packed with detailed metadata for each problem: ground-truth solutions, test cases for code problems, starter code where needed, and domain-specific information. Its custom Curator framework validated code solutions against test cases, while an AI judge handled math verification.

The team reported it used four nodes equipped with eight H100 GPUs each, completing training in approximately 90 hours. A separate dataset with 137,000 unverified samples, trained on Italy's Leonardo Supercomputer, burned through 11,520 A100 hours in just 30 hours. "Verification serves to maintain quality while scaling up diversity and size of training prompts," the team noted in their documentation. The research indicated that even unverified versions performed well, though they did not match the verified model's peak results.

The model was built on top of Alibaba's Qwen2.5-32B-Instruct LLM and supports a modest 16,000-token context window — enough to handle complex mathematical proofs and lengthy coding problems, but a lot less than current standards. This release arrives amid intensifying competition in AI reasoning capabilities, which seems to be happening at the speed of thought.
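Because the model and its training set are released openly on the Hugging Face Hub, a developer can pull both down with the standard transformers and datasets libraries. The sketch below is illustrative only, and the repository IDs (open-thoughts/OpenThinker-32B and open-thoughts/OpenThoughts-114k) are assumptions based on the names in the article, not confirmed by it.

```python
# Minimal sketch: loading the open model and dataset from the Hugging Face Hub.
# Repository IDs are assumptions based on the names mentioned above.
# Note: a 32B-parameter model needs several high-memory GPUs, and device_map="auto"
# requires the `accelerate` package.
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset

model_id = "open-thoughts/OpenThinker-32B"      # assumed repo ID
dataset_id = "open-thoughts/OpenThoughts-114k"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Inspect the reasoning-training data (ground-truth solutions, test cases, etc.).
dataset = load_dataset(dataset_id, split="train")
print(dataset[0])
```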
Artificial intelligence agents can transform a small staff into a productive workforce that tackles tasks typically handled by larger teams. While most people are familiar with AI chatbots like ChatGPT, Perplexity AI, or Claude, AI agents are the next level for workplace productivity. Like chatbots, they provide information, but they also carry out the task itself, and they can manage other AI agents in more complex workflows, as sketched below.
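To make the chatbot-versus-agent distinction concrete, here is a minimal, hypothetical sketch of an agent loop in Python. The `call_llm` function and the calculator tool are stand-ins, not any specific product's API; the point is that an agent chooses an action, executes it, and feeds the result back until the task is done.

```python
# Hypothetical agent-loop sketch: the "LLM" and the tool are placeholders,
# not a real product API. An agent differs from a chatbot by taking actions
# (calling tools) and looping until the task is finished.

def call_llm(history):
    """Stand-in for a chat-model call. This fake version answers arithmetic
    tasks by delegating once to the calculator tool, then finishing."""
    if history[-1]["role"] == "user":
        return {"action": "calculator", "input": history[-1]["content"]}
    return {"action": "finish", "output": history[-1]["content"]}

def calculator(expression: str) -> str:
    """Toy tool: evaluate a simple arithmetic expression."""
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"calculator": calculator}

def run_agent(task: str, max_steps: int = 5) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_llm(history)                # model decides what to do next
        if reply["action"] == "finish":
            return reply["output"]               # task complete: final answer
        observation = TOOLS[reply["action"]](reply["input"])  # execute the tool
        history.append({"role": "tool", "content": observation})  # feed result back
    return "gave up after max_steps"

print(run_agent("6 * 7"))  # -> "42"
```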
The Hugging Face Xet team is developing a new system to optimize file transfers for AI model repositories through an innovative approach to content-defined chunking (CDC).
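Content-defined chunking splits a file at boundaries derived from the data itself (for example, wherever a rolling hash of recent bytes matches a target bit pattern), so an insertion early in a file only changes the chunks near the edit instead of shifting every fixed-size block. The sketch below illustrates that general idea and is not the Xet team's implementation; the window behavior, mask, and size limits are arbitrary choices.

```python
# Generic content-defined chunking sketch (not Hugging Face Xet's implementation).
# Boundaries occur where a gear-style rolling hash matches a bit mask, so chunk
# edges depend on the content itself rather than on fixed offsets.
import random

random.seed(0)  # reproducible gear table: one pseudo-random value per byte
GEAR = [random.getrandbits(32) for _ in range(256)]

MASK = (1 << 13) - 1                      # boundary when low 13 bits are zero (~8 KiB avg)
MIN_SIZE, MAX_SIZE = 2 * 1024, 64 * 1024  # guard rails on chunk size (arbitrary)

def chunk(data: bytes):
    """Split data into content-defined chunks."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # The left shift pushes older bytes out of the hash, so only a short
        # window of recent bytes influences the boundary decision.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])       # trailing bytes form the final chunk
    return chunks
```

Because unchanged regions keep producing the same chunks across file versions, a transfer system only needs to upload the chunks that actually changed, which is the property that makes this approach attractive for large model repositories.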
Salesforce, in collaboration with Hugging Face, Cohere, and Carnegie Mellon University, has announced the release of the AI Energy Score, a benchmarking tool that lets AI developers and users evaluate, identify, and compare the energy consumption of AI models.
Find jobs on Simplify and start your career today