Internship
Develops advanced AI and NLP models
No salary listed
New York, NY, USA
Candidates must be based in the United States.
Hugging Face builds tools and hosts machine learning models focused on understanding and generating human-like text. Its platform provides access to natural language processing (NLP) models such as GPT-2 and XLNet, which can perform tasks like text completion, translation, and summarization. Users can access these models through a web application and a model repository, making it straightforward for researchers, developers, and businesses to integrate AI into their applications. Hugging Face operates on a freemium model, providing basic features for free while charging for advanced functionality and enterprise solutions tailored to larger organizations. The company's goal is to let clients apply machine learning to text-related tasks, adding sophisticated language capabilities to their applications.
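As context for how such hosted models are typically consumed, here is a minimal sketch, assuming the open-source transformers library and the publicly hosted gpt2 checkpoint on the Hugging Face Hub; it is an illustration, not part of the job description.

```python
# Minimal sketch: pulling a hosted text-generation model from the Hugging Face
# Hub with the transformers library. "gpt2" is the public GPT-2 checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

result = generator("Machine learning helps applications", max_new_tokens=30)
print(result[0]["generated_text"])
```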
Company Size
501-1,000
Company Stage
Series D
Total Funding
$395.7M
Headquarters
New York City, New York
Founded
2016
Flexible Work Environment
Health Insurance
Unlimited PTO
Equity
Growth, Training, & Conferences
Generous Parental Leave
In the wake of Pope Francis's death, the College of Cardinals is already preparing to gather for the conclave next week to hold what could be the most important event in the Catholic religion for the next couple of decades: picking a new pope.

During these days, Catholics pray for the Holy Spirit to give their leaders enough understanding and light to set all differences aside and pick the best candidate to lead the Church, effectively merging the spiritual and terrestrial realms in what has historically also been a very political decision.

So who is the odds-on favorite to be the next pope?

We queried 13 of the world's most advanced artificial intelligence models and tasked them with evaluating and predicting which cardinal is best positioned to lead the Catholic Church into its next chapter. (Though popes don't have to be chosen from the ranks of cardinals, this has historically been the case since the 14th century, the movie "Conclave" notwithstanding.)

Despite varying methodologies and perspectives, the AI systems largely concluded that Cardinal Luis Antonio Tagle of the Philippines would not only be the best candidate to guide the Church through its current challenges, but that he would indeed be the next pope.

Interestingly, that pick differs from the leading prediction markets, including Polymarket, Kalshi, and Myriad (disclosure: Myriad comes from Decrypt's parent company, DASTAN). While Tagle is a strong contender in those markets, human bettors were predicting that Cardinal Pietro Parolin, or someone else entirely, would be the next pope.

This digital "conclave" offers an interesting window into how different analytical systems process the same complex question, and how they can arrive at remarkably similar conclusions even when approaching the problem from distinct angles.

The 'Digital College of Cardinals' picks Tagle

The analysis involved thirteen top-tier AI models: Claude, GPT-4o, GPT-4.5, Perplexity, Mistral, Meta AI, Grok-3, Gemini, Qwen 2.5 Max, You.com Research, DeepSeek R1, Microsoft Copilot, and an open-source deep research agent, representing the most advanced large language models currently available. Each agent can browse the web for information and offers reasoning capabilities, a deep research mode, or both.

We started with one base prompt: "Act as an expert in theology, Catholic geopolitics, and modern Catholicism. Evaluate all the options for the next pope and predict who will be the next pope and why. Also include who could be the best pope and why." This base prompt was further enhanced using the "concept elevation" technique, which makes a prompt detailed enough for an AI agent to execute its task more accurately.
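As a rough illustration of the setup described above, here is a minimal sketch that sends the article's base prompt to a couple of models through the OpenAI Python SDK; the model names are stand-ins rather than the article's full 13-assistant lineup, and the "concept elevation" expansion of the prompt is not reproduced.

```python
# Illustrative sketch only: send the same base prompt to several models.
# The article queried 13 assistants across multiple vendors; the two
# OpenAI-hosted models below are stand-ins, not the full lineup.
from openai import OpenAI

BASE_PROMPT = (
    "Act as an expert in theology, Catholic geopolitics, and modern Catholicism. "
    "Evaluate all the options for the next pope and predict who will be the next "
    "pope and why. Also include who could be the best pope and why."
)

client = OpenAI()  # expects OPENAI_API_KEY in the environment

for model in ("gpt-4o", "gpt-4o-mini"):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": BASE_PROMPT}],
    )
    print(f"--- {model} ---")
    print(response.choices[0].message.content)
```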
French AI startup Pleias made waves late last year with the launch of its ethically trained Pleias 1.0 family of small language models, among the first and only to date built entirely on scraped "open" data, that is, data explicitly labeled as public domain, open source, or unlicensed and not copyrighted.

Now the company has announced the release of two open-source, small-scale reasoning models designed specifically for retrieval-augmented generation (RAG), citation synthesis, and structured multilingual output. The launch includes two core models, Pleias-RAG-350M and Pleias-RAG-1B, each also available in a CPU-optimized GGUF format, for a total of four deployment-ready variants. They are all based on Pleias 1.0 and can be used independently or alongside other LLMs that an organization already uses or plans to deploy.
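A minimal sketch of how one of these small models might be loaded for a RAG-style answer with transformers follows; the Hub repository id and the prompt layout are assumptions, so check Pleias's published model cards for the actual names and expected citation format.

```python
# Sketch only: the repo id below is an assumed placeholder for the published
# Pleias-RAG-350M checkpoint, and the prompt layout is simplified.
from transformers import pipeline

generator = pipeline("text-generation", model="PleIAs/Pleias-RAG-350M")  # assumed repo id

question = "What does the source say about openly licensed training data?"
sources = "Source 1: <retrieved passage from your own document store>"
prompt = f"Question: {question}\nSources:\n{sources}\nGrounded answer with citations:"

print(generator(prompt, max_new_tokens=200)[0]["generated_text"])
```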
In brief:
Open-source models are proving capable of generating consistent videos that last minutes, challenging state-of-the-art closed alternatives.
SkyReels-V2 breaks video length barriers with its "diffusion forcing framework," which enables infinite-duration AI video generation while maintaining consistent quality throughout.
FramePack brings long-form AI video generation to consumer hardware, requiring only 6 GB of VRAM to create minute-long videos at 30 fps by cleverly compressing older frames.

Open-source video generators are heating up and giving closed-source behemoths a run for their money. They are more customizable, less restricted, even uncensored, free to use, and now producing high-quality videos, with three models (Wan, Mochi, and Hunyuan) ranking among the top 10 of all AI video generators.

The latest breakthrough comes in extending video duration beyond the typical few seconds, with two new models demonstrating the ability to generate content lasting minutes instead of seconds. SkyReels-V2, released this week, claims it can generate scenes of potentially infinite duration while maintaining consistency throughout, and FramePack gives users with lower-end hardware the ability to create long videos without burning out their PCs.

SkyReels-V2: Infinite Video Generation

SkyReels-V2 represents a significant advance in video generation technology, tackling four critical challenges that have limited previous models. Its developers describe the system, which combines multiple AI technologies, as an "Infinite-Length Film Generative Model."

The model achieves this through what its developers call a "diffusion forcing framework," which allows seamless extension of video content without explicit length constraints. It works by conditioning on the last frames of previously generated content to create new segments, preventing quality degradation over extended sequences. In other words, the model looks at the final frames it just created to decide what comes next, ensuring smooth transitions and consistent quality. Losing that coherence is the main reason video generators tend to stick to short clips of around 10 seconds.

The results are impressive. Videos uploaded to social media by developers and enthusiasts show that the model stays coherent and the images do not lose quality. Subjects remain identifiable throughout long scenes, and backgrounds do not warp or introduce artifacts that could damage the scene.

SkyReels-V2 incorporates several innovative components, including a new captioner that combines knowledge from general-purpose language models with specialized "shot-expert" models to ensure precise alignment with cinematic terminology.
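To make the rolling-extension idea concrete, here is an illustrative sketch, not SkyReels-V2's actual API: generate_segment and its arguments are hypothetical placeholders for a diffusion video model that can be conditioned on a handful of previously generated frames.

```python
# Hypothetical sketch of diffusion-forcing-style extension; model.generate_segment
# is a placeholder, not a real SkyReels-V2 call.
def generate_long_video(model, prompt, num_segments, overlap_frames=8):
    frames = []
    context = None  # the first segment is conditioned on the text prompt alone
    for _ in range(num_segments):
        segment = model.generate_segment(prompt=prompt, condition_frames=context)
        frames.extend(segment)
        # Condition the next segment on the tail of what was just produced,
        # which is what keeps subjects and backgrounds consistent over minutes.
        context = segment[-overlap_frames:]
    return frames
```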
In brief:
Tiny, open-source AI model Dia-1.6B claims to beat industry giants like ElevenLabs and Sesame at emotional speech synthesis.
Creating convincing emotional AI speech remains challenging due to the complexity of human emotions and technical limitations.
While it matches up well against the competition, the "uncanny valley" problem persists: AI voices sound human but fail at conveying nuanced emotions.

Nari Labs has released Dia-1.6B, an open-source text-to-speech model that claims to outperform established players like ElevenLabs and Sesame in generating emotionally expressive speech. The model is tiny, with just 1.6 billion parameters, but it can still create realistic dialogue complete with laughter, coughs, and emotional inflections. It can even scream in terror.

"We just solved text-to-speech AI. This model can simulate perfect emotion, screaming and show genuine alarm. — clearly beats 11 labs and Sesame — it's only 1.6B params — streams realtime on 1 GPU — made by a 1.5 person team in Korea!! It's called Dia by Nari Labs," posted Deedy (@deedydas) on April 22, 2025.

While that might not sound like a huge technical feat, even OpenAI's ChatGPT is flummoxed by it: "I can't scream but I can definitely speak up," its chatbot replied when asked. Some AI models can scream if you ask them to, but it is not something that happens naturally or organically, which, apparently, is Dia-1.6B's superpower.
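A rough usage sketch follows, assuming the project's published Python package; the import path, loader, speaker tags like [S1]/[S2], and non-verbal cues such as (laughs) are taken from Nari Labs' examples as best understood here and may differ, so treat every name below as an assumption and check the repository.

```python
# Assumed API based on Nari Labs' published examples; names may differ.
import soundfile as sf
from dia.model import Dia  # assumed import path

model = Dia.from_pretrained("nari-labs/Dia-1.6B")  # assumed checkpoint id

# Speaker tags ([S1], [S2]) and non-verbal cues like (laughs) are the kind of
# script-level controls the article describes.
script = "[S1] Did you hear the model can scream? [S2] No way. (laughs)"
audio = model.generate(script)

sf.write("dialogue.wav", audio, 44100)  # sample rate assumed
```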
A two-person startup by the name of Nari Labs has introduced Dia, a 1.6-billion-parameter text-to-speech (TTS) model designed to produce naturalistic dialogue directly from text prompts, and one of its creators claims it surpasses the performance of competing proprietary offerings from the likes of ElevenLabs and Google's hit NotebookLM AI podcast generation product. It could also threaten uptake of OpenAI's recent gpt-4o-mini-tts.

"Dia rivals NotebookLM's podcast feature while surpassing ElevenLabs Studio and Sesame's open model in quality," said Toby Kim, one of the co-creators of Nari and Dia, in a post from his account on the social network X.

In a separate post, Kim noted that the model was built with "zero funding," and added across a thread: "…we were not AI experts from the beginning. It all started when we fell in love with NotebookLM's podcast feature when it was released last year. We wanted more—more control over the voices, more freedom in the script. We tried every TTS API on the market."