Internship

Machine Learning Engineer Internship

Biomedical, Remote

Posted on 12/22/2023

Hugging Face

501-1,000 employees

Develops advanced NLP models for text tasks

Compensation Overview

$3k - $4k/mo

Remote

The company has office spaces around the world, especially in the US, Canada, and Europe, but is very distributed and offers flexible working hours and remote options. Remote employees have the opportunity to visit the offices, and if needed, the company will outfit their workstation to ensure success.

Category
Natural Language Processing (NLP)
AI Research
AI & Machine Learning
Biology & Biotech
Requirements
  • Passion for open-source technology
  • Interest in making biomedical AI more accessible
  • Ability to contribute to a fast-growing ML ecosystem
Responsibilities
  • Enabling the development of biomedical AI models
  • Integrating with biomedical data providers
  • Improving support for biomedical data types
  • Building demos for biomedical models
  • Creating new visualizations for biomedical datasets
Desired Qualifications
  • Experience in the intersection of biology, health, and artificial intelligence

Hugging Face develops machine learning models that understand and generate human-like text, focusing on natural language processing (NLP). Their main products include models like GPT-2 and XLNet, which can perform tasks such as text completion, translation, and summarization. Users can access these models through a web application and a repository, making it easy to integrate AI into various applications. Unlike many competitors, Hugging Face offers a freemium model, providing basic features for free while charging for advanced functionalities and enterprise solutions tailored to large organizations. The company's goal is to empower researchers, developers, and businesses to utilize AI for text-related tasks effectively.

Company Size

501-1,000

Company Stage

Series D

Total Funding

$395.7M

Headquarters

New York City, New York

Founded

2016

Simplify Jobs

Simplify's Take

What believers are saying

  • Integration with KServe in Kubeflow 1.10 enhances model deployment efficiency.
  • Yourbench enables tailored AI model evaluations, improving enterprise-specific performance.
  • Gradio's 1 million users indicate strong community engagement and growth potential.

What critics are saying

  • Emergence of Deep Cogito's models could threaten Hugging Face's market position.
  • Meta's Llama 4 models may draw users away from Hugging Face.
  • DeepCoder-14B's performance presents competition in the coding model space.

What makes Hugging Face unique

  • Hugging Face offers state-of-the-art NLP models like GPT-2 and XLNet.
  • The company provides a freemium model with advanced features available via subscription.
  • Hugging Face's Yourbench tool allows custom AI model benchmarking for enterprises.

Benefits

Flexible Work Environment

Health Insurance

Unlimited PTO

Equity

Growth, Training, & Conferences

Generous Parental Leave

Growth & Insights and Company News

Headcount

6 month growth

3%

1 year growth

3%

2 year growth

1%
VentureBeat
Apr 10th, 2025
DeepCoder Delivers Top Coding Performance in Efficient 14B Open Model

Researchers at Together AI and Agentica have released DeepCoder-14B, a new coding model that delivers impressive performance comparable to leading proprietary models like OpenAI’s o3-mini. Built on top of DeepSeek-R1, this model gives more flexibility to integrate high-performance code generation and reasoning capabilities into real-world applications. Importantly, the teams have fully open-sourced the model, its training data, code, logs, and system optimizations, which can help researchers improve their work and accelerate progress.

Competitive coding capabilities in a smaller package: the research team’s experiments show that DeepCoder-14B performs strongly across several challenging coding benchmarks, including LiveCodeBench (LCB), Codeforces, and HumanEval+. “Our model demonstrates strong performance across all coding benchmarks… comparable to the performance of o3-mini (low) and o1,” the researchers write in a blog post that describes the model.

Interestingly, despite being trained primarily on coding tasks, the model shows improved mathematical reasoning, scoring 73.8% on the AIME 2024 benchmark, a 4.1% improvement over its base model (DeepSeek-R1-Distill-Qwen-14B). This suggests that the reasoning skills developed through RL on code can be generalized effectively to other domains. (Chart credit: Together AI.)

The most striking aspect is achieving this level of performance with only 14 billion parameters, which makes DeepCoder significantly smaller and potentially more efficient to run than many frontier models.

Innovations driving DeepCoder’s performance: while developing the model, the researchers solved some of the key challenges in training coding models using reinforcement learning (RL). The first challenge was curating the training data

Decrypt
Apr 8th, 2025
Meta Releases Much-Anticipated Llama 4 Models—Are They Truly That Amazing?

Meta unveiled its newest artificial intelligence models this week, releasing the much-anticipated Llama 4 LLM to developers while teasing a much larger model still in training. The model is state of the art, and Zuck’s company claims it can compete against the best closed-source models without the need for any fine-tuning.

“These models are our best yet thanks to distillation from Llama 4 Behemoth, a 288 billion active parameter model with 16 experts that is our most powerful yet and among the world’s smartest LLMs,” Meta said in an official announcement. “Llama 4 Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several STEM benchmarks. Llama 4 Behemoth is still training, and we’re excited to share more details about it even while it’s still in flight.”

Both Llama 4 Scout and Maverick use 17 billion active parameters per inference but differ in the number of experts: Scout uses 16, while Maverick uses 128. Both models are now available for download on llama.com and Hugging Face, with Meta also integrating them into WhatsApp, Messenger, Instagram, and its Meta.AI website. The mixture-of-experts (MoE) architecture is not new to the technology world, but it is new to Llama, and it is a way to make a model highly efficient.
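The "active parameters per inference" idea behind mixture-of-experts can be sketched in a few lines of plain Python: a router scores a pool of experts but runs only the top-k of them per input, so most of the layer's parameters sit idle on any one token. This is an illustrative toy only — the class name, the scalar "experts", and the random weights are all invented for the example, not Meta's implementation.

```python
import math
import random

class ToyMoELayer:
    """Toy mixture-of-experts layer: each "expert" is a single scalar
    weight standing in for a full feed-forward block. A router scores
    all experts but only the top-k actually run, so only a fraction of
    the total parameters are active for any one input."""

    def __init__(self, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.expert_weights = [rng.uniform(-1, 1) for _ in range(num_experts)]
        self.router_weights = [rng.uniform(-1, 1) for _ in range(num_experts)]
        self.top_k = top_k

    def forward(self, x):
        # Score every expert, then keep only the k best-scoring ones.
        scores = [w * x for w in self.router_weights]
        top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[: self.top_k]
        # Softmax over the selected scores gates each active expert's output.
        exps = [math.exp(scores[i]) for i in top]
        total = sum(exps)
        out = sum((e / total) * self.expert_weights[i] * x for e, i in zip(exps, top))
        return out, top

# Scout-style shape in miniature: 16 experts total, only a couple active.
layer = ToyMoELayer(num_experts=16, top_k=2)
y, active = layer.forward(1.5)
print(f"ran experts {sorted(active)} out of 16")
```

The efficiency win is exactly this gap between total and active parameters: a Maverick-style model can hold 128 experts' worth of capacity while paying the compute cost of only the routed few on each token.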

VentureBeat
Apr 8th, 2025
New Open Source AI Company Deep Cogito Releases First Models and They're Already Topping the Charts

Deep Cogito, a new AI research startup based in San Francisco, officially emerged from stealth today with Cogito v1, a new line of open-source large language models (LLMs) fine-tuned from Meta’s Llama 3.2 and equipped with hybrid reasoning capabilities: the ability to answer quickly and immediately, or to “self-reflect” like OpenAI’s “o” series and DeepSeek R1.

The company aims to push the boundaries of AI beyond current human-overseer limitations by enabling models to iteratively refine and internalize their own improved reasoning strategies. It is ultimately on a quest toward developing superintelligence (AI smarter than all humans in all domains), yet the company says that “All models we create will be open sourced.”

Deep Cogito’s CEO and co-founder Drishan Arora, a former senior software engineer at Google who says he led large language model (LLM) modeling for Google’s generative search product, also said in a post on X that they are “the strongest open models at their scale – including those from LLaMA, DeepSeek, and Qwen.”

The initial model lineup includes five base sizes: 3 billion, 8 billion, 14 billion, 32 billion, and 70 billion parameters, available now on the AI code-sharing community Hugging Face, on Ollama, and through application programming interfaces (APIs) on Fireworks and Together AI. They are available under the Llama licensing terms, which allow commercial usage (so third-party enterprises could put them to work in paid products) up to 700 million monthly users, at which point a paid license from Meta is required. The company plans to release even larger models, up to 671 billion parameters, in the coming months.

Arora describes the company’s training approach, iterated distillation and amplification (IDA), as a novel alternative to traditional reinforcement learning from human feedback (RLHF) or teacher-model distillation. The core idea behind IDA is to allocate more compute for a model to generate improved solutions, then distill the improved reasoning process into the model’s own parameters, effectively creating a feedback loop for capability growth. Arora likens this approach to Google AlphaGo’s self-play strategy, applied to natural language.

The Cogito models are open source and available for download via Hugging Face and Ollama, or through APIs provided by Fireworks AI and Together AI. Each model supports both a standard mode for direct answers and a reasoning mode, where the model reflects internally before responding.

Benchmarks and evaluations: the company shared a broad set of evaluation results comparing Cogito models to open-source peers across general knowledge, mathematical reasoning, and multilingual tasks. Highlights include: Cogito 3B (Standard) outperforms LLaMA 3.2 3B on MMLU by 6.7 percentage points (65.4% vs
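The amplify-then-distill loop Arora describes can be illustrated with a numerical toy: spend extra compute sampling several candidate solutions (amplification), keep the best, and pull a single "skill" parameter toward it (distillation), repeating so each round starts from a stronger base. Everything here — the function name, the Gaussian noise model, the learning rate — is an invented stand-in for intuition, not Deep Cogito's actual IDA training.

```python
import random

def amplify_and_distill(skill, n_candidates, rounds, lr=0.5, seed=42):
    """Toy loop in the spirit of iterated distillation and amplification:
    amplification spends extra compute drawing several candidate solutions,
    distillation pulls the base "skill" parameter toward the best one."""
    rng = random.Random(seed)
    for _ in range(rounds):
        # Amplification: more compute buys several attempts; keep the best.
        best = max(skill + rng.gauss(0, 1.0) for _ in range(n_candidates))
        # Distillation: only ever move toward an improvement.
        if best > skill:
            skill += lr * (best - skill)
    return skill

start = 0.0
final = amplify_and_distill(start, n_candidates=8, rounds=20)
print(f"toy skill improved from {start:.2f} to {final:.2f}")
```

The feedback-loop character is the point: because each round's distillation raises the baseline that the next round's candidates are sampled from, capability compounds, which is the analogy to AlphaGo's self-play.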

Ubuntu
Apr 8th, 2025
Announcing Charmed Kubeflow 1.10

I am very excited to see continued collaborations and new features from KServe being integrated in the Kubeflow 1.10 release, particularly the model cache feature and the integration with Hugging Face, which enables more streamlined deployment and efficient autoscaling for both predictive and generative models.

PYMNTS
Apr 7th, 2025
AI Explained: What's a Small Language Model and How Can Business Use It?

Artificial intelligence (AI) is now a household word, thanks to the popularity of large language models like ChatGPT. These large models are trained on the whole internet and often have hundreds of billions of parameters — settings inside the model that help it guess what word comes next in a sequence. The more parameters, the […]
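The idea of parameters as settings that help a model guess the next word can be shown with a toy bigram counter, where the "parameters" are simply counts of which word follows which. A real language model learns billions of continuous weights instead of integer counts; this miniature version, with names invented for the example, is only for intuition about what those settings do.

```python
from collections import defaultdict

def train_bigram(text):
    """Toy bigram "language model": its parameters are just counts of
    which word follows which — a miniature version of the settings a
    real LLM tunes to guess the next word."""
    counts = defaultdict(lambda: defaultdict(int))
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    followers = counts.get(word)
    if not followers:
        return None
    # The model's best guess is the most frequent follower.
    return max(followers, key=followers.get)

corpus = "the model guesses the next word and the next word after that"
model = train_bigram(corpus)
n_params = sum(len(f) for f in model.values())
print(predict_next(model, "the"), n_params)  # → next 9
```

Here more (and more varied) training text means more count entries, i.e. more "parameters", and better guesses — the same scaling intuition, in miniature, that the article applies to hundred-billion-parameter LLMs.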

INACTIVE