Deepgram

Speech recognition APIs for audio transcription

About Deepgram

Simplify's Rating
Why Deepgram is rated B
Rated C on Competitive Edge
Rated A on Growth Potential
Rated B on Differentiation

Industries

Data & Analytics

AI & Machine Learning

Company Size

51-200

Company Stage

Series B

Total Funding

$103.3M

Headquarters

San Francisco, California

Founded

2015

Overview

Deepgram builds speech recognition AI and offers a set of APIs that developers use to transcribe and understand audio content. Its clients range from startups to large organizations such as NASA, and the technology lets them process millions of audio minutes daily. The speech recognition technology is designed to be fast, accurate, scalable, and cost-effective, which makes it suitable for businesses of any size that need large-scale audio processing. The company operates on a pay-per-use model: clients are charged based on the amount of audio they transcribe, so Deepgram's revenue scales with client usage. This approach positions Deepgram to benefit from growing demand for speech recognition technology.
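As a rough illustration of how pay-per-use billing works, the sketch below computes a charge from the amount of audio transcribed. The per-minute rate and the function name `usage_charge` are assumptions made up for this example; they are not Deepgram's actual pricing or API.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical rate for illustration only; not Deepgram's actual pricing.
ASSUMED_RATE_PER_MINUTE = Decimal("0.005")


def usage_charge(audio_seconds: int,
                 rate_per_minute: Decimal = ASSUMED_RATE_PER_MINUTE) -> Decimal:
    """Return the charge in dollars for a given amount of transcribed audio."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    minutes = Decimal(audio_seconds) / Decimal(60)
    # Round to whole cents, with half a cent rounding up.
    return (minutes * rate_per_minute).quantize(Decimal("0.01"),
                                                rounding=ROUND_HALF_UP)


# One million audio minutes at the assumed rate.
print(usage_charge(1_000_000 * 60))  # prints 5000.00
```

Because the charge is a linear function of minutes processed, revenue under this kind of model rises and falls directly with client usage.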

YC Company

Simplify's Take

What believers are saying

  • Deepgram's pay-per-use model aligns revenue with client usage, supporting scalability.
  • The launch of Nova-3 Medical positions Deepgram as a leader in healthcare transcription.
  • Deepgram's Aura-2 TTS model outperforms competitors, enhancing its market position.

What critics are saying

  • ElevenLabs' Scribe model outperforms Deepgram's Nova-3 in accuracy.
  • xAI's Grok-3 model shows superior AI benchmark performance, posing competitive threats.
  • Gladia's focus on real-time processing could challenge Deepgram's market position.

What makes Deepgram unique

  • Deepgram offers APIs for speech-to-text, text-to-speech, and language understanding.
  • Deepgram's Nova-3 Medical model is tailored for real-time medical transcription.
  • Deepgram's Shortcut provides on-device AI assistant capabilities for personalized user experiences.

Funding

Total Funding

$103.3M

Above Industry Average

Funded Over

5 Rounds

Notable Investors:
Series B funding is typically for startups that have proven their business model and need more capital to expand rapidly, often by entering new markets or adding products. Investors are usually venture capital firms that specialize in later-stage investments.

Series B Funding Comparison: Above Average

  • Industry standard: $35M
  • Linktree: $45M
  • Deepgram: $47M
  • Substack: $65M
  • ClickUp: $100M

Benefits

Comprehensive Health Plans

FSA Health Matching up to $1,000

Work from Home Ergonomic Stipend

Healthy Food & Snacks in offices

Community Groups

Unlimited Vacation

Growth & Insights and Company News

Headcount

6 month growth

0%

1 year growth

0%

2 year growth

2%

MedCloud Insider
Apr 16th, 2025
Deepgram Unveils Nova-3 Medical, Advanced Speech AI for Healthcare

Voice AI company Deepgram has unveiled Nova-3 Medical, a next-generation speech-to-text model built specifically for the healthcare sector, claiming it delivers the most accurate real-time medical transcription on the market.

Analytics India Magazine
Apr 15th, 2025
Deepgram's New Text-to-Speech AI Model Outperforms ElevenLabs and OpenAI

Deepgram, a voice AI platform, on Tuesday launched Aura-2, its next-generation text-to-speech (TTS) model.

Cryptonomist
Mar 5th, 2025
AI in healthcare: the new Nova-3 Medical for reliable transcription in the sector

Deepgram, a voice artificial intelligence platform for developers, has introduced its new AI model for the healthcare sector: Nova-3 Medical.

VentureBeat
Feb 26th, 2025
ElevenLabs' New Speech-To-Text Model Scribe Is Here With Highest Accuracy Rate So Far (96.7% For English)

ElevenLabs, the highly valued AI voice cloning and generation startup founded by Palantir alumni, today launched Scribe v1, a new speech-to-text model that reportedly achieves the highest accuracy across multiple languages.

According to the company’s benchmarks, it outperforms Google’s Gemini 2.0 Flash, OpenAI’s Whisper v3 and Deepgram Nova-3 in accurately converting spoken speech into text, achieving new record-low error rates.

The company claims that Scribe delivers state-of-the-art transcription accuracy in 99 languages, including improved performance in previously underserved languages such as Serbian, Cantonese and Malayalam.

As Flavio Schneider, ElevenLabs lead researcher, wrote on X, Scribe is the “smartest audio understanding model” released by ElevenLabs yet.

“Scribe doesn’t just transcribe - it understands audio,” Schneider continued in a thread. “It can detect non-verbal events (like laughter, sound effects, music and background noise) and analyze long audio contexts for accurate diarization, even in the most challenging environments.”

“Diarization” is the name given to the process of separating speakers by their vocal qualities on a recording.

In fact, ElevenLabs’ documentation states Scribe can distinguish and isolate up to 32 different speakers in the same audio file. While ElevenLabs cautions that Scribe is “best used when high-accuracy transcription is required rather than real-time transcription,” the company also plans to introduce a low-latency version soon, expanding its use for real-time applications.

Lowest word error rates (WER)

Scribe is designed to handle real-world audio challenges with precision
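
For readers unfamiliar with the metric, word error rate is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a minimal, generic implementation of that standard definition. It is not the benchmark code used by ElevenLabs or Deepgram, and the normalization (lowercasing and whitespace splitting) is an assumption of this example. Accuracy figures are often quoted as 1 minus WER, though the excerpt does not say how the 96.7% figure was derived.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Simplistic normalization, assumed for this example only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of five reference words.
print(word_error_rate("the patient has a fever", "the patient has fever"))  # prints 0.2
```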
