Industries
Data & Analytics
AI & Machine Learning
Company Size
51-200
Company Stage
Series B
Total Funding
$103.3M
Headquarters
San Francisco, California
Founded
2015
Deepgram specializes in artificial intelligence for speech recognition, offering a set of APIs that developers can use to transcribe and understand audio content. Its technology allows clients, ranging from startups to large organizations like NASA, to process millions of audio minutes daily. Deepgram's speech recognition technology is designed to be fast, accurate, scalable, and cost-effective, making it suitable for businesses of all sizes that require large-scale audio data processing. The company operates on a pay-per-use model, where clients are charged based on the amount of audio they transcribe, so Deepgram's revenue scales with client usage. This approach positions Deepgram well to meet the growing demand for speech recognition technology.
Total Funding: $103.3M (above industry average), raised over 5 rounds
Industry standards
Comprehensive Health Plans
FSA Health Matching up to $1,000
Work from Home Ergonomic Stipend
Healthy Food & Snacks in offices
Community Groups
Unlimited Vacation
Voice AI company Deepgram has unveiled Nova-3 Medical, a next-generation speech-to-text model built specifically for the healthcare sector, claiming it delivers the most accurate real-time medical transcription on the market.
Deepgram, a voice AI platform, on Tuesday launched Aura-2, its next-generation text-to-speech (TTS) model.
Deepgram, a voice artificial intelligence platform for developers, has introduced a new AI model for the healthcare sector: Nova-3 Medical.
ElevenLabs, the highly valued AI voice cloning and generation startup founded by Palantir alumni, today launched Scribe v1, a new speech-to-text model that reportedly achieves the highest accuracy across multiple languages. According to the company’s benchmarks, it outperforms Google’s Gemini 2.0 Flash, OpenAI’s Whisper v3 and Deepgram Nova-3 in accurately converting speech into text, achieving new record-low error rates. The company claims that Scribe delivers state-of-the-art transcription accuracy in 99 languages, including improved performance in previously underserved languages such as Serbian, Cantonese and Malayalam. As Flavio Schneider, ElevenLabs lead researcher, wrote on X, Scribe is the “smartest audio understanding model” released by ElevenLabs yet. “Scribe doesn’t just transcribe — it understands audio,” Schneider continued in a thread. “It can detect non-verbal events (like laughter, sound effects, music and background noise) and analyze long audio contexts for accurate diarization, even in the most challenging environments.” “Diarization” is the name given to the process of separating speakers by their vocal qualities on a recording. In fact, ElevenLabs’ documentation states Scribe can distinguish and isolate up to 32 different speakers in the same audio file. While ElevenLabs cautions that Scribe is “best used when high-accuracy transcription is required rather than real-time transcription,” the company also plans to introduce a low-latency version soon, expanding its use for real-time applications. Lowest word error rates (WER): Scribe is designed to handle real-world audio challenges with precision