
Sanas.ai provides real-time accent modification for audio communication through a locally installed SaaS app. The software runs on the user’s device and processes speech in real time to adjust the speaker’s accent during live conversations, with customizable settings. It distinguishes itself by performing on-device processing to emphasize data security and user privacy, avoiding cloud-based data storage. The goal is to reduce misunderstandings and improve communication efficiency for individuals and teams that regularly interact across languages.
Industries
Consumer Software
Enterprise Software
AI & Machine Learning
Company Size
201-500
Company Stage
Series B
Total Funding
$117.2M
Headquarters
Mountain View, California
Founded
2020
Total Funding
$117.2M (above industry average), funded over 4 rounds
Industry standards
Company Equity
Sanas, a speech AI platform for enterprise communication, has launched real-time language translation and upgraded speech enhancement capabilities. The platform now supports speech-to-speech translation across more than 13 languages whilst preserving vocal identity and tone. Founded in 2021, Sanas serves enterprises including Carelon, Cigna, Comcast, Huntington, Robinhood, UnitedHealth Group, Vanguard and Wyndham across more than 100 countries. The company has grown from $0 to $60 million in annual recurring revenue since launching commercially in 2023 and is targeting $120 million. The platform uses on-device AI processing and offers containerised deployment options for data security. Sanas holds HITRUST i1 certification and does not monitor, record or store call data, addressing privacy requirements for regulated industries.
Atento, a global customer experience outsourcing provider, has announced a strategic collaboration with Sanas and Thrivin to advance AI-augmented customer experience and impact sourcing at scale. The partnership integrates Atento's global governance and delivery framework with Sanas' real-time speech understanding technology and Thrivin's quality-first impact sourcing platform in Kenya. Sanas' AI technology will be deployed for non-US and non-Puerto Rico voice operations, whilst Thrivin provides access to highly educated, English-proficient African talent. Atento will orchestrate the end-to-end model, ensuring security, compliance and performance management meet enterprise standards. The collaboration aims to enable global expansion whilst maintaining operational quality and creating new opportunities in African markets. The initiative reflects Atento's vision that the future of customer experience must be "Augmented by AI. Driven by People."
Top takeaways from Speech Clarity: The Strategic Layer Most Enterprises Overlook

Enterprise CX leaders are used to diagnosing performance issues through familiar lenses: training gaps, process inefficiencies, or staffing challenges. But a new channel brief from Metric Sherpa, in collaboration with Sanas, argues that one foundational factor is consistently overlooked: whether two people can clearly understand each other in the moment. For leaders looking to drive faster resolution, stronger customer satisfaction, and more reliable automation outcomes, this whitepaper lays out why speech clarity is an often-overlooked lever that delivers results without changing scripts, workflows, or how agents speak. Read on for highlights, then download Speech Clarity: The Strategic Layer Most Enterprises Overlook for the full framework, data, and recommendations.

Takeaway 1: Speech Clarity is emerging as enterprise infrastructure

Here's a pattern that shows up across every type of operation: clarity breaks first, and the organization feels the impact long after the damage begins. Leaders often describe symptoms that feel all too familiar: long handle times, unexpected escalations, AI tools that never quite deliver, and teams that burn out faster than expected. But these symptoms are rarely traced back to their root cause. As the whitepaper puts it: "Most leaders point to training, process, or staffing. They rarely look at the one factor that drives all of it: whether two people can clearly understand each other in the moment."

Takeaway 2: the hidden costs of poor clarity are significant (and largely unmeasured)

* Handle Time Slows When Comprehension Requires Extra Effort. Missed words, noisy environments, and accent variation all demand extra decoding effort. Every repeated phrase increases cost and load.
* Misheard Intent Quickly Shifts the Direction of an Interaction. A single misinterpreted phrase can change tone, prompt a transfer, or extend workload. As the research notes: "Misunderstanding always precedes escalation. They just never see it in the data."
* AI and Automation Degrade When Audio Inputs Are Unreliable. Low-fidelity audio corrupts transcripts and weakens intent signals, affecting agent assist, routing, and every downstream model.

These friction points rarely show up in dashboards or QA reports, but they shape performance every single day.

Takeaway 3: three shifts have made clarity a strategic requirement

Three shifts have turned clarity from an operational afterthought into a strategic must-have:

* Speaking Environments Became More Complex. Hybrid work, global delivery models, multilingual customers, and wide variation in accents and acoustic conditions have outpaced legacy voice infrastructure.
* AI Success Now Depends on Input Quality. When audio quality fluctuates, transcripts degrade, intent signals weaken, and automation reliability collapses.
* Customers Expect Instant, Accurate Understanding. Any delay or misunderstanding signals friction and erodes confidence - regardless of accent, emotion, or language.

The whitepaper puts it best: "When transformation initiatives miss their targets, leaders should examine the conversations feeding them."

Takeaway 4: what changes when clarity improves

When clarity friction disappears, performance improves quickly - and predictably. Metric Sherpa's operational reviews show a consistent pattern across industries. The whitepaper highlights several outcomes, including:

* Faster Resolution: Agents stop repeating, clarifying, and decoding. Conversations move forward without changes to scripts, policies, or workflows.
* Stronger Customer Confidence: Effortless understanding reduces tension and builds trust. Customers sense competence and control when they feel understood the first time.
* More Capacity and Accuracy for Agents: Lower cognitive load improves focus and execution. Agents make fewer errors, manage complexity more effectively, and sustain performance across shifts.
* Higher AI Performance: Clean, stable speech inputs improve transcription accuracy, intent detection, routing logic, and automation outcomes across the stack.

The good news is that these outcomes appear early and compound over time.

Takeaway 5: the Clarity Readiness Model - A four-stage framework

To help enterprises assess their current state, the whitepaper introduces a Clarity Readiness Model with four stages. Most organizations are still in the early stages, where clarity is reactive, fragmented, or unmeasured. The full framework includes guidance on how to recognize each stage and what it takes to move forward.

The bottom line: this is about performance, not just audio quality

The Metric Sherpa whitepaper makes a compelling case: speech clarity is the hidden layer shaping performance across CX, workforce operations, and AI systems - yet it remains largely unexamined. When clarity breaks, organizations feel the symptoms but rarely trace them back to the root cause. Get your copy of the full whitepaper for detailed guidance on diagnosing clarity friction in your organization, using the Clarity Readiness Model to assess your current state, and understanding where clarity infrastructure is headed over the next two to three years.
Meet Sanas Accent Translation 4.5: Ultra-Fidelity, increased intelligibility, clearer speech

Sanas designs speech AI around a simple principle: technology should adapt to humans, not the other way around. That principle drove the development of Accent Translation 4.0, the first model to demonstrate that real-time accent translation could make English speech more intelligible than the original audio on Automatic Speech Recognition (ASR) systems. Building on that success, Sanas expanded its capabilities to support a wider global audience with new input accents from Africa & the Middle East and a new British output accent, bringing even more voices into the conversation.

But Sanas didn't stop at global expansion; it kept raising the bar on quality. It then introduced Speech Enhancement 1.0, with 24 kHz Ultra-Fidelity audio capturing the warmth, texture, and presence that transcend standard telephony. Then its customers asked a critical question: "Can we have both Accent Translation and Ultra-Fidelity audio together?" With Accent Translation 4.5, the answer is yes.

Built on its 24 kHz Ultra-Fidelity architecture, Sanas Accent Translation 4.5 delivers accent translation with unprecedented clarity, naturalness, and acoustic detail - bringing the benefits of Speech Enhancement and Accent Translation into a single system. In this article, you'll hear side-by-side audio comparisons, see how Accent Translation 4.5 improves naturalness and intelligibility over Accent Translation 4.0, and learn about the technical breakthroughs that make 24 kHz accent translation possible in real time.

What's new in Accent Translation 4.5?

With Accent Translation 4.5, Sanas is introducing a set of upgrades that dramatically elevate speech quality across fidelity, naturalness, and intelligibility. Here's what's new:

* 24 kHz Ultra-Fidelity Audio: A major leap in audio resolution that delivers richer, fuller voice quality.
This upgrade captures high-frequency harmonics like crisp fricatives and subtle breath sounds that standard 8 kHz low-fidelity and even 16 kHz high-fidelity audio can't reproduce.
* Enhanced Naturalness and Stability: Refined speech synthesis techniques make voices sound even more natural, less synthetic, and more stable across long utterances and complex phrases.
* Superior Intelligibility for Challenging Inputs: Targeted model enhancements and expanded training data deliver significant intelligibility gains, especially for Latin American, African, and Middle Eastern input accents.

Why 24 kHz matters: beyond wideband

TLDR: While most communication tools cap audio quality at 8 kHz or 16 kHz, Sanas' latest AT model upgrades it to 24 kHz Ultra-Fidelity, capturing the full richness of the human voice.

For decades, contact centers and enterprise communication platforms have been constrained by narrowband (8 kHz) and wideband (16 kHz) audio limits. While functional, these limits cut off the upper frequencies of the human voice responsible for clarity, articulation, and vocal nuance. These bandwidth ceilings blur consonants, flatten the crisp "s" sounds, and remove the breath of a laugh and other subtle cues that make speech sound natural and easy to understand.

Sanas Accent Translation 4.5 shatters this ceiling. By outputting speech at 24 kHz, Sanas preserves the high-frequency spectrum that gives a voice its full presence, texture, and expressiveness. The model takes standard low-fidelity (8-16 kHz) input and intelligently reconstructs the missing upper-band frequencies to produce pristine 24 kHz audio - quality that becomes immediately audible in the side-by-side examples included in this article.

How does Sanas restore sound that isn't there? It works by analyzing the harmonic structure of human speech.
Since the "missing" high frequencies share a predictable relationship with the lower tones that are present in the input, Sanas' algorithm uses the input as a blueprint to mathematically predict and regenerate the upper spectrum - effectively filling in the details that standard compression wiped away.

* Crystal Clear Fricatives: Sounds like f, s, sh, and th are often lost in lower bandwidths, leading to confusion. 24 kHz renders these with precision.
* Reduced Listening Fatigue: Higher-resolution audio is easier for the brain to process, improving comfort and comprehension during long conversations.
* Future-Proofing Your Voice Stack: As communication platforms move toward HD audio, your accent translation pipeline is already optimized for next-gen audio.

Ready to hear the difference for yourself? Listen to the differences in sampling rates between the 8 kHz, 16 kHz, and 24 kHz examples included below.

The best just got better: Accent Translation 4.5 vs. 4.0

When Sanas launched Accent Translation 4.0 three months ago, it redefined what real-time accent translation could achieve, proving that translated English speech could be more intelligible than the original audio. Accent Translation 4.5 builds on this foundation with meaningful upgrades in naturalness, intelligibility, and acoustic richness that you can both hear and measure.

1. More natural, full-spectrum audio quality

TLDR: While Accent Translation 4.0 was smooth, Accent Translation 4.5 brings out full, natural human clarity.

Accent Translation 4.5 produces speech with greater fullness, harmonic detail, and vocal presence. In blind A/B tests conducted with independent United States listeners, Accent Translation 4.5 was preferred over 4.0 in 63.18% of trials for naturalness, driven by improvements in "fullness" and "richness" in the voice.
By capturing the full harmonic structure of speech, Sanas ensures that the translated accent doesn't just sound more articulate; it sounds lifelike, expressive, and true to the original voice. Hear how the upgrades enhance depth and detail across male and female speakers with a range of input accents.

[Audio comparison: Original | Accent Translation 4.0 | Accent Translation 4.5]

2. Higher intelligibility across more accents

TLDR: Smarter models and more data mean better understanding.

Accent Translation 4.0 set a high bar by reducing the Word Error Rate (WER) across multiple accents in English. Accent Translation 4.5 pushes this even further with architectural improvements and a significantly expanded, more diverse training dataset. Improving intelligibility isn't just about audio resolution; it's about the intelligence behind the model.

Using a state-of-the-art ASR model on a comprehensive evaluation dataset, Accent Translation 4.5 demonstrates a 16.6% overall relative WER improvement over Accent Translation 4.0. The largest gains appear in regions where speech patterns include tonal, rapid, or phonetically complex elements: Middle East (ME) 29.4% and African (AFR) 23.3% relative improvement.

Lowering WER isn't just a technical benchmark; it addresses a long-standing issue in speech technology. Historically, Automatic Speech Recognition (ASR) systems have shown measurable bias against non-native and regional accents in English, leading to higher error rates for millions of speakers across CX, AI agents, transcription, healthcare, and more. By improving intelligibility for accented speakers - especially those historically underserved by ASR systems - Accent Translation 4.5 helps reduce these inequities and strengthens every system that depends on accurate speech understanding.
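For readers unfamiliar with the metric, here is a minimal sketch of how WER and a relative WER improvement are commonly computed. It assumes the standard definition (word-level edit distance divided by the number of reference words); the article does not describe Sanas' actual evaluation pipeline, so the function names are hypothetical. The absolute WER values 0.100 and 0.0834 are illustrative numbers chosen only so that the arithmetic lands on the quoted 16.6%. They are not reported figures. Treating the article's "Original" transcript as the reference is likewise an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_improvement(wer_old: float, wer_new: float) -> float:
    """Fraction by which WER dropped, relative to the older system's WER."""
    if wer_old <= 0:
        raise ValueError("wer_old must be positive")
    return (wer_old - wer_new) / wer_old


# One transcription example from this article, with the "Original"
# transcript treated as the reference (an assumption).
reference = "total okay yes that's right"
print(word_error_rate(reference, "total archy yes that's right"))  # 0.2
print(word_error_rate(reference, "total okay yes that's right"))   # 0.0

# Hypothetical absolute WERs, for illustration only.
print(f"{relative_wer_improvement(0.100, 0.0834):.1%}")  # 16.6%
```

Note that a relative improvement says nothing about the absolute error rate: a drop from 10% to 8.34% WER and a drop from 50% to 41.7% are both 16.6% relative improvements.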
These gains represent more than an acoustic upgrade - they reinforce Sanas's mission to improve communication any time a human is in the loop and to make the speech ecosystem more inclusive for global speakers. The side-by-side transcriptions below show how Accent Translation 4.5 resolves clipped consonants, distorted vowels, and ambiguous phrasing that Accent Translation 4.0 occasionally misinterpreted.

| Original | Accent Translation 4.0 | Accent Translation 4.5 |
| total okay yes that's right | total archy yes that's right | total okay yes that's right |
| good afternoon miss the purpose of the call it's about a snow remove service requested | good afternoon miss the purpose of the call it's bowanow remove service requested | good afternoon miss the purpose of the call it's about a snow remove service requested |
| oh donna you have no loans at the moment donna | oh donna you have no launch at the moment donut | oh donna you have no loans at the moment donna |

A new standard for connection

Accent Translation 4.5 is more than a model upgrade; it represents a new chapter in what speech technology can deliver. By uniting real-time accent translation with 24 kHz Ultra-Fidelity audio, Sanas brings unprecedented clarity, expressive detail, and stability to global communication. These improvements matter everywhere clarity matters:

* in customer experience, where understanding drives trust
* in enterprise collaboration across global teams
* in AI and agentic systems that depend on accurate speech inputs
* and in any human-in-the-loop workflow where miscommunication has real consequences.

Most importantly, it supports Sanas' mission to make speech technology more inclusive. By improving intelligibility for accented speakers historically underserved by ASR systems, it reduces inequities and expands access, ensuring every voice is not only heard, but fully understood.
And when clarity becomes the default, everything else scales: trust, comprehension, efficiency, and outcomes. Accent Translation 4.5 raises that standard across accents, environments, and distances. Ready to hear what Ultra-Fidelity accent translation sounds like in your workflows? Request a personalized demo of Accent Translation 4.5 and experience the difference for yourself.
Sanas receives Frost & Sullivan's 2025 North American Company of the Year recognition for leadership in Accent Translation solutions.