Full-Time
Video understanding API for semantic search
$207k - $220k/yr
No H1B Sponsorship
Washington, DC, USA
Hybrid
US Citizenship, US Top Secret Clearance, Canada Citizenship, Canada Top Secret Clearance, UK Citizenship, UK Top Secret Clearance Required
TwelveLabs provides an API-based platform that analyzes video content to extract key features such as actions, objects, on-screen text, speech, and people. These features are converted into vector representations to support fast, scalable semantic search across large video datasets. Developers and product managers integrate TwelveLabs’ end-to-end video understanding capabilities into their products, enabling users to search within videos quickly and precisely. The company differentiates itself by offering an end-to-end infrastructure that emphasizes speed and effectiveness compared with open-source and commercial models. The goal is to make all video content searchable and easy to interact with through API calls.
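The pipeline described above — embed video segments as vectors, then rank segments against a query embedding — can be sketched in a few lines. This is an illustrative example only, not the TwelveLabs API: the index, the three-dimensional embeddings, and the segment fields are all made up for demonstration (real embeddings are far higher-dimensional).

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical index: each entry maps a video segment to its embedding.
index = [
    {"video": "demo.mp4",    "start": 0.0, "end": 8.5,  "vec": [0.9, 0.1, 0.0]},
    {"video": "demo.mp4",    "start": 8.5, "end": 17.0, "vec": [0.2, 0.8, 0.1]},
    {"video": "keynote.mp4", "start": 0.0, "end": 12.0, "vec": [0.1, 0.2, 0.9]},
]

def search(query_vec, k=2):
    """Return the k indexed segments most similar to the query embedding."""
    scored = sorted(index, key=lambda seg: cosine(query_vec, seg["vec"]),
                    reverse=True)
    return scored[:k]

hits = search([0.85, 0.15, 0.05])
print(hits[0]["video"], hits[0]["start"])  # best-matching segment
```

In a production system the query vector would come from embedding the user's text (or image) with the same model that embedded the video, so that both live in one shared vector space.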
Company Size
51-200
Company Stage
Early VC
Total Funding
$110.1M
Headquarters
San Francisco, California
Founded
2021

Health Insurance
Dental Insurance
Vision Insurance
Unlimited Paid Time Off
VAST Data and TwelveLabs have partnered to bring advanced video intelligence capabilities to large-scale, secure video archives beyond public cloud deployments. The collaboration introduces TwelveLabs' first customer-managed deployment on the VAST AI Operating System, enabling video search, analytics, and reasoning workflows where governance and data sovereignty are critical. TwelveLabs' foundation models, including Marengo for multimodal search and Pegasus for video understanding, will run on VAST's infrastructure, which is designed to manage unstructured data at exabyte scale. The system features a unified global namespace, trillion-vector-scale storage, and real-time data orchestration. Target industries include media and entertainment for content archive management, financial services for surveillance and compliance, and public sector agencies requiring on-premises video intelligence. The partnership addresses demand for deploying AI closer to where video data is created while maintaining regulatory compliance and cost efficiency.
TwelveLabs unveils Marengo 3.0: next-gen video understanding model now on Amazon Bedrock.

In a groundbreaking move for small business owners managing video content, TwelveLabs has unveiled Marengo 3.0 during the AWS re:Invent conference in Las Vegas. Marketed as the world's most advanced video understanding model, Marengo 3.0 promises to revolutionize how businesses leverage video data, helping them utilize this previously unwieldy asset more effectively.

TwelveLabs, a leader in video search and understanding, has designed Marengo 3.0 to provide deep insights into video content by "reading" it in a way that mimics human understanding. The platform processes audio, visual elements, text, and movement as interconnected components, making it easier to analyze and search for specific moments or themes within extensive video libraries.

The advantages for small business owners are compelling. Marengo 3.0 offers a 50% reduction in video storage costs and doubles the speed at which videos can be indexed. This means businesses can manage larger volumes of video efficiently while lowering operational costs. As Jae Lee, CEO and co-founder of TwelveLabs, highlights, "Video represents 90% of digitized data, but that data has been largely unusable... Now, Marengo 3.0 shatters the limits of what is possible." The promise of immediate ROI is particularly appealing for small businesses looking to maximize every dollar spent.

Among the standout features of Marengo 3.0 is its "native video understanding": it was built specifically for video rather than adapted from image-processing models. This enables businesses to navigate complex scenes, track events over time, and connect dialogue with visual cues, making it an invaluable tool for sectors like media, entertainment, advertising, and public security. Small business owners can harness this technology to enhance various operational needs.
For instance, in sports, the model allows teams to track player actions and jersey numbers, streamlining highlight identification for promotional content. Similarly, companies in advertising can gain insights into audience engagement by analyzing viewers' emotional reactions, thus improving content targeting and ROI.

While these features excite many small business leaders, there are practical considerations and potential challenges. The deployment of Marengo 3.0 requires a robust technological infrastructure, particularly for those lacking IT expertise. Fortunately, its availability via Amazon Bedrock simplifies integration into existing AWS environments, a significant bonus for businesses already familiar with Amazon's cloud service.

For non-technical teams, a learning curve exists in understanding how to fully utilize the model's capabilities. While TwelveLabs provides documentation and support, owners must invest time and resources into training their teams. Small businesses often operate under tight budgets, so weighing these costs against the anticipated benefits will be crucial.

With the ability to conduct multimodal queries, users can combine text with images in their searches, enabling a more thorough exploration of video content. This functionality can unlock new insights across industries, establishing a more informed approach to content strategy. Moreover, Marengo 3.0 supports multiple language inputs, broadening its appeal and utility for diverse businesses operating in a multilingual landscape.

Nishant Mehta, VP of AI Infrastructure at AWS, noted, "TwelveLabs' work in video understanding is transforming how entire industries manage their video capabilities." This statement underscores the importance of adopting advanced technologies to remain competitive. For small business owners, collaborating with innovators like TwelveLabs may open new revenue streams, as well as improve operational efficiencies.
In a world increasingly dominated by video, leveraging tools like Marengo 3.0 is no longer just an option but a necessity. Small businesses can enhance their competitive advantage by adopting this powerful model, optimizing video content from various angles, and ultimately transforming the way they engage with their customers.

TwelveLabs' latest offering is available through its platform and via Amazon Bedrock. With its API-first design and improved video support, the model stands ready to support a multitude of business needs while driving significant cost savings and efficiency gains. For further information on Marengo 3.0's capabilities and to explore how it can specifically benefit your business, visit the original announcement.

Sarah Lewis is a small business news journalist and writer dedicated to keeping entrepreneurs informed on the latest industry trends, policy changes, and economic developments.
TwelveLabs launches its most powerful video understanding model, Marengo 3.0, on TwelveLabs and Amazon Bedrock.

News provided by TwelveLabs. Its most significant model to date, Marengo 3.0 delivers human-like video understanding at enterprise-grade scale.

SAN FRANCISCO, Dec. 1, 2025 /PRNewswire-PRWeb/ - TwelveLabs, the leading video search and understanding company, announced today at AWS re:Invent the general availability of its most sophisticated model yet, Marengo 3.0. The new release is a breakthrough video foundation model. It doesn't just watch video; it reads it, hears it, and picks up on the rhythm of a scene. The model can connect a moment of dialogue to a gesture three minutes later. It tracks objects, movement, emotion, and events through time. Simply put, it is the world's most powerful video understanding model, and customers can access it today through Amazon Bedrock and TwelveLabs.

Built on TwelveLabs' multimodal architecture, Marengo 3.0 uniquely treats video as a living, dynamic system, compressing audio, text, movement, visuals, and context into something that can be searched, navigated, and understood at scale. Marengo 3.0 comes production-ready and delivers immediate ROI: based on extensive testing, the model offers a 50% reduction in storage costs and 2x faster indexing performance, among a slew of other benefits, so that anyone with stores of video content can fully leverage all of their assets.

"Video represents 90% of digitized data, but that data has been largely unusable because it takes too long for humans to break down, and machines have been incapable of grasping and accounting for everything that happens in video," said Jae Lee, CEO and co-founder of TwelveLabs. "Solving this problem has been our singular obsession. Now, Marengo 3.0 shatters the limits of what is possible.
It is an incomparable solution for enterprises and developers."

Smarter, Faster, Leaner for True Video Understanding

The release of Marengo 3.0 positions TwelveLabs as the breakout leader in video intelligence infrastructure, with capabilities no one else can match. Unlike competitors that rely on frame-by-frame analysis or on separate image and audio models stitched together, Marengo 3.0 lets users see differently and understand everything in their video, including even the most complex, fast-moving clips. Marengo is now even better at understanding sports, media & entertainment, and advertising video, as well as sensitive video types found across government and public security use cases.

Marengo 3.0 delivers:

* Native Video Understanding: Marengo 3.0 was not adapted from image models; it offers understanding at the foundation-model level.
* Temporal & Spatial Reasoning: The new model uniquely understands context across time and space.
* Sports Intelligence: In an industry first, Marengo 3.0 offers team, player, jersey-number, and action tracking, making highlight identification faster and easier than ever before.
* Composed Multimodal Queries: To ensure users always find what they need, Marengo 3.0 enables them to combine image and text in a single query for more granular results.
* Production Economics: With a 50% reduction in storage costs and 2x faster indexing, while creating the potential for new revenue streams, Marengo 3.0 helps businesses save on cost while providing more opportunities for growth.
* Enterprise Ready: It's easy for even the largest organization to get started. Marengo 3.0 is available on Amazon Bedrock, enabling fast and secure deployment in an existing AWS environment, as well as directly through TwelveLabs as a monthly service.

With its API-first design, Marengo 3.0 offers compact embeddings and four-hour video support - a 2x increase over Marengo 2.7. Additionally, it is multilingual across 36 languages.
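One common way to implement a composed multimodal query like the one described above — not necessarily how Marengo 3.0 does it internally — is to combine unit-normalized text and image embeddings from a shared embedding space into a single query vector. A minimal sketch, with hypothetical embeddings and an assumed equal weighting:

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def compose_query(text_vec, image_vec, text_weight=0.5):
    """Weighted sum of unit-normalized embeddings.

    The 0.5 weighting is an assumption for illustration; a real system
    might tune it or learn a fusion model instead.
    """
    t = normalize(text_vec)
    i = normalize(image_vec)
    w = text_weight
    return normalize([w * a + (1 - w) * b for a, b in zip(t, i)])

# Hypothetical embeddings for a text query ("player celebrating")
# and a reference image (a jersey photo) in a shared 3-D space.
q = compose_query([0.7, 0.7, 0.0], [0.0, 0.6, 0.8])
print([round(x, 3) for x in q])
```

The composed vector can then be ranked against indexed video-segment embeddings with the same similarity measure used for text-only queries, which is what makes a single combined image-plus-text search possible.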
"TwelveLabs' work in video understanding is transforming how entire industries manage their video capabilities, bringing unprecedented speed and efficiency to what has largely been a manual process," said Nishant Mehta, VP of AI Infrastructure at AWS. "We are excited to be the first cloud provider to offer Marengo 3.0 to our customers through Amazon Bedrock, following strong adoption of TwelveLabs' previous Marengo and Pegasus models."

Marengo 3.0 is currently available through TwelveLabs or Amazon Bedrock, a fully managed service for building and scaling generative AI applications and agents. AWS is the first cloud service provider to offer access to Marengo 3.0.

About TwelveLabs

TwelveLabs is the world's most powerful video intelligence platform, enabling machines to see, hear, and reason about video like humans do. From semantic search to automated summaries and multimodal embeddings, TwelveLabs empowers developers and enterprises to unlock the full potential of video data across industries including media, advertising, government, security, and automotive. For more information, visit www.twelvelabs.io.

SOURCE TwelveLabs
Twelve Labs, a company developing advanced video understanding AI, has secured strategic investment from Firstman Studio, the producer of the globally successful "Squid Game." This investment highlights the practical value of AI technology in the entertainment industry. Twelve Labs' technology can analyze video content to quickly locate specific scenes, aiding major studios and platforms in efficiently utilizing vast video archives. The investment aims to adapt to the evolving global storytelling landscape.
The technology reads emotional flow, not just objects or dialogue on screen.