Full-Time

Technical Product Manager

Posted on 8/7/2025

Unstructured

51-200 employees

Open-source data preprocessing for unstructured data

No salary listed

San Francisco, CA, USA

Remote

Category
Product (1)
Requirements
  • Bachelor's degree in a relevant field; MBA or MS in a technical field is a plus
  • 5-8 years of experience as a product manager for a SaaS web application, with a preference for experience with ETL or LLM tools; previous experience as a software engineer or in another technical role (e.g. machine learning, data science, software engineering, or software management) is a plus.
  • Deep experience with software development at scale; prior experience with developer-facing products is a plus.
  • Strong analytical and problem-solving skills.
  • Excellent communication and interpersonal skills.
  • Demonstrated ability to work in a cross-functional and collaborative environment.
  • Thrives in a fast-paced, constantly changing environment.
  • An ideal candidate will have experience with ETL or LLM products while also having a highly technical background in the software industry.
Responsibilities
  • Own, define, and articulate a compelling product vision, ensuring alignment with the company's overall mission and goals to deliver data to generative AI applications.
  • Develop a deep understanding of market trends, customer needs, and competitive landscape to shape the product strategy for the frontier of generative AI.
  • Build actionable product requirements and technical specifications, with an eye on customer needs and strategic vision, to grow Unstructured’s ETL user base.
  • Collaborate with the engineering teams to prioritize and drive the development of Unstructured’s product suite.
  • Work closely with engineering teams to create customer-forward experiences across no-code UI and programmatic access patterns.
  • Develop a product with engineering excellence in mind to meet the highest standards of performance, security, and scalability.
  • Define key success factors for your releases and develop metrics that reflect customer experience and engineering quality.
  • Conduct market research to identify opportunities and threats, staying informed about industry trends and emerging technologies.
  • Develop a clear understanding of customer personas and use cases, tailoring the product to meet their specific needs.
  • Work closely with sales, marketing, and customer support teams to gather feedback, address customer concerns, and support go-to-market strategies. Communicate tradeoffs between cost and return on different features to the business.
  • Collaborate with stakeholders to ensure seamless integration with external technical stacks, integration points, and on-premises deployments.
  • Show empathy for the wide variety of strengths across technical and non-technical departments.
  • Develop and execute go-to-market plans for product launches, including marketing collateral, training materials, and customer communication.
  • Provide training and support to internal teams, ensuring they are well-equipped to promote and support the product.
  • Implement metrics and key performance indicators (KPIs) to track product performance and user satisfaction.
  • Continuously analyze data to identify areas for improvement and optimization across product, engineering, and sales funnels.
Desired Qualifications
  • MBA or MS in a technical field is a plus
  • Deep experience with software development at scale and prior experience with developer-facing products are a plus
  • Experience with ETL or LLM tools is a plus
  • Previous experience as a software engineer or in another technical role is a plus
  • Experience with SaaS product management for a web application is a plus

Unstructured.io provides tools for turning raw unstructured data into ML-ready formats. It delivers open-source libraries and APIs that developers and data scientists use to build custom data-preprocessing pipelines for labeling, training, and production workflows. The pipelines support data from HTML, PDFs, CRM data, XML, PPTX, and DOCX, and can be orchestrated with machine learning models, cleaning scripts, and regular expressions, with easy integration into downstream services and strong data security. Users can publish their own APIs and format data for ingestion by various ML services, enabling scalable use of unstructured data. The goal is to help organizations extract value from unstructured data at scale by providing flexible, reusable preprocessing tools.

Company Size

51-200

Company Stage

Series B

Total Funding

$65M

Headquarters

San Francisco, California

Founded

2022

Simplify's Take

What believers are saying

  • $2M AFWERX contract builds Air Force multimodal AI data pipelines.
  • Raised $25M from Madrona for LLM data solutions expansion.
  • 30+ connectors standardize multi-source ETL without custom code.

What critics are saying

  • LlamaIndex v0.10 erodes market share in 6-12 months.
  • LangChain 0.3 captures users via agentic workflow integration.
  • Open-source forks commoditize partitioning, slashing subscriptions.

What makes Unstructured unique

  • Unstructured supports 70+ file types with multimodal processing for AI pipelines.
  • FedRAMP High authorization enables secure federal agency deployments.
  • Partners with Teradata for native Enterprise Vector Store integration.

Benefits

Remote Work Options

Unlimited Paid Time Off

Home Office Stipend

Health Insurance

Dental Insurance

Vision Insurance

Professional Development Budget

Growth & Insights and Company News

Headcount

6 month growth

0%

1 year growth

-2%

2 year growth

3%

Feedzai
Mar 24th, 2026
The most innovative data science companies of 2026.

Why Unstructured, Feedzai, Synchron, and Chalk are among Fast Company's Most Innovative Companies in data science for 2026.

Yahoo Finance
Mar 9th, 2026
Unstructured partners with Teradata to embed AI data processing natively in Enterprise Vector Store

Unstructured has partnered with Teradata to embed its data processing platform natively inside Teradata Enterprise Vector Store, enabling enterprises to transform unstructured content into AI-ready data without external tools. The integration will be available to eligible Teradata customers from April 2026. The partnership allows automatic ingestion and processing of documents, PDFs, images, video and audio directly within Teradata Enterprise Vector Store. Unstructured's preprocessing capabilities support over 70 file types, converting them into structured data and embeddings whilst maintaining the same governance and security standards as Teradata's structured analytics. The integration addresses a critical challenge, as roughly 80% of enterprise data exists in formats AI systems cannot natively use. It supports hybrid deployment across AWS, Azure, GCP, on-premises and air-gapped environments, particularly benefiting regulated industries like financial services, healthcare and government.

Business Wire
Feb 18th, 2026
Unstructured wins $2M AFWERX contract to build multimodal AI data pipelines for US Air Force testing

Unstructured has been awarded a $2 million Tactical Funding Increase contract by AFWERX in partnership with the U.S. Air Force Test Center's 96th Test Wing. The contract will develop advanced multimodal data pipelines for generative AI-enabled testing tools and establish test and evaluation frameworks for AI applications across the Air Force. The technology will enable the Air Force to process complex test data formats including charts, diagrams, images, audio, video and telemetry, which current AI tools struggle to access. Unstructured's solution will allow personnel to query and analyse information through AI-powered assistants whilst reducing processing costs and storage requirements. The company will also work with AFTC to develop frameworks measuring accuracy, speed and reliability of AI tools, accelerating test cycles and reducing redundant analysis.

The AI Journal Ltd
Dec 12th, 2025
Unstructured Secures FedRAMP High Authorization to Deliver AI-Ready Data to Federal Agencies and Partners

SACRAMENTO, Calif. - (BUSINESS WIRE) - Unstructured, the leader in AI-ready data orchestration, today announced it has achieved FedRAMP High authorization. This milestone affirms Unstructured's commitment to delivering secure, scalable, and mission-ready solutions to US government agencies and industry partners, including those with the most stringent data security and compliance requirements. With this authorization, Unstructured becomes one of the few AI infrastructure companies authorized to operate at the FedRAMP High baseline.

"FedRAMP High is more than a compliance milestone - it's our gateway to accelerating outcomes and unlocking data preparation cost savings for our public sector customers and partners," said Brian Raymond, Founder and CEO of Unstructured. "With this authorization, government users and industry partners can deploy Unstructured's enterprise-grade solution to get their data AI-ready and focus on delivering production-ready AI applications at scale."

Government and industry partners are no longer just experimenting with GenAI - they're building real systems. But when it is time to move from pilot to production, most efforts hit a wall: brittle GenAI data pipelines, modality-specific workarounds, and fragmented architectures that can't adapt as models, file types, modalities or downstream systems evolve.

Rather than rebuilding custom data pipelines for every GenAI use case, agencies and integrators can rely on Unstructured's Platform: a modular, enterprise-grade solution purpose-built to extract, transform, enrich, chunk, embed, and deliver AI-ready data - no matter the source or destination. It supports diverse modalities out of the box, works with any model or data store (vector, relational, etc.), and is now accessible in highly secure environments.

Unstructured also helps reduce infrastructure and processing costs by intelligently adapting its transformation pipeline to the characteristics of each file - maximizing performance while minimizing costs where possible. Unstructured delivers the production-ready data layer that every GenAI application needs - so teams can focus on building outcomes, not maintaining open-source data pipelines.

Unstructured's open source is already widely adopted across the federal government, powering tools like NIPRGPT, CamoGPT, and other systems within the military, national security, federal civilian, and even state and local governments. With the FedRAMP High authorized Platform, government users and industry partners can now operationalize these capabilities at enterprise scale - supported by full end-to-end orchestration across ingestion, transformation, enrichment, and delivery.

"Our open-source tools have helped federal teams experiment with LLMs using unstructured data," said Raymond. "Now, with FedRAMP High authorization of our GenAI data orchestration platform, agencies can move beyond experimentation - deploying a secure, production-ready data platform to scale GenAI applications with confidence."

About Unstructured

Unstructured delivers mission-ready data transformation and orchestration solutions that turn unstructured, multimodal content into AI-ready data at scale. Its modular open platform eliminates the brittleness and high costs of traditional data engineering pipelines, enabling government and commercial organizations to rapidly build and deploy GenAI applications. To learn more or deploy Unstructured, contact [email protected].

readmagazine.com
Aug 11th, 2025
Unstructured.io Joins Palantir FedStart to Advance Federal AI Data Solutions

Unstructured, a leading provider of scalable, mission-ready Generative AI (GenAI) solutions powered by advanced data transformation and orchestration, announced it has joined Palantir Technologies' FedStart program.

INACTIVE