Full-Time
Posted on 9/11/2025
Provides generative, edge-capable foundation models
No salary listed
Boston, MA, USA + 1 more
More locations: San Francisco, CA, USA
Hybrid
Liquid AI builds and deploys foundation models based on liquid neural networks, focusing on efficient, on-device AI. It develops Liquid Foundation Models (LFMs), a family of generative AI models designed to be smaller and more computation-efficient than typical large language models, enabling deployment on edge devices with lower latency, better privacy, and reduced infrastructure costs. The approach includes end-to-end AI expertise and customizable architectures for enterprises that require real-time performance and private processing. Compared to traditional AI providers, Liquid AI emphasizes edge-ready, adaptable models that run efficiently on constrained hardware, and it targets enterprise-grade, private, reliable AI solutions. The company’s goal is to enable real-time, on-device AI at scale for businesses by offering compact, capable foundation models and the tools to customize them for specific applications.
Company Size
51-200
Company Stage
Series A
Total Funding
$287.5M
Headquarters
Brookline, Massachusetts
Founded
2023
Remote Work Options
Flexible Work Hours
Insilico Medicine releases 2025 annual results and advances AI drug discovery platform. March 29, 2026 at 4:15 PM - by MLQ Agent

Key points.
* Insilico Medicine will report 2025 financial results and a business update on March 30, 2026.
* Live conference calls in English and Mandarin are set for 9:00 AM and 10:30 AM Beijing Time.
* The company is advancing AI drug discovery with recent collaborations and platform enhancements.
* Past quarters show revenue declines averaging 21.7% annually alongside sustained R&D investment.
* The company has been recognized for its AI platform's impact, including Phase IIa results for rentosertib.

Insilico Medicine, a clinical-stage biotech firm using generative AI for drug discovery, plans to release its 2025 financial results and business update on March 30, 2026. The announcement follows a March 13 press release detailing conference calls in English and Mandarin.

Announcement details.
Insilico Medicine announced on March 13, 2026, that it will report financial results for the year ended December 31, 2025, during live conference calls on March 30, 2026, Beijing Time. The English session starts at 9:00 AM Beijing Time, equivalent to 9:00 PM U.S. Eastern Time on March 29, 2026. The Mandarin session follows at 10:30 AM Beijing Time. Participants must pre-register via the provided Zoom link for the English call. Replays will be available on the company's website shortly after the calls.

Recent business highlights.
Insilico recently entered a collaboration with Liquid AI, announced on March 8, 2026, focused on AI advances for drug discovery. In 2025, the company reported Phase IIa results for rentosertib, its first fully AI-discovered and AI-designed small-molecule drug, which showed lung function stabilization or improvement in idiopathic pulmonary fibrosis patients with a favorable safety profile. The firm also completed Hong Kong's largest biotech IPO of 2025 and expanded partnerships with global pharmaceutical companies.

Financial background.
Historical data indicates revenues declining at an average annual rate of 21.7%, with net margins at -82.72% as of the last update. Quarterly figures for 2025 show revenue of $54 million in Q2 with a $44 million loss, following $70 million in Q1 with a $31 million loss. R&D expenses remained high, at $82 million in Q2 and $87 million in Q1.

AI platform investment returns.
Insilico Medicine's persistent annual revenue decline of 21.7% contrasts with robust R&D spending, reflecting heavy investment in its generative AI platform amid a challenging biotech funding environment. The company's recognition by Fast Company for clinical progress, particularly rentosertib's Phase IIa data, underscores the platform's potential to deliver real-world outcomes, even as losses widened. This pattern aligns with clinical-stage biotechs prioritizing pipeline advancement over near-term profitability. Strategic collaborations like the March 2026 Liquid AI milestone on private infrastructure highlight Insilico's focus on enhancing AI capabilities for drug discovery applications. The timing of the upcoming earnings call lets investors assess how these developments offset revenue pressures and one-off costs, positioning the firm amid growing competition in AI-biotech integration.

Earnings call pipeline updates.
Investors await the March 30 conference calls for insights into 2025 performance, including potential updates on rentosertib's progression and new pipeline assets.
Replays and website access will broaden reach, potentially influencing stock movements for ticker 3696.HK. Expanding partnerships could accelerate revenue through milestones or licensing. Longer term, Insilico's end-to-end AI approach may drive efficiencies in drug development timelines, building on 2025's clinical milestones. Sustained R&D spending amid revenue challenges will test capital allocation post-IPO, with a focus on advancing preclinical and clinical programs toward commercialization. Market reception to earnings guidance will signal confidence in AI-driven biotech scalability.

Written with AI assistance, verified and edited by the MLQ.ai team. Questions? Contact MLQ.ai.
Liquid AI releases LocalCowork, powered by LFM2-24B-A2B, to execute privacy-first agent workflows locally via the Model Context Protocol (MCP). March 5, 2026

Liquid AI has released LFM2-24B-A2B, a model optimized for local, low-latency tool dispatch, alongside LocalCowork, an open-source desktop agent application available in its Liquid4All GitHub Cookbook. The release provides a deployable architecture for running enterprise workflows entirely on-device, eliminating API calls and data egress for privacy-sensitive environments.

Architecture and serving configuration.
To achieve low-latency execution on consumer hardware, LFM2-24B-A2B uses a sparse Mixture-of-Experts (MoE) architecture. While the model contains 24 billion parameters in total, it activates only approximately 2 billion parameters per token during inference. This design lets the model maintain a broad knowledge base while significantly reducing the computational overhead of each generation step. Liquid AI stress-tested the model using the following hardware and software stack (a client sketch follows this list):
* Hardware: Apple M4 Max, 36 GB unified memory, 32 GPU cores.
* Serving engine: llama-server with flash attention enabled.
* Quantization: Q4_K_M GGUF format.
* Memory footprint: ~14.5 GB of RAM.
* Hyperparameters: temperature set to 0.1, top_p to 0.1, and max_tokens to 512 (optimized for deterministic, strict outputs).
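The write-up doesn't include client code, but llama-server exposes an OpenAI-compatible HTTP API, so a minimal local client using the hyperparameters above might look like the sketch below (the port and prompt are illustrative, not from the release):

```python
# Minimal sketch of a local client for the llama-server setup above.
# llama-server serves an OpenAI-compatible endpoint; the port and
# prompt here are illustrative assumptions.
import requests

response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "List the files in ./contracts."}
        ],
        "temperature": 0.1,  # deterministic, strict outputs
        "top_p": 0.1,
        "max_tokens": 512,
    },
)
print(response.json()["choices"][0]["message"]["content"])
```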
LocalCowork tool integration.
LocalCowork is a fully offline desktop AI agent that uses the Model Context Protocol (MCP) to execute pre-built tools without relying on cloud APIs or compromising data privacy, logging every action to a local audit trail. The system includes 75 tools across 14 MCP servers capable of handling tasks like filesystem operations, OCR, and security scanning. The provided demo, however, focuses on a highly reliable, curated subset of 20 tools across 6 servers, each rigorously tested to achieve over 80% single-step accuracy and verified multi-step chain participation. (A rough sketch of the dispatch-and-audit pattern appears after the key takeaways below.)

LocalCowork is the practical implementation of this model. It operates completely offline and comes pre-configured with a suite of enterprise-grade tools:
* File operations: listing, reading, and searching across the host filesystem.
* Security scanning: identifying leaked API keys and personally identifiable information (PII) within local directories.
* Document processing: executing optical character recognition (OCR), parsing text, diffing contracts, and generating PDFs.
* Audit logging: recording every tool call locally for compliance tracking.

Performance benchmarks.
The Liquid AI team evaluated the model against a workload of 100 single-step tool-selection prompts and 50 multi-step chains (requiring 3 to 6 discrete tool executions, such as searching a folder, running OCR, parsing data, deduplicating, and exporting).

Latency. The model averaged ~385 ms per tool-selection response. This sub-second dispatch time is well suited to interactive, human-in-the-loop applications where immediate feedback is necessary.

Accuracy.
* Single-step executions: 80% accuracy.
* Multi-step chains: 26% end-to-end completion rate.

Key takeaways.
* Privacy-first local execution: LocalCowork operates entirely on-device without cloud API dependencies or data egress, making it well suited to regulated enterprise environments requiring strict data privacy.
* Efficient MoE architecture: LFM2-24B-A2B uses a sparse Mixture-of-Experts (MoE) design, activating only ~2 billion of its 24 billion parameters per token, allowing it to fit comfortably within a ~14.5 GB RAM footprint using Q4_K_M GGUF quantization.
* Sub-second latency on consumer hardware: benchmarked on an Apple M4 Max laptop, the model achieves an average latency of ~385 ms for tool-selection dispatch, enabling highly interactive, real-time workflows.
* Standardized MCP tool integration: the agent leverages the Model Context Protocol (MCP) to connect with local tools, including filesystem operations, OCR, and security scanning, while automatically logging all actions to a local audit trail.
* Strong single-step accuracy with multi-step limits: the model achieves 80% accuracy on single-step tool execution but drops to a 26% success rate on multi-step chains due to 'sibling confusion' (selecting a similar but incorrect tool), indicating it currently works best in a guided, human-in-the-loop setup rather than as a fully autonomous agent.
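As promised above, here is a rough, hypothetical sketch of the dispatch-and-audit pattern the article describes: a model-proposed tool call is validated against a local registry, executed on-device, and appended to an audit trail. None of these names come from LocalCowork's actual codebase; the registry, file paths, and `dispatch` helper are stand-ins.

```python
# Hypothetical sketch of a local tool-dispatch loop with an audit
# trail, in the spirit of LocalCowork; all names here are stand-ins,
# not the project's real API.
import json
import os
import time

# Stand-in local tools; LocalCowork's real tools run behind MCP servers.
TOOL_REGISTRY = {
    "fs.list": lambda args: os.listdir(args["path"]),
    "fs.read": lambda args: open(args["path"], encoding="utf-8").read(),
}

def audit_log(entry, path="audit.jsonl"):
    """Append every tool call to a local audit file for compliance."""
    entry["ts"] = time.time()
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

def dispatch(tool_call):
    """Validate a model-proposed tool call against the registry,
    execute it locally, and record it before returning the result."""
    name, args = tool_call["name"], tool_call.get("arguments", {})
    if name not in TOOL_REGISTRY:
        audit_log({"tool": name, "status": "rejected"})
        raise ValueError(f"unknown tool: {name}")
    result = TOOL_REGISTRY[name](args)
    audit_log({"tool": name, "args": args, "status": "ok"})
    return result

# Example: the model emitted {"name": "fs.list", "arguments": {"path": "."}}
print(dispatch({"name": "fs.list", "arguments": {"path": "."}}))
```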
Insilico Medicine and Liquid AI have partnered to create LFM2-2.6B-MMAI, a lightweight scientific foundation model for drug discovery that runs entirely on private pharmaceutical infrastructure. The 2.6-billion-parameter model achieves state-of-the-art performance across multiple drug discovery tasks whilst being ten times smaller than comparable systems. The model covers property prediction, molecular optimisation, affinity prediction and chemical reasoning. It outperformed TxGemma-27B on 13 of 22 pharmacokinetics and toxicology tasks, achieved 98.8% success rates on multi-parameter optimisation benchmarks, and produced better correlation scores than GPT-5.1, Claude Opus 4.5 and Grok-4.1 on Insilico's internal benchmark featuring 2.5 million experimental measurements across 689 protein targets. The collaboration addresses pharmaceutical companies' need to use advanced AI capabilities without sending proprietary data to external cloud services.
Liquid AI releases LFM2.5-1.2B-Thinking: a 1.2B-parameter reasoning model that fits under 1 GB on-device.

Liquid AI has released LFM2.5-1.2B-Thinking, a 1.2-billion-parameter reasoning model that runs fully on device and fits in about 900 MB on a modern phone. What needed a data center two years ago can now run offline on consumer hardware, with a focus on structured reasoning traces, tool use, and math rather than general chat.

Position in the LFM2.5 family and core specs.
LFM2.5-1.2B-Thinking is part of the LFM2.5 family of Liquid Foundation Models, which extends the earlier LFM2 architecture with more pre-training and multi-stage reinforcement learning for edge deployment. The model is text-only and general-purpose, with the following configuration:
* 1.17B parameters, reported as a 1.2B-class model
* 16 layers: 10 double-gated LIV convolution blocks and 6 GQA blocks
* Training budget of 28T tokens
* Context length of 32,768 tokens
* Vocabulary size of 65,536
* 8 languages: English, Arabic, Chinese, French, German, Japanese, Korean, Spanish

Reasoning-first behavior and thinking traces.
The 'Thinking' variant is trained specifically for reasoning. At inference time it produces internal thinking traces before the final answer. These traces are chains of intermediate steps that the model uses to plan tool calls, verify partial results, and work through multi-step instructions. The Liquid AI team recommends this model for agentic tasks, data extraction pipelines, and retrieval-augmented generation flows where you want explicit reasoning and verifiable intermediate steps. A practical way to think about it: use LFM2.5-1.2B-Thinking as the planning brain inside agents and tools, and use other models when you need broad world knowledge or code-heavy workflows.

Benchmarks versus other 1B-class models.
The Liquid AI team evaluates LFM2.5-1.2B-Thinking against models around 1B parameters on a suite of reasoning and instruction benchmarks. Compared to LFM2.5-1.2B-Instruct, three metrics improve strongly: math reasoning rises from about 63 to 88 on MATH 500, instruction following rises from about 61 to 69 on Multi-IF, and tool use rises from about 49 to 57 on BFCLv3. LFM2.5-1.2B-Thinking competes with Qwen3-1.7B in thinking mode on most reasoning benchmarks while using around 40 percent fewer parameters and fewer output tokens on average. It also outperforms other 1B-class baselines such as Granite-4.0-H-1B, Granite-4.0-1B, Gemma-3-1B-IT, and Llama-3.2-1B-Instruct on many of these tasks.

Training recipe and doom-loop mitigation.
Reasoning models often suffer from doom looping, where the model repeats fragments of its chain of thought instead of finishing the answer. LFM2.5-1.2B-Thinking uses a multi-stage training pipeline to reduce this. The process starts with mid-training that includes reasoning traces, so the model learns a 'reason first, then answer' pattern. Supervised fine-tuning on synthetic chains then improves chain-of-thought generation. After that, preference alignment and RLVR are applied. In preference alignment, the research team generates 5 temperature-sampled candidates and 1 greedy candidate per prompt and uses an LLM judge to pick preferred and rejected outputs, while also labeling looping outputs explicitly. During RLVR they add an n-gram repetition penalty early in training. This reduces the doom-loop rate from 15.74 percent at mid-training to 0.36 percent after RLVR on a set of representative prompts.
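The release names an n-gram repetition penalty during RLVR but doesn't publish its exact form. Below is a minimal sketch of one plausible version, where the fraction of repeated n-grams in a rollout is subtracted from the verifiable reward; the function names, n=4 default, and penalty weight are all assumptions for illustration.

```python
from collections import Counter

def ngram_repetition_fraction(token_ids, n=4):
    """Fraction of n-grams that repeat an earlier n-gram in the same
    sequence; near 0.0 for clean text, near 1.0 for a doom loop."""
    if len(token_ids) < n:
        return 0.0
    ngrams = [tuple(token_ids[i:i + n]) for i in range(len(token_ids) - n + 1)]
    counts = Counter(ngrams)
    return sum(c - 1 for c in counts.values()) / len(ngrams)

def penalized_reward(base_reward, token_ids, n=4, weight=1.0):
    """Subtract the repetition fraction from the verifiable reward so
    that looping rollouts score lower during RL training (assumed form)."""
    return base_reward - weight * ngram_repetition_fraction(token_ids, n)

# A looping sequence is penalized heavily:
print(ngram_repetition_fraction([1, 2, 3] * 5, n=2))  # ~0.79
```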
The result is a small reasoning model that can produce thinking traces without getting stuck in long repetitive outputs, which is important for interactive agents and on-device UX.

Inference performance and hardware footprint.
A key design target is fast inference with a small memory footprint on CPUs and NPUs. LFM2.5-1.2B-Thinking can decode at about 239 tokens per second on an AMD CPU and about 82 tokens per second on a mobile NPU, while running under 1 GB of memory, with broad day-one support for llama.cpp, MLX, and vLLM. The detailed hardware table, measured with 1K prefill and 100 decode tokens, shows that the model fits comfortably under 1 GB on phones and embedded devices while sustaining useful throughput even at long contexts.

Key takeaways.
* LFM2.5-1.2B-Thinking is a 1.17B-parameter reasoning model with a 32,768-token context length that runs under 1 GB on phones and laptops.
* The model is optimized for explicit thinking traces, agentic workflows, data extraction, and RAG.
* It reaches strong scores for a 1B-class model, for example 87.96 on MATH 500 and 85.60 on GSM8K, and is competitive with Qwen3-1.7B in thinking mode with fewer parameters.
* The training pipeline uses mid-training with reasoning traces, supervised fine-tuning, preference alignment with 5 sampled candidates plus 1 greedy candidate, and RLVR with n-gram penalties, which reduces doom loops from 15.74 percent to 0.36 percent.
* The model runs efficiently on AMD and Qualcomm CPUs and NPUs with runtimes like llama.cpp, FastFlowLM, and NexaML, is available in GGUF, ONNX, and MLX formats, and can be loaded easily from Hugging Face for on-device deployment (a minimal loading sketch follows the hosting notes below).

Hosting providers and deployment.
You can access the model through cloud and API providers, or host it yourself: for local or on-premise deployment, the weights are available in the formats listed above (GGUF, ONNX, and MLX).
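As the takeaways note, the model can be loaded from Hugging Face. Here is a minimal loading sketch with the transformers library, assuming the weights sit under a Hugging Face id like "LiquidAI/LFM2.5-1.2B-Thinking" (the repo id and prompt are assumptions, and a recent transformers release may be required for the architecture):

```python
# Minimal sketch, assuming the Hugging Face repo id below; a recent
# transformers version may be needed for LFM2.5 support.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2.5-1.2B-Thinking"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=512)
# The thinking trace precedes the final answer in the decoded output.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```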
Liquid AI LFM2.5-1.2B-Thinking: compact power. Exploring Liquid AI's newest 1.2B reasoning model, optimized for agentic tasks, RAG, and high-speed edge inference with the LIV convolution architecture.

Tags: Liquid AI, LFM 2.5, Edge AI, On-device AI, Compact Models, Agentic AI, RAG, Machine Learning

Liquid AI has recently unveiled its latest breakthrough in compact language models: LFM2.5-1.2B-Thinking. As part of the LFM 2.5 family, this model represents a significant step forward in bringing sophisticated reasoning capabilities to edge devices. Unlike traditional Transformers, which struggle with memory and compute constraints on smaller hardware, Liquid's models use a distinctive architecture designed for efficiency and speed. LFM2.5-1.2B-Thinking is specifically tuned for precision and logic, making it a powerful tool for developers looking to build local agents or fast RAG pipelines without massive GPU clusters.

Key features and architecture.
The "1.2B" in its name refers to its 1.17 billion parameters, yet its performance punches well above its weight class. Here are the core specifications:
* Liquid architecture: the model consists of 16 layers, combining 10 double-gated LIV (linear input-varying) convolution blocks with 6 GQA (grouped query attention) blocks.
* Large context window: despite its size, it supports a 32,768-token context length, which is essential for complex RAG tasks and long-form data extraction.
* Massive training data: Liquid AI trained this model on a staggering 28 trillion tokens, ensuring a high degree of world knowledge and linguistic fluency for its size.
* Multilingual support: out of the box, it supports English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.

Performance & optimization.
One of the most impressive aspects of LFM2.5-1.2B-Thinking is its inference speed. Liquid AI has partnered with industry leaders like AMD and Qualcomm to optimize the LFM family for NPUs (neural processing units).
* CPU & NPU efficiency: the model offers extremely fast inference on standard CPUs with a low memory footprint.
* Edge performance: on AMD Ryzen NPUs using FastFlowLM, the model can sustain ~52 tokens per second at 16K context and ~46 tokens per second even at its full 32K context.
* Compact thinking: compared to models like Qwen3-1.7B, LFM2.5-1.2B-Thinking achieves comparable or better results while requiring fewer output tokens to reach the same conclusion.

Use cases: agents and RAG.
Liquid AI recommends this model for scenarios where speed and reliability are paramount (a minimal local-RAG sketch follows this list):
* Agentic tasks: its ability to handle "thinking" steps and function calling makes it ideal for autonomous agents that need to run locally.
* Data extraction: its reasoning capabilities allow it to parse complex documents and extract structured information with high accuracy.
* Local RAG: with a 32K context window and fast inference, it is well suited to searching local knowledge bases and summarizing information on the fly.
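As promised above, here is a rough local-RAG sketch using llama-cpp-python, since the model ships in GGUF format with llama.cpp support. The GGUF file name, the toy document store, and the in-prompt retrieval are illustrative assumptions, not official artifacts; a real pipeline would use vector search over a local knowledge base.

```python
# Rough local-RAG sketch with llama-cpp-python; the GGUF path and the
# toy document store below are illustrative, not official artifacts.
from llama_cpp import Llama

llm = Llama(
    model_path="./lfm2.5-1.2b-thinking-q4_k_m.gguf",  # hypothetical path
    n_ctx=32768,  # the model's full context window
    verbose=False,
)

# Naive "retrieval": stand-in for a vector search over local files.
documents = {
    "release_notes.txt": "LFM2.5-1.2B-Thinking fits in about 900 MB on a phone.",
}
question = "How much memory does the model need on a phone?"
context = "\n".join(documents.values())

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
```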
Important: while powerful, Liquid AI notes that this model is not recommended for knowledge-intensive, Jeopardy-style tasks or deep programming work, where larger models still hold the edge.

The release of LFM2.5-1.2B-Thinking signals a shift in the AI industry toward specialized, compact models. By focusing on architecture-level efficiency rather than raw size, Liquid AI is making "thinking" AI accessible for vehicles, mobile devices, and IoT hardware. As on-device AI continues to evolve, models like this will be the backbone of the next generation of privacy-focused, always-on digital assistants.