Full-Time

Senior Cloud Data Infrastructure Engineer

Cloud AutoScaling, Multiple Teams

Posted on 9/4/2025

ClickHouse

501-1,000 employees

Open-source columnar OLAP database system

Compensation Overview

$133.4k - $232k/yr

Remote in USA

Remote

Category
DevOps & Infrastructure
Required Skills
Kubernetes
Microsoft Azure
Apache Spark
Apache Kafka
AWS
Go
C/C++
Google Cloud Platform
Requirements
  • 5+ years of relevant software development industry experience building and operating scalable, fault-tolerant, distributed systems.
  • Experience building Kubernetes operators with controller-runtime
  • Production experience with programming languages such as Go and C++
  • Strong problem-solver with experience debugging systems in production
  • Expertise with a public cloud provider (AWS, GCP, Azure) and its infrastructure-as-a-service offerings (e.g., EC2)
  • Experience with data storage, ingestion, and transformation (Spark, Kafka, or similar tools)
  • Excellent communication skills and the ability to work well within and across engineering teams
Responsibilities
  • Build a cutting-edge Cloud Native platform on top of the public cloud.
  • Improve the metrics pipeline and build algorithms to generate better autoscaling statistics and recommendations.
  • Work on the autoscaler and Kubernetes operator to support seamless vertical and horizontal autoscaling.
  • Work closely with our ClickHouse core development team and other data plane teams, partnering with them to support auto-scaling use cases as well as other internal infrastructure improvements.
  • Architect and build a robust, scalable, and highly available distributed infrastructure.
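The autoscaling work in these responsibilities typically builds on the kind of replica calculation used by Kubernetes' Horizontal Pod Autoscaler. A minimal sketch in Go (the function name and parameters are illustrative, not from the posting):

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas computes a target replica count from observed vs. target
// utilization, following the standard HPA-style formula:
//   desired = ceil(current * observed/target), clamped to [min, max].
func desiredReplicas(current int, observed, target float64, min, max int) int {
	if target <= 0 || current <= 0 {
		return current
	}
	desired := int(math.Ceil(float64(current) * observed / target))
	if desired < min {
		return min
	}
	if desired > max {
		return max
	}
	return desired
}

func main() {
	// 4 replicas at 90% CPU against a 60% target: scale out to 6.
	fmt.Println(desiredReplicas(4, 90, 60, 1, 10))
	// Utilization well below target: scale in, but never below min.
	fmt.Println(desiredReplicas(4, 10, 60, 2, 10))
}
```

In a real operator this function would run inside a controller-runtime reconcile loop, with observed utilization fed from the metrics pipeline the posting mentions.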
Desired Qualifications
  • Experience with Python (uv, rye, FastAPI) and data science libraries (Pandas, NumPy, etc.) is a plus.

ClickHouse builds a fast, scalable database designed for analytics. Its column-oriented storage model, which stores data by column rather than by row, speeds up analytical queries and makes it well-suited for OLAP workloads. The software is available as open source and can be deployed locally or in the cloud, and the company also offers a fully managed ClickHouse service on AWS, Google Cloud, and Azure. This combination gives users a low-cost, easy-to-manage option for large-scale data processing. ClickHouse differentiates itself from many competitors with its high performance on analytical queries, its open-source model, and the added option of a managed cloud service. The company's goal is to help developers and businesses analyze large datasets quickly and cost-effectively by providing a fast, easy-to-use, and scalable data management solution.
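The column-oriented layout described above can be illustrated with a toy Go comparison (a sketch only, not ClickHouse code): an aggregate over one column reads only that column's contiguous values, instead of walking every field of every row.

```go
package main

import "fmt"

// Row-oriented layout: each record carries every field.
type Event struct {
	UserID  int
	Country string
	Revenue float64
}

// sumRevenueRows scans full rows to total a single field.
func sumRevenueRows(events []Event) float64 {
	var total float64
	for _, e := range events {
		total += e.Revenue
	}
	return total
}

// Column-oriented layout: each field lives in its own contiguous slice,
// so a column aggregate touches only that column's memory.
type EventColumns struct {
	UserID  []int
	Country []string
	Revenue []float64
}

func sumRevenueColumn(c EventColumns) float64 {
	var total float64
	for _, r := range c.Revenue {
		total += r
	}
	return total
}

func main() {
	rows := []Event{{1, "US", 9.5}, {2, "DE", 20.0}, {3, "US", 0.5}}
	cols := EventColumns{
		UserID:  []int{1, 2, 3},
		Country: []string{"US", "DE", "US"},
		Revenue: []float64{9.5, 20.0, 0.5},
	}
	fmt.Println(sumRevenueRows(rows), sumRevenueColumn(cols))
}
```

Both functions return the same total; the columnar version is what makes wide analytical scans cheap, since unneeded columns are never read (and compress better on disk).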

Company Size

501-1,000

Company Stage

Late Stage VC

Total Funding

$1.1B

Headquarters

Palo Alto, California

Founded

2021

Simplify Jobs

Simplify's Take

What believers are saying

  • Raised $400M Series D in February 2026 at $15B valuation with 250% ARR growth.
  • Acquired Langfuse to dominate LLM observability for AI workloads.
  • Native Postgres service unifies OLTP and OLAP for 100x faster AI analytics.

What critics are saying

  • Databricks undercuts pricing and captures 40% more workloads via Unity Catalog.
  • Snowflake Cortex bundles vector search, eroding Langfuse advantage in 12 months.
  • Google BigQuery v2 achieves 5x insert throughput, slashing ClickHouse cloud ARR 30%.

What makes ClickHouse unique

  • ClickHouse uses column-oriented storage for 100x faster OLAP queries than row-based databases.
  • Vectorized processing boosts CPU efficiency for real-time analytical reports.
  • Merge-tree replication and distributed queries enable massive scalability.


Benefits

Health Insurance

Unlimited Paid Time Off

Flexible Work Hours

Remote Work Options

Stock Options

Home Office Stipend

Growth & Insights and Company News

Headcount

6 month growth

-2%

1 year growth

-5%

2 year growth

-4%
ClickHouse
Feb 19th, 2026
February 2026 newsletter

Feb 19, 2026 · 7 minutes read

This month's newsletter covers ClickHouse's $400M Series D, the release of the official Kubernetes operator, a data modelling guide, how ClickHouse optimizes Top-N queries, and more!

Featured community member: Ino de Bruijn

This month's featured community member is Ino de Bruijn, Data Visualization Team Lead at Memorial Sloan Kettering Cancer Center's Cancer Data Science Initiative. Ino leads a team of engineers building software tools for cancer research, visualizing and disseminating data from major consortia including HTAN, Break Through Cancer, AACR GENIE, and the Gray BRCA Pre-Cancer Atlas. For nearly 11 years, he's also been instrumental in developing cBioPortal, the most popular cancer genomics tool worldwide, with over 3,000 daily users and more than 25,000 citations. At the ClickHouse New York Meetup in December, Ino presented his team's work building a conversational AI interface for cBioPortal using ClickHouse, Anthropic's Claude, and LibreChat, a fully open-source solution making cancer research data more accessible to researchers and clinicians.

26.1 release

The first release of 2026 adds support for the sparseGrams tokenizer to the text index, which also now supports arrays of Strings or FixedStrings. There's support for the Variant data type in all functions, new syntax for indexing projections, deduplication of asynchronous inserts with materialized views, and more!

ClickHouse raises $400M Series D, acquires Langfuse, and launches Postgres

ClickHouse closed a $400 million Series D funding round led by Dragoneer Investment Group, with participation from Bessemer Venture Partners, GIC, Index Ventures, Khosla Ventures, Lightspeed Venture Partners, T. Rowe Price Associates, and WCM Investment Management. Alongside the funding announcement, ClickHouse acquired Langfuse, an open-source LLM observability platform with over 20K GitHub stars and more than 26M SDK installs per month. Additionally, ClickHouse launched an enterprise-grade PostgreSQL service integrated with its platform.

Provable completeness: guaranteeing zero data loss in trade collection from crypto exchanges

Unreliable WebSocket connections and network interruptions create a persistent challenge to data quality in cryptocurrency market data collection. Koinju, a crypto platform built for finance professionals, ingests millions of trades per day across hundreds of markets. For their clients, even a single missing trade can distort volumes, P&L calculations, risk exposures, and regulatory reports, making data completeness non-negotiable. In this blog post, Dmitry Prokofyev, CTO of Koinju, describes a novel solution using only ClickHouse to detect and automatically remediate missing trades from Coinbase. The architecture combines three ClickHouse features to create a self-healing system: Refreshable Materialized Views for detection, a separate validation service for REST API backfilling, and ReplacingMergeTree for automatic deduplication of resolved gaps.

Introducing the official ClickHouse Kubernetes Operator: seamless analytics at scale

Grisha Pervakov introduces ClickHouse's official open-source Kubernetes Operator, designed to simplify the deployment and management of ClickHouse clusters on Kubernetes. The operator enables rapid provisioning of production-ready clusters with built-in sharding and replication capabilities while eliminating the need for separate ZooKeeper installations by using ClickHouse Keeper for cluster coordination.

AI-generated analytics without wrecking your cluster

Luke from Faster Analytics Fridays outlines three guardrail patterns for safely enabling AI-generated database queries without crashing clusters:
  • Using pre-vetted query templates with parameter binding instead of raw SQL generation
  • Exposing curated materialized views rather than raw tables
  • Enforcing query budgets that validate estimated row scans and execution time before queries hit the database

Data modeling guide for real-time analytics with ClickHouse

Simon Späti has written a comprehensive guide to designing optimized data models in ClickHouse for sub-second real-time analytics, emphasizing that performance comes from shifting computational work from query time to insertion time. The article covers core principles, including denormalization to minimize joins, partitioning by time and secondary dimensions for query pruning, and predicate pushdown optimization that moves filters closer to data sources.

PostgreSQL + ClickHouse as the open source unified data stack

Lionel Palacin introduces an open-source unified data stack that combines PostgreSQL for transactional workloads with ClickHouse for analytics. It uses PeerDB for near-real-time CDC replication and the pg_clickhouse extension for transparent query offloading without rewriting SQL, enabling teams to start with PostgreSQL and add ClickHouse when analytical performance becomes critical.

Quick reads

  • Mikhail Zharkov describes building a scalable price distribution pipeline for trading systems using ClickHouse.
  • Abhinaav Ramesh built Ollama-Local-Serve, a self-hosted LLM server with complete observability, using ClickHouse for time-series analytics, OpenTelemetry instrumentation, FastAPI monitoring APIs, and a React dashboard with streaming chat.
  • Pranav Mehta describes investigating ClickHouse connection retry warnings in an on-prem environment that initially appeared to be a critical connection leak but turned out to be expected behavior when the connection pool attempts to reuse stale connections after idle periods.
  • Lionel Palacin redesigned the data pipeline of ClickPy, a ClickHouse-backed service that contains 2.2 trillion rows of Python package analytics. Data was previously ingested using custom batch scripts but has been migrated to ClickPipes, and it now uses ClickHouse's lightweight deletes to correct historical data without rebuilding the entire dataset.
  • Tom Schreiber explains how ClickHouse optimizes Top-N queries using granule-level data skipping with min/max metadata filtering, achieving 5-10x speedup and 10-100x reduction in data processed.

The Software Report
Feb 10th, 2026
ClickHouse Raises $400M Series D to Expand Analytics and AI Infrastructure

ClickHouse has announced it has raised $400 million in a Series D financing round led by Dragoneer Investment Group, with participation from Bessemer Venture Partners, GIC, Index Ventures, Khosla Ventures, Lightspeed Venture Partners, accounts advised by T. Rowe Price Associates, and WCM Investment Management. The funding follows rapid growth for the company: more than 3,000 customers now use ClickHouse Cloud, and annual recurring revenue has increased over 250% year over year. Recent adopters and expanded customers include Capital One, Polymarket, Airwallex, and Decagon, alongside an existing base that includes Meta, Sony, Tesla, and Cursor. Aaron Katz, CEO of ClickHouse, said, "This momentum validates our focus on performance and cost efficiency for the most demanding data workloads," adding that the company is expanding into unified transactional and analytical workloads and LLM observability. Alongside the financing, ClickHouse announced the acquisition of Langfuse, an open-source LLM observability platform. The company also introduced an enterprise-grade Postgres service integrated with ClickHouse to support AI applications that require both transactions and real-time analytics. Christian Jensen, Partner at Dragoneer, said, "As models become more capable, the bottleneck moves to data infrastructure, and ClickHouse delivers the performance and reliability required at scale." Marc Klingen, CEO of Langfuse, added, "LLM observability is fundamentally a data problem, and together we can deliver faster insight from production issues to measurable improvement."

Business Wire
Jan 22nd, 2026
ClickHouse launches native Postgres service with Ubicloud for unified real-time and AI apps

ClickHouse has announced a high-performance Postgres service natively integrated with its real-time analytical database, creating a unified data stack for developers building AI-driven applications. The service is built in partnership with Ubicloud, an open-source cloud company led by veterans from Citus Data, Heroku and Microsoft. The integration allows developers to synchronise transactional data from Postgres to ClickHouse with a few clicks, enabling up to 100 times faster analytics. The announcement builds on ClickHouse's 2024 acquisition of PeerDB, whose technology powers real-time data synchronisation for hundreds of enterprise customers. Thousands of companies including GitLab, Instacart and Cloudflare already use both databases for different workloads. The private preview is now available to developers.

Langfuse
Jan 18th, 2026
ClickHouse acquires Langfuse - Langfuse Blog

Our goal continues to be building the best LLM engineering platform

TechCrunch
Jan 16th, 2026
Snowflake, Databricks challenger ClickHouse hits $15B valuation | TechCrunch

The $400 million round was led by Dragoneer.

INACTIVE