Full-Time

Senior Software Engineer

Open Source, US

Onehouse

51-200 employees

Data lakehouse solution for efficient data management

Data & Analytics
AI & Machine Learning
Enterprise Software

Senior

Sunnyvale, CA, USA

Hybrid position requiring in-office presence.

Category
Backend Engineering
Full-Stack Engineering
Software Engineering
Required Skills
Data Structures & Algorithms
Java
Linux/Unix
Data Analysis
Requirements
  • 5-7+ years building large-scale data systems.
  • Strong object-oriented design and coding skills in Java, preferably on a UNIX or Linux platform.
  • Experience with inner workings of distributed (multi-tiered) systems, algorithms, and relational databases.
  • Experience with large scale data compute engines / processing frameworks.
  • Experience building distributed and/or data storage systems or query engines.
  • An ability to prioritize across feature development and tech debt, balancing urgency and speed.
  • An ability to solve complex programming/optimization problems.
  • Robust and clear communication skills.
Responsibilities
  • Design, build, and deliver features and improvements to Apache Hudi.
  • Ensure high quality and timely delivery of innovations and improvements in Apache Hudi.
  • Dive deep into the architectural details of data ingestion, data storage, data processing and data querying to ensure that Apache Hudi is built to be the most robust, scalable and interoperable data lakehouse.
  • Own discussions and work with open source partners/vendors to troubleshoot issues with Hudi, ensure Hudi support in compute engines like Presto/Trino, and act as the face of Hudi to the community at large via meetups, customer meetings, talks, etc.
  • Partner with and mentor engineers on the team.

Onehouse.ai offers a data lakehouse solution that helps businesses manage and optimize their data efficiently. Their main product is a fully managed service that allows clients to organize various types of data seamlessly, using formats like Apache Hudi, Apache Iceberg, and Delta Lake. This service automates data management tasks such as clustering, compaction, and encryption, making it easier for businesses to handle their data without needing extensive engineering resources. Onehouse.ai stands out from competitors with its usage-based pricing model, which can reduce data management costs by 50% or more compared to traditional solutions. The goal of Onehouse.ai is to simplify data management for businesses of all sizes, enabling them to scale their data operations while minimizing costs.

Company Stage

Series A

Total Funding

$33M

Headquarters

San Francisco, California

Founded

2021

Growth & Insights
Headcount

6 month growth

-1%

1 year growth

9%

2 year growth

75%

Simplify's Take

What believers are saying

  • Securing $35M in Series B funding and launching new products enhances Onehouse.ai's ability to innovate and expand its market presence.
  • Partnerships with industry giants like Microsoft and Google for the OneTable project highlight Onehouse.ai's influence and potential for reshaping the cloud data lake landscape.
  • Winning the 2023 Digital Innovator Award from Intellyx underscores Onehouse.ai's leadership and recognition in the digital transformation space.

What critics are saying

  • The competitive landscape in data management and cloud computing is intense, with major players like Snowflake and Databricks posing significant challenges.
  • Reliance on open-source technologies may lead to slower adoption rates among enterprises wary of open-source solutions.

What makes Onehouse unique

  • Onehouse.ai's focus on open storage and interoperability with multiple table formats like Apache Hudi, Iceberg, and Delta Lake sets it apart from competitors who may lock clients into proprietary systems.
  • The usage-based pricing model offers a cost-effective alternative to traditional cloud data warehouses, reducing data management costs by 50% or more.
  • Onehouse.ai's automated data management features, such as clustering, compaction, and encryption, provide a seamless and optimized data experience without extensive engineering resources.