Database/Data Infrastructure Engineer
India
Posted on 7/19/2023
INACTIVE
Cloud-native lakehouse service company
Company Stage
Series A
Total Funding
$33M
Founded
2021
Headquarters
Menlo Park, California
Growth & Insights
Headcount
6 month growth
↑ 20%
1 year growth
↑ 73%
2 year growth
↑ 300%
Locations
Remote
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
Apache Spark
Data Analysis
Data Structures & Algorithms
Java
Linux/Unix
Kubernetes
Categories
DevOps & Infrastructure
Software Engineering
Requirements
- Strong object-oriented design and coding skills (C/C++ and/or Java, preferably on a UNIX or Linux platform)
- Experience with inner workings of distributed (multi-tiered) systems, algorithms, and relational databases
- Ability to deal well with ambiguous or undefined problems, think abstractly, and articulate technical challenges and solutions
- Speed and hustle: ability to prioritize across feature development and tech debt
- Ability to solve complex programming/optimization problems
- Ability to quickly prototype optimization solutions and analyze large/complex data
- Good communication skills
Responsibilities
- Design new concurrency control and transactional capabilities that maximize throughput for competing writers
- Design and implement new indexing schemes, specifically optimized for incremental data processing and analytical query performance
- Design systems that help scale and streamline metadata and data access from different query/compute engines
- Solve hard optimization problems to improve the efficiency (increase performance and lower cost) of distributed data processing algorithms over a Kubernetes cluster
- Leverage data from existing systems to find inefficiencies, and quickly build and validate prototypes
- Collaborate with other engineers to implement, deploy, and safely roll out the optimized solutions in production
Desired Qualifications
- Experience working with database systems, query engines, or Spark codebases
- Experience in optimization mathematics (linear programming, nonlinear optimization)
- Publications on optimizing large-scale data systems in top-tier distributed systems conferences
- PhD degree with 2+ years of industry experience in solving and delivering high-impact optimization projects