Senior Software Engineer
Data Infrastructure
Updated on 9/22/2023
AI quality solutions
Company Overview
TruEra is on a mission to help people and machines make better decisions together. The company provides the first AI Quality platform to help enterprises analyze machine learning, improve model quality, and build trust.
Locations
San Carlos, CA, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
Apache Hive
Apache Spark
Apache Kafka
Data Analysis
Data Structures & Algorithms
Docker
Hadoop
Java
Airflow
Linux/Unix
Postgres
SQL
Apache Flink
Kubernetes
Python
Categories
DevOps & Infrastructure
Software Engineering
Requirements
- Someone who enjoys significant ownership of features and systems and takes a pragmatic, results-driven approach to development
- Someone who is set on building systems that balance scalability, availability, and latency
- An advocate for improving engineering efficiency, continuous deployment and automation tooling, monitoring solutions, and self-healing systems that enhance the developer experience
- Good communication and mentoring skills, and a track record as a force multiplier
- Experience in ground-up system building
- You have led and mentored others and care about the development of your teammates
- You desire to learn and grow, push yourself and your team, share lessons with others, give constructive and continuous feedback, and are receptive to feedback from others
- BS in Computer Science or equivalent
- Strong product mindset and a proven track record of 4+ years building and maintaining big data platforms for streaming and batch data processing
- 3+ years of experience in data engineering, building backend systems, and APIs
- Expertise in building data pipelines using open-source frameworks (Hadoop, Spark, Kafka, Airflow, etc.)
- Strong data infrastructure experience, on-premises or in the cloud
- Solid background in the fundamentals of computer science and distributed systems
- Experience in containerized deployment or Kubernetes
- Ability to build systems that balance scalability, availability, and latency
- Advocate for the continuous deployment and automation of tools, monitoring, and self-healing systems
- Strong hands-on coding experience in Java, Python, and SQL, and comfort diving into any new language or technology
- Experience with some or all of (or technologies similar to) Spark, Flink, Airflow, Hive, Druid, Presto, PostgreSQL, DBT, and ETL, and familiarity with key/value databases, Kafka, and Kubernetes
- Experience working with modern cloud-based microservice architectures
- Good understanding of and experience with modern ETL (incremental, one-time), including DAG design patterns, data quality checks, etc.
Responsibilities
- Lead the design and implementation of complex distributed systems, be it a new service to power new functionality, data pipelines to ingest large volumes of data, or implementations of state-of-the-art algorithms
- Build APIs for complex backend data systems across a range of technologies to support new and improved product functionality
- Partner with data scientists, infrastructure engineers, and product managers to design, build and deliver big data projects and new data platform capabilities
- Debug hard problems; that's a given! When things break (and they will), you will find yourself chasing down challenging bugs and eager to fix them
- Continuously learn something new, whether it's a new technology or a quirk of a language we otherwise didn't know. On occasion, you may find yourself picking up a new language or working with an unfamiliar platform
- Help define and build the TruEra Data ecosystem as use cases grow
- Build scalable data pipelines to move data from different storage systems to the TruEra platform
- Participate in early customer engagements and PoCs, and use that context to drive new product features
- Review design and code, and make sure what we ship is awesome
- “Create what's not there”
Desired Qualifications
- Experience building machine learning models or ecosystems
- Experience with Linux and containers using Docker and Kubernetes is a big plus
- Experience as part of an engineering team at an early-stage startup