Machine Learning Engineer, Data Platform
Posted on 3/12/2022
INACTIVE
Locations
New York, NY, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
Apache Hive
Apache Spark
AWS
Data Analysis
Data Science
Kafka
Git
Leadership
MongoDB
Postgres
PyTorch
React.js
Redshift
Scala
SQL
TensorFlow
Terraform
Kubernetes
Python
Scikit-Learn
NoSQL
Requirements
  • Several programming languages (Python, Scala, Go, etc.)
  • Experience with Machine Learning Frameworks and Libraries
  • Experience building a machine learning platform using tools like SparkML, PyTorch, TensorFlow, Scikit-Learn, etc.
  • Building scalable, data-intensive microservices and tools to enable data science
  • Demonstrated knowledge of the end-to-end model deployment cycle
  • Streaming data processing frameworks like Kafka, Spark Structured Streaming, or Flink
  • A diverse set of SQL and NoSQL databases like MongoDB, Cassandra, Redshift, Postgres, etc.
  • Different storage formats like Parquet, ORC, Avro, Arrow, and JSON
  • Data processing frameworks like Spark or Apache Beam
  • AWS services such as EMR, Lambda, S3, Athena, Glue, IAM, RDS, etc.
  • Git and GitHub
  • CI/CD Pipelines
Responsibilities
  • Constantly think of ways to squeeze better performance out of a machine learning data platform
  • Communicate with stakeholders to discover requirements for designing and building a solution that will scale to their needs
  • Plan effective data storage, security, sharing, and publishing within the organization
  • Design boilerplate architecture that can abstract underlying technology from end users
  • Design, manage, and test disaster recovery procedures for a variety of data platforms
  • Value code simplicity and performance
  • Obsess over data: everything needs to be accounted for and be thoroughly tested
  • Build great things alone, but the greatest things in collaboration with others
  • In three months you will have familiarized yourself with much of our data platform and will be making regular contributions to our codebase, collaborating regularly with stakeholders to widen your knowledge, and helping to resolve incidents and respond to user requests
  • In six months you will have successfully investigated, scoped, executed, and documented a small-to-medium-sized project and worked with stakeholders to make sure their data needs are satisfied by implementing improvements to our platform
  • In a year you will have become the key person for several projects within the team and will have contributed to the data platform's roadmap. You will have made several sizable contributions to the project and will be regularly looking to improve the overall stability and scalability of the architecture
Desired Qualifications
  • You are deeply familiar with Spark and/or Hive
  • You have expert experience with Airflow
  • You have expert experience with different storage formats like Parquet, ORC, Avro, Arrow, and JSON
  • You are familiar with deployment and configuration tools such as Kubernetes, Drone, and Terraform
  • You have expert experience building microservices
  • You've built an end-to-end production-grade data platform that runs on cloud infrastructure
  • You have experience building a web frontend using frameworks like React
MongoDB

1,001-5,000 employees

Modern, general-purpose database platform
Company Overview
MongoDB empowers innovators to create, transform, and disrupt industries by unleashing the power of software and data
Benefits
  • Family Support Programs
  • Flexible PTO
  • Fertility and Adoption Assistance
  • Employee Affinity Groups
  • Transgender Benefits and Support
  • Mental Health
  • Wellness Events and Programs
  • Global Mobility
Company Values
  • Think Big, Go Far
  • Make It Matter
  • Build Together
  • Embrace the Power of Difference
  • Be Intellectually Honest
  • Own What You Do