Senior Data Scientist
Posted on 9/2/2023
INACTIVE
First American

10,001+ employees

Title insurance & professional settlement services
Company Overview
First American is on a mission to provide comprehensive title insurance protection and professional closing/settlement services that produce clear property titles and enable the efficient transfer of real estate.
Data & Analytics
Real Estate

Company Stage: N/A

Total Funding: N/A

Founded: 1889

Headquarters: Santa Ana, California

Growth & Insights
Headcount growth: 0% (6 months) · 8% (1 year) · 4% (2 years)
Locations
Orange, CA, USA
Experience Level
Entry
Junior
Mid
Senior
Expert
Desired Skills
Microsoft Azure
Python
TensorFlow
Git
PyTorch
SQL
Docker
AWS
Natural Language Processing (NLP)
Data Analysis
Google Cloud Platform
Categories
AI & Machine Learning
Requirements
  • PhD in a quantitative field such as Mathematics, Statistics, or Computer Science with 2+ years of related work experience, or an MS with 5+ years of related work experience
  • Extensive experience developing end-to-end machine learning solutions and leading solution diagnosis, including designing and architecting machine learning models that solve business problems and fit into the overall engineering framework, as well as experimentation, model pipeline builds, performance optimization, integration, and deployment
  • Proficiency in machine learning, NLP, and deep learning techniques for tasks such as named entity recognition, document/sentence embeddings, and classification
  • Proficiency with programming languages such as Python and SQL, as well as toolkits/frameworks including spaCy, gensim, and PyTorch/TensorFlow
  • Strong experience building end-to-end data pipelines, model performance monitoring processes, and continuous model delivery. Knowledge of AWS Lambda and AWS Glue is a big plus
  • Familiarity with MLOps and common MLOps toolkits, e.g., MLflow and Amazon SageMaker
  • Knowledge of and experience with engineering toolkits frequently used in machine learning model deployment, e.g., Git, GitHub Actions, Docker, AWS EC2, AWS ECS, and AWS ECR
  • Familiarity with large-scale data processing techniques and tools, e.g., multi-threaded computing, GPU computing, and distributed computing with Ray or PySpark
Responsibilities
  • Perform exploratory analysis, construct data pipelines, and build machine learning models end to end, from POC to deployment, for large-scale production systems
  • Monitor, maintain, optimize, and continuously improve the deployed machine learning solutions during day-to-day operations
  • Deploy models through Docker containers on AWS/GCP/Azure that serve real-time and batch prediction results for various business functions
  • Design and implement scalable models with continuous monitoring and feedback-collection systems to enable automatic model retraining
  • Optimize model and data performance to reduce computation time and cost
  • Establish an MDM system to track model performance and generate alerts through MLflow, Amazon SageMaker, etc.
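
The performance-tracking-and-alerting responsibility above can be illustrated with a minimal sketch: a threshold check over a rolling window of per-prediction accuracy that flags when a deployed model degrades. This is a hypothetical illustration, not First American's system; in production the metric would typically be logged to MLflow or SageMaker Model Monitor rather than computed ad hoc, and the function name, window size, and threshold below are all assumptions.

```python
from collections import deque

def should_alert(scores, window=100, threshold=0.90):
    """Return True when mean accuracy over the last `window`
    predictions drops below `threshold`.

    `scores` is a sequence of 1 (correct) / 0 (incorrect) outcomes.
    """
    recent = deque(scores, maxlen=window)  # keep only the latest window
    if len(recent) < window:
        return False  # not enough data yet to judge degradation
    return sum(recent) / len(recent) < threshold

# Example: 100 correct predictions, then a run of failures
history = [1] * 100
assert should_alert(history) is False  # healthy model, no alert
history += [0] * 20                    # accuracy in window falls to 0.80
assert should_alert(history) is True   # alert fires below the 0.90 bar
```

The same rolling metric could be pushed to a tracking backend (e.g., `mlflow.log_metric` per evaluation batch) so that alerting rules live in the monitoring system instead of application code.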