Full-Time

Research Scientist

Applied AI/LLM

Updated on 9/7/2024

Databricks

5,001-10,000 employees

Unified data, analytics, and AI platform

Data & Analytics
Consulting
Hardware
Industrial & Manufacturing
Consumer Software
Enterprise Software
AI & Machine Learning

Compensation Overview

$142.5k - $180.5k Annually

+ Bonus + Equity

Mid, Senior

Bellevue, WA, USA

Category
Lab & Research
Interdisciplinary Research
Required Skills
Natural Language Processing (NLP)
Requirements
  • PhD in Computer Science or a related field strongly preferred, or equivalent practical experience.
  • 2+ years of machine learning engineering experience in high-velocity, high-growth companies. Alternatively, a strong background in relevant ML research in academia will be considered as an equivalent qualification.
  • Experience developing AI/ML systems at scale in production or in high-impact research environments.
  • Strong track record of working with language modeling technologies. This could include developing generative and embedding techniques, modern model architectures, fine-tuning and pre-training datasets, and evaluation benchmarks.
  • Strong coding and software engineering skills, and familiarity with software engineering principles around testing, code reviews and deployment.
  • Experience deploying and scaling language models in production; deep understanding of the unique infrastructure challenges posed by training and serving LLMs.
  • Strong understanding of computer science fundamentals.
  • Prior experience with Natural Language Processing and transforming unstructured text into structured code, queries and data is a plus.
  • Contributions to well-used open-source projects.
Responsibilities
  • Shape the direction of our applied ML areas and intelligence features in our products, helping customers translate unstructured text into structured code, queries and data.
  • Drive the development and deployment of state-of-the-art AI models and systems that directly impact the capabilities and performance of Databricks' products and services.
  • Architect and implement robust, scalable ML infrastructure, including data storage, processing, and model serving components, to support seamless integration of AI/ML models into production environments.
  • Develop novel data collection, fine-tuning, and pre-training strategies that achieve optimal performance on specific tasks and domains.
  • Design and implement automated ML pipelines for data preprocessing, feature engineering, model training, hyperparameter tuning, and model evaluation, enabling rapid experimentation and iteration.
  • Implement advanced model compression and optimization techniques to reduce the resource footprint of language models while preserving their performance.
  • Contribute to the broader AI community by publishing research, presenting at conferences, and actively participating in open-source projects, enhancing Databricks' reputation as an industry leader.

Databricks provides a Lakehouse Platform that integrates data management, analytics, and AI capabilities across a range of industries. Tools like Delta Lake and Databricks SQL are intended to improve operational efficiency and precision. As a workplace, it offers a collaborative, technology-driven environment that promotes growth through machine learning solutions and a commitment to leading the industry in data innovation.

Company Stage

Series I

Total Funding

$4.2B

Headquarters

San Francisco, California

Founded

2013

Growth & Insights
Headcount

6 month growth

7%

1 year growth

21%

2 year growth

79%

Simplify's Take

What believers are saying

  • The $1 billion acquisition of Tabular is likely to enhance Databricks' data management capabilities and market reach.
  • The development and launch of the DBRX generative AI model, with a $10 million investment, underscores Databricks' dedication to leading in AI technology.
  • High-profile investments from figures like Nancy Pelosi indicate strong confidence in Databricks' growth potential.

What critics are saying

  • The integration of Tabular's team and technology could face challenges, potentially disrupting operations.
  • The competitive landscape in AI and data analytics is intense, with major players like Google and Microsoft posing significant threats.

What makes Databricks unique

  • Databricks' acquisition of Tabular, founded by the creators of Apache Iceberg, strengthens its position in the open lakehouse market.
  • The launch of DBRX, an open-source LLM that outperforms GPT-3.5 and Llama 2, showcases Databricks' commitment to cutting-edge AI innovation.
  • Strategic partnerships, such as with AVEVA for industrial AI, highlight Databricks' ability to integrate and enhance diverse technological ecosystems.

Benefits

Extended health care including dental and vision

Life/AD&D and disability coverage

Equity awards

Flexible Vacation

Gym reimbursement

Annual personal development fund

Work headphones reimbursement

Employee Assistance Program (EAP)

Business travel accident insurance

Paid Parental Leave