Full-Time

Senior AI Engineer

Updated on 5/8/2026

Ruby Labs

51-200 employees

Develops consumer health, edtech, and entertainment apps

No salary listed

Europe

Remote

Must reside within approximately ±4 hours of CET to overlap with working hours.

Category
Software Engineering
Required Skills
LLM
Node.js
Machine Learning
RAG
Next.js
Requirements
  • Node.js and Next.js: Deep knowledge of the stack to build reliable services and handle complex LLM-generated data.
  • Dynamic prompting skills: Proven experience in building prompts where content is highly dependent on input variables and context injection.
  • OpenRouter experience: Experience working with unified APIs, managing rate limits, and selecting the most cost-effective models for specific tasks.
  • Langfuse (or similar): Understanding of LLM observability principles — setting up tracing, creating test datasets, and integrating scoring systems.
  • Evaluation methodology: Experience evaluating Retrieval-Augmented Generation (RAG) systems or building custom “LLM-as-a-judge” systems.
  • Analytical mindset: Ability to transform raw generation logs into actionable business metrics and technical insights.
  • Iterative mindset: Focus on continuous product improvement through constant feedback loops.
Responsibilities
  • Advanced Prompt Engineering: Designing complex, dynamic prompt templates with conditional logic and efficiently reusing information and context within prompts to maximize generation quality and reasoning.
  • Structured Outputs and Schemas: Implementing structured response formats (JSON mode, function calling, Zod/JSON schemas) so that AI outputs are predictable and ready for integration into application logic.
  • Prompt Engineering and Evaluations: Building robust evaluation pipelines and using Langfuse to collect feedback and score the quality of responses in real time.
  • Tracing and Debugging: Performing deep debugging of complex LLM chains using Langfuse traces to identify bottlenecks and optimize for cost, latency, and context window usage.
  • AI A/B Testing: Running systematic experiments across different models via OpenRouter (e.g., comparing Claude 3.5 Sonnet vs. GPT-4o) and analyzing results based on quantitative metrics.
  • Data-Driven Decisions: Making deployment decisions for new prompts or models strictly based on quantitative benchmarks and trace data, rather than intuition.
  • Output Scoring and Analysis: Developing scoring systems to analyze the “Problem → Solution” chain and identify root causes of hallucinations or logic errors using Langfuse analytics.
  • Model Performance and Fine-Tuning: Regularly re-evaluating model performance as new architectures emerge and performing fine-tuning when necessary to meet specific domain requirements.
Desired Qualifications
  • Fine-Tuning: Practical experience in fine-tuning models for specific domain tasks or JSON compliance.
  • RAG Architecture: Understanding how to build and optimize Retrieval-Augmented Generation systems, including indexing, retrieval, and re-ranking.
  • Python: Basic knowledge for working with data science scripts or AI evaluation libraries.

Ruby Labs creates and runs a collection of consumer-facing digital products in health, education, and entertainment. Its apps are built for mobile devices and marketed directly to users, generating revenue from in-app purchases and subscriptions within its own product ecosystem. The company distinguishes itself by managing its own multi-vertical portfolio and operating with a remote-first culture, focusing on broad consumer appeal across wellness, learning, and entertainment needs. The company’s goal is to provide accessible digital platforms that help people improve well-being, acquire new skills, and enjoy engaging entertainment through its direct-to-consumer apps.

Company Stage

N/A

Total Funding

N/A

Headquarters

London, United Kingdom

Founded

2018

Simplify's Take

What believers are saying

  • Bundled platforms yield 34% higher retention amid 2025 subscription fatigue.
  • 100M+ user base fuels AI personalization boosting engagement 28-45%.
  • Southeast Asia expansion drives 2.5x growth over Western markets.

What critics are saying

  • Duolingo's health gamification erodes edtech users in 6-12 months.
  • Calm's meditation acquisition consolidates 70% health share in 12-18 months.
  • UK CMA probe forces unbundling, slashing 30% revenue in 3-6 months.

What makes Ruby Labs unique

  • Ruby Labs has operated a multi-vertical portfolio spanning health, education, and entertainment since 2018.
  • A remote-first company headquartered in London, positioned to scale quickly across consumer app markets.
  • Its direct-to-consumer model bundles wellness, learning, and entertainment apps via subscriptions.

Benefits

Remote Work Options

Unlimited Paid Time Off

Paid National Holidays

Company-provided MacBook

Flexible Work Hours