Full-Time

Member of technical staff

Inference

Posted on 2/22/2025

HCompany

51-200 employees

Enterprise Software
AI & Machine Learning

Mid, Senior

Company Does Not Provide H1B Sponsorship

London, UK

More locations: Remote in USA

This role has the potential to be fully remote or hybrid for candidates based in cities where we have an office - currently Paris and London.

Category
Backend Engineering
Software Engineering
Required Skills
Rust
Python
CUDA
Machine Learning
C/C++

Requirements
  • MS or PhD in Computer Science, Machine Learning or related fields
  • Proficient in at least one of the following programming languages: Python, Rust or C/C++
  • Experience in GPU programming such as CUDA, OpenAI Triton, Metal, etc.
  • Experience in model compression and quantization techniques
  • Collaborative mindset, thriving in dynamic, multidisciplinary teams
  • Strong communication and presentation skills
  • Eager to explore new challenges
Responsibilities
  • Develop scalable, low-latency, and cost-effective inference pipelines
  • Optimize model performance (memory usage, throughput, and latency) using techniques such as distributed computing, model compression, quantization, and caching mechanisms
  • Develop specialized GPU kernels for performance-critical tasks like attention mechanisms, matrix multiplications, etc.
  • Collaborate with H research teams on model architectures to enhance efficiency during inference
  • Review state-of-the-art papers to improve memory usage, throughput and latency (FlashAttention, PagedAttention, continuous batching, etc.)
  • Prioritize and implement state-of-the-art inference techniques
Desired Qualifications
  • Experience with LLM serving frameworks such as vLLM, TensorRT-LLM, SGLang, llama.cpp, etc.
  • Experience with CUDA kernel programming and NCCL
  • Experience with deep learning inference frameworks (PyTorch/ExecuTorch, ONNX Runtime, GGML, etc.)

Company Size: 51-200

Company Stage: N/A

Total Funding: $194.5M

Headquarters: Paris, France

Founded: 2023

Simplify's Take

What believers are saying

  • HCompany raised $220 million, indicating strong investor confidence.
  • Paris location offers strategic advantages for AI collaboration and expansion.
  • Focus on complex task-solving AI models could lead to sophisticated solutions.

What critics are saying

  • Increased competition from well-funded startups like Holistic AI.
  • Potential talent acquisition challenges from startups led by ex-DeepMind scientists.
  • Rapid AI innovation could render HCompany's technologies obsolete.

What makes HCompany unique

  • HCompany focuses on agentic AI models for autonomous decision-making.
  • Founded by ex-DeepMind scientists, HCompany has strong expertise in AI innovation.
  • HCompany is based in Paris, a growing hub for AI development.

Growth & Insights and Company News

Headcount

6 month growth: -19%
1 year growth: -13%
2 year growth: -13%

Sifted
May 21st, 2024
Ex-DeepMind scientists raise $220m seed round to launch Paris-based “agentic” AI startup H

The startup, previously known as Holistic, says its models will be able to solve more complex tasks

Bloomberg
May 8th, 2024
DeepMind Alums In Paris Raise $200 Million For Holistic AI From Accel

Holistic AI, a new startup in Paris working to leapfrog other generative AI models, has closed the first tranche of a $200 million initial financing round, according to people familiar with the deal.

PYMNTS
Jan 19th, 2024
DeepMind Scientists Considering $220 Million Round for AI Startup

A pair of scientists at Google DeepMind, Laurent Sifre and Karl Tuyls, are reportedly in discussions with potential investors to establish their own artificial intelligence (AI) startup in Paris. The potential startup, currently known as Holistic, aims to develop a new AI model, Bloomberg reported Friday (Jan. 19). The startup is different from the London-based enterprise software business Holistic AI, the […]

Bloomberg
Jan 19th, 2024
Google DeepMind Scientists in Talks to Leave and Form AI Startup

A pair of scientists at Google DeepMind, the Alphabet Inc. artificial intelligence division, have been talking with investors about forming an AI startup in Paris, according to people familiar with the conversations.