Full-Time

Member of Technical Staff

Training Platform

Posted on 5/11/2026

Prime Intellect

11-50 employees

Decentralized GPU compute marketplace for AI

Compensation Overview

$150k - $300k/yr

+ Equity

H1B Sponsorship Available

San Francisco, CA, USA

Remote

Remote work option available; SF office also offered.

Category
Software Engineering
Required Skills
Kubernetes
FastAPI
Grafana
React.js
Computer Networking
OpenTelemetry
TypeScript
Prometheus
Terraform
Next.js
Ansible
Grafana Loki
DevOps
Linux/Unix
Helm
Google Cloud Platform
Requirements
  • Strong working knowledge of the modern AI stack - open model families, finetuning techniques (LoRA, QLoRA, full FT, RLHF/RLAIF), inference engines (vLLM, SGLang, TensorRT-LLM)
  • Familiarity with GPU hardware tradeoffs (H100 / H200 / B200, NVLink, interconnects, memory hierarchy) and what they mean for training and inference workloads
  • Understanding of distributed training fundamentals (data/tensor/pipeline/expert parallelism, NCCL, multi-node scheduling)
  • Awareness of what's happening at the frontier - new models, training methods, infra patterns - and the ability to translate that into product decisions
  • Strong Kubernetes operations experience - Helm, CRDs, operators, KEDA, gang scheduling, GPU operator
  • Comfortable debugging real production clusters (kubectl, pod lifecycle, node issues, networking)
  • Cloud platform experience (GCP preferred - GCS, GKE, Cloud Run, Cloud Tasks)
  • Infrastructure automation (Helm, Terraform, Ansible) and a GitOps mindset
  • Observability: Prometheus, Grafana, Loki, OpenTelemetry, DCGM
  • Linux fundamentals: networking, namespaces, performance tuning
  • Strong Python backend development (FastAPI, async, SQLAlchemy)
  • Comfortable building Python control-plane agents that talk to Kubernetes APIs
  • Modern frontend development (TypeScript, React/Next.js, Tailwind, shadcn) - enough to ship product surfaces end-to-end
  • REST and tRPC API design
  • Experience building developer tools, dashboards, and live-monitoring UIs
Responsibilities
  • Design and operate Kubernetes-based training and inference orchestration across multi-cluster, multi-cloud GPU fleets
  • Build and maintain Helm charts that compose trainers, inference servers, environment servers, and supporting services into reproducible Training stacks
  • Develop the Python control-plane agents that watch pods, report run state to the platform, and keep clusters in sync
  • Implement scheduling and autoscaling for heterogeneous hardware (H100/H200/B200) using KEDA, LeaderWorkerSet, taints/tolerations, and gang scheduling
  • Run a tight GitOps workflow - every change ships through PRs, Helm values, and CI
  • Build node-local model caches, checkpoint pipelines, and shared storage for fast cold starts
  • Operate the observability stack (Prometheus, Grafana, Loki, DCGM) and make GPU cluster debugging fast
  • Build the developer-facing surfaces for hosted training: job submission, live run monitoring, logs, metrics, model/adapter management, comparisons
  • Develop FastAPI backend services and REST APIs that bridge the platform to running clusters
  • Build real-time monitoring and debugging tools (streaming logs, step-level metrics, failure analysis)
  • Ship product UI in Next.js / React / TypeScript with shadcn, Tailwind, tRPC, and TanStack Query
  • Interface with the RL trainer, inference servers, and environment servers running inside our clusters
  • Productize new training capabilities (new model architectures, RL algorithms, modes)

Prime Intellect builds a decentralized, peer-to-peer platform for AI development. It operates Prime Intellect Compute, a GPU marketplace that aggregates resources from multiple cloud providers so users can access affordable compute time for AI projects. The Prime Intellect Protocol governs open-source AI with community ownership and governance, enabling anyone to contribute compute, capital, and code for distributed model training. Its goal is to democratize AI development by providing a scalable, marketplace-driven, globally distributed environment for training and deploying advanced models.

Company Size

11-50

Company Stage

Early VC

Total Funding

$20.5M

Headquarters

Dover, Delaware

Founded

2024

Simplify Jobs

Simplify's Take

What believers are saying

  • Prime Compute aggregates 12 providers, offering H100s at $1.5-4/hr instantly.
  • Raised $15M to scale open superintelligence stack for agentic AI.
  • BrowserEnv partnership trains browser agents on real websites reproducibly.

What critics are saying

  • Browserbase failure disrupts BrowserEnv pipeline within 12 months.
  • INTELLECT-2 coordination fails, yielding unusable 32B models by Q4 2026.
  • $15M runway exhausts by Q4 2027 without breakeven transactions.

What makes Prime Intellect unique

  • Prime Intellect Protocol enables peer-to-peer GPU marketplace with TOPLOC verification.
  • INTELLECT-2 launches first 32B-parameter globally decentralized RL training run.
  • Environments Hub hosts hundreds of open-source RL environments for agentic models.

Benefits

Company Equity

Flexible Work Hours

Remote Work Options

Relocation Assistance

Professional Development Budget

Conference Attendance Budget

Growth & Insights and Company News

Headcount

6 month growth

-5%

1 year growth

-7%

2 year growth

24%
Browserbase
Mar 25th, 2026
Introducing browserenv: train browser agents on real websites.

By Harsehaj Dhami (Growth Engineer) and Kyle Jeong (Growth Engineer), March 25, 2026
TL;DR: Browserbase and Prime Intellect have partnered to launch BrowserEnv, a reinforcement learning environment for training and evaluating browser agents on real web tasks.
Everyone wants AI models that can actually use the browser to get work done, but most models weren't trained to interact with real websites. They were trained on static datasets instead of environments where they can practice navigating pages, clicking elements, and completing multi-step workflows. This is why many browser agents look impressive in demos but struggle in real-world use. The missing piece is a reliable and scalable training environment.
Training browser agents requires significant infrastructure: running browsers at scale, interacting with live websites without getting blocked, resetting sessions between tasks, and verifying results. This is the infrastructure frontier labs are already building. For example, Microsoft trained and evaluated their computer-use model Fara-7B using Browserbase, which required reliable access to real websites and scalable browser environments for evaluation and reinforcement learning workflows.
Browserbase, Inc. has partnered with Prime Intellect to make this infrastructure accessible to everyone with BrowserEnv. BrowserEnv is a reinforcement learning environment designed specifically for training browser agents. It runs on Browserbase, which provides scalable browser infrastructure and access to real websites. Prime Intellect provides the training platform. Together, they make it possible to train and evaluate computer-use models on real browser tasks without building the infrastructure yourself. All you need is a dataset of tasks.
Researchers and developers can train open models like Qwen or other computer-use models using reinforcement learning, while BrowserEnv handles browser orchestration, task execution, and verification.
Training Qwen 3 VL on WebVoyager with BrowserEnv
To validate its stack end to end, Browserbase, Inc. fine-tuned Qwen/Qwen3-VL-8B-Instruct on real WebVoyager tasks using BrowserEnv and Prime Intellect. Browserbase, Inc. plugged the prime/webvoyager-no-anti-bot environment into Prime's RL pipeline, so the model could practice real navigation flows across sites like Amazon, Allrecipes, GitHub, Booking, and more without getting stuck on anti-bot walls. BrowserEnv handled browser orchestration on Browserbase, Prime handled rollouts and optimization, and WebVoyager provided a standardized benchmark of 600 filtered tasks.
Browserbase, Inc. started from the public WebVoyager environment in the Prime hub, switched it to CUA mode, and pointed it at Qwen3-VL-8B-Instruct. The training run used a relatively small but realistic configuration: 200 steps, batch size 32, 8 rollouts per example, learning rate 1e-4, and an oversampling factor of 2, with modest parallelism.
    model = "Qwen/Qwen3-VL-8B-Instruct"
    max_steps = 200
    batch_size = 32
    rollouts_per_example = 8
    learning_rate = 0.0001
    oversampling_factor = 2
    max_async_level = 2
    [sampling]
    max_tokens = 512
    [[env]]
    id = "prime/webvoyager-no-anti-bot"
    args = {mode = "cua", viewport_width = 800, viewport_height = 600, keep_recent_screenshots = 2}
In this setup, each training step created or reused a Browserbase session, loaded a WebVoyager task, and let Qwen3-VL act through coordinate-based CUA primitives while a verifier judged task completion and produced reward signals. Over the course of the run, the model improved on multi-step tasks such as searching, filtering, and extracting information from live pages, rather than just static HTML.
The output of this training run is a LoRA adapter that can be easily deployed to run on the Prime Intellect platform. This training workflow is reproducible by anyone with access to a Browserbase and Prime Intellect account. You can even start from the same ingredients Browserbase, Inc. used: BrowserEnv on Browserbase, the WebVoyager no-anti-bot environment in Prime, and an open vision language model like Qwen3-VL.
Frontier labs are already training browser agents this way, and now anyone with access to the internet can do the same. BrowserEnv is generally available today; learn more at browserenv.com and start training your own browser agents.

Bankless
May 1st, 2025
AI ROLLUP: The AI Experiment That's Been Secretly Manipulating You

Prime Intellect just launched INTELLECT-2, the first globally distributed reinforcement-learning run of a 32-billion-parameter model, with experts predicting community-trained systems in the 70-100B range by year-end - a potential counterweight to hyperscaler dominance.

Prime Intellect
Mar 4th, 2025
$15M to Build a Peer-to-Peer AI Protocol

Prime Intellect is building a peer-to-peer protocol for compute and intelligence, enabling collective creation, ownership, and access to sovereign open-source AI. We’re moving beyond centralized AI to empower anyone—from solo GPU operators to global datacenters—to contribute compute, code, or capital and shape the open and decentralized AI ecosystem.

CO/AI
Oct 12th, 2024
Prime Intellect launches initiative to train open model with decentralized computing

PR Newswire
Jul 25th, 2024
Coinfund Expands Best-In-Class Team, Launches Podcast As Firm Commemorates Nine Years Of Service

NEW YORK, July 25, 2024 /PRNewswire/ -- CoinFund, one of the world's longest-operating cryptonative investment firms and a registered investment adviser, commemorates the firm's 9th anniversary with the announcement of four strategic hires and the launch of a new podcast, Mined with CoinFund.
EXPANDING THE TEAM
CoinFund proudly announces the addition of four subject matter specialists to support an uptick in deal operations and the long-term growth of the brand.
With the addition of New York City-based Adriana Armstrong as Executive Assistant, the firm broadens its leadership support and employee and workplace experiences. Adriana brings over 10 years of internal operations, office management, and executive support to CoinFund, most recently working as a consultant at Illuminate Ventures and New System Ventures.
In an effort to further strengthen its operational foundation, CoinFund has bolstered its Finance team with the addition of New Jersey-based Matt Manley as a Finance Operations Associate. Prior to CoinFund, Manley was a Senior Accountant in Withum's Emerging Technology group serving on the Digital Assets and Blockchain team, and brings more than seven years of combined audit and digital asset experience to the role from best-in-class providers like Copper.co and KPMG.
CoinFund announces that Malaysia-based Walter Teng has joined CoinFund as an Investor on the Liquid Investments team, deepening the firm's connectivity in Asia. Prior to CoinFund, Teng served as a liquid investor at MSA Novo, and spent more than two years at FundStrat, most recently acting as Vice President of Digital Asset Strategy where he focused on DeFi and small cap strategies.
The firm has also continued to expand its commitment to post-investment services, growing its Marketing team with the addition of New York City-based Cam Thompson. In her newly created role as Content Manager for CoinFund, Thompson will amplify the firm's storytelling via written and visual content. Most recently she served as a Copywriter and Creator Liaison at Celo Foundation and covered the trends of the crypto industry as a web3 beat reporter for CoinDesk.
"CoinFund is a state of the art investment firm designed for the next generation of the internet built on decentralization technology and web3," commented Jake Brukhman, Founder, CoinFund