Full-Time

Developer Relations Engineer

Modal

51-200 employees

Cloud-based on-demand code execution platform

Compensation Overview

$175k - $275k/yr

San Francisco, CA, USA; New York, NY, USA

In Person

Must work in-person at one of our NYC, SF, or Stockholm offices.

Category
Developer Relations
Requirements
  • 3+ years as a software engineer
  • Is energized by the AI developer community and wants to help developers adopt new technologies
  • Loves teaching
  • Has excellent technical communication skills
  • Is metrics-driven and takes quantitative approaches to prioritizing initiatives
  • Is excited about working in-person in the NYC, SF or Stockholm office
Responsibilities
  • Distill the latest advancements in AI technology and educate developers on how to incorporate them
  • Give demos/talks about Modal and adjacent tools at developer events
  • Engage with users in our community, both online (X, LinkedIn, Reddit, Slack) and at in-person events
  • Build relationships, integrations, and joint marketing activities with other developer-focused companies
  • Set objectives that are aligned with the greater GTM team and track the impact of the initiatives you work on
Desired Qualifications
  • Bonus: you're not afraid to think outside the box when it comes to compelling technical content
  • Bonus: you already have a developer following on social media!

Modal provides on-demand cloud compute for developers, data engineers, and ML practitioners. Users write Python and launch hundreds of custom containers in the cloud to run code and data workloads without managing infrastructure, with access to on-demand GPUs and serverless web endpoints. Modal charges for the compute resources used and offers features such as environments defined in code, fast container startup, monitoring, logs, and distributed queues. It differentiates itself through a Python-centric workflow, rapid container startup, and end-to-end cloud execution, with the aim of simplifying running code in the cloud at scale.

Company Size

51-200

Company Stage

Series B

Total Funding

$128M

Headquarters

New York City, New York

Founded

2021

Simplify's Take

What believers are saying

  • Runway partnership validates latency-critical AI workloads, expanding enterprise inference revenue.
  • Butter acquisition strengthens sandbox capabilities for AI agent development and codegen.
  • $80M funding round enables global infrastructure expansion to meet enterprise demand.

What critics are saying

  • CoreWeave undercuts pricing 25% via OpenAI H100 exclusivity, forcing customer migration.
  • Runway depends on GWM-1 model, vulnerable to Sora 2.0 outperformance.
  • E2B's Rust sandbox outperforms Butter's bVisor by 40% in startup latency.

What makes Modal unique

  • GPU memory snapshotting reduces cold starts by 10x for LLM inference workloads.
  • Multi-node GPU clusters with RDMA networking enable real-time video inference at scale.
  • Python-defined infrastructure allows developers to deploy without managing complex configurations.

Benefits

Health Insurance

Unlimited Paid Time Off

Remote Work Options

Paid Vacation

Flexible Work Hours

401(k) Retirement Plan

401(k) Company Match

Wellness Program

Mental Health Support

Gym Membership

Phone/Internet Stipend

Home Office Stipend

Professional Development Budget

Conference Attendance Budget

Stock Options

Company Equity

Parenting Leave

Family Planning Benefits

Fertility Treatment Support

Adoption Assistance

Relocation Assistance

Commuter Benefits

Employee Referral Bonus

Training Programs

Tuition Reimbursement

Professional Certification Support

Mentorship Program

Meal Benefits

Legal Services

Employee Discounts

Company Social Events

Growth & Insights and Company News

Headcount

6 month growth: -2%
1 year growth: 0%
2 year growth: 31%

Modal
Mar 26th, 2026
Runway chooses Modal to power real-time inference for Runway Characters.

Today, Modal Labs is announcing that Runway is partnering with Modal to power real-time inference for Runway Characters. Runway Characters is a real-time video agent API that lets developers, startups, enterprises and consumers build fully custom conversational characters. These video agents can have any appearance and any visual style, with full control over voice, personality, knowledge and actions. Built on Runway's general world model, GWM-1, Characters generates expressive digital personas from a single image, with zero fine-tuning required.

Thousands of organizations are already using Characters, including Fortune 10 technology companies, major Hollywood studios, global advertising agencies and gaming companies, with use cases ranging from customer support and internal training to experiential advertising and immersive game worlds. Characters represents the first step toward a future of online interaction built around real-time video rather than text.

This kind of continuous, expressive, low-latency video generation held across extended conversations and experiences requires infrastructure purpose-built for real-time interaction. Modal's serverless compute platform is designed for exactly this type of workload: GPU-intensive, latency-critical and highly variable in demand. The iteration speeds Modal afforded allowed Runway's team to move from proof of concept to production in under 30 days.

"Real-time video inference is a fundamentally different engineering challenge than batch generation, especially given our customers are running these experiences globally," said Kamil Sindi, CTO of Runway. "Runway Characters requires sustained low latency across the full duration of a conversation - expressions, lip-sync, gestures - without degradation. Modal's infrastructure gave us the performance and reliability we need to ship this in every global region, at production scale."

Achieving the latency required for real-time interaction means distributing inference across multiple GPUs with high-bandwidth communication between nodes. By adding a single line of code on Modal, Runway can turn their containers into multi-node GPU clusters with RDMA networking, available instantly across every region. Modal deploys these workloads across geographies as a single unified pool, routing them close to users and scaling on demand, so Runway can serve users anywhere without pre-provisioning or managing regional infrastructure directly.

"Runway is pushing the frontier for what's possible with world models, which requires running complex models at large scale with very low latency. This is something Modal does extremely well," said Erik Bernhardsson, CEO of Modal. "We're proud to be the infrastructure powering Characters."

Runway Characters is available today to all developers and businesses at dev.runwayml.com, and to consumers at runwayml.com. Enterprise teams can reach out to learn more about deploying custom avatar experiences at scale.

Modal
Dec 2nd, 2025
Modal + Mistral 3: 10x faster cold starts with GPU snapshotting

Today, Mistral launched Mistral 3, a family of open models with frontier-class performance, customization capabilities, and trusted transparency. Modal Labs is proud to offer Day 0 support for running these models on Modal. Modal enables developers to instantly deploy and scale Mistral 3 models without orchestrating compute infrastructure. Beyond a great DevEx and abundant GPU capacity, Modal also offers cutting-edge features like GPU memory snapshotting that can reduce median cold start time for some of these models by almost 10x, from almost two minutes to just ten seconds.

About Mistral 3

Mistral 3 is the newest frontier open model family from Mistral. It is a suite of multimodal models with strong multilingual support, and it is available in multiple sizes and capabilities for max flexibility. This blog post focuses on Ministral 3, whose size is well-suited for Modal's serverless infrastructure. Ministral 3 is the small version of the Mistral 3 family of models and is available in 3B, 8B, and 14B sizes. It performs competitively with the Qwen 3-VL model series on benchmarks. This makes Ministral 3 well-suited for companies that are seeking a balance of intelligence and compute efficiency. See the example text for details on deployment with modal deploy.

How it works

The basic example above takes advantage of several key Modal features:
  • Serverless GPUs that automatically scale up and down from 0 based on request volume to the vLLM server.
  • Volumes, Modal's native, distributed file system, to cache model weights and compilation artifacts from vLLM.
  • Python-defined infrastructure to keep environment and hardware requirements cleanly in sync with application code.

Together, these features allow developers to deploy Mistral 3 without being blocked on acquiring GPU quota or managing complex configuration surface areas.

Now, speed up cold starts by almost 10x

Modal recently launched a new GPU snapshotting feature in alpha. This can drastically reduce cold starts for workloads that require heavy initialization work - like spinning up a vLLM server. Modal Labs tested this on the 3B version of Ministral 3 and saw an almost 10x reduction in median cold start time, from ~118s to ~12s. Drastically shorter cold starts means you can deploy Ministral 3 in a way that is both cost-efficient and responsive to user demand.

vLLM + Ministral 3 3B

To use this feature, you must enable Sleep Mode for your vLLM server and set experimental_options={"enable_gpu_snapshot": True} in your Modal App. The first time the vLLM server finishes initializing, it will be put to sleep. This shifts most of the contents of GPU memory to CPU memory, which facilitates the snapshotting process. Upon subsequent starts, the vLLM server is restored from this snapshot. Try it out for yourself by deploying the code sample here.
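
The "almost 10x" figure can be checked against the reported medians. The following is a minimal sketch of that arithmetic, using only the approximate ~118s and ~12s values quoted in the post. The function name `speedup` and the variable names are illustrative assumptions, not part of Modal's API.

```python
def speedup(before_s: float, after_s: float) -> float:
    """Return how many times shorter `after_s` is than `before_s`."""
    if after_s <= 0:
        raise ValueError("after_s must be positive")
    return before_s / after_s


# Approximate median cold start times quoted in the post (seconds).
baseline_s = 118.0  # without GPU snapshotting
snapshot_s = 12.0   # with GPU snapshotting

ratio = speedup(baseline_s, snapshot_s)
saved_s = baseline_s - snapshot_s

print(f"speedup: {ratio:.1f}x, time saved per cold start: {saved_s:.0f}s")
# prints "speedup: 9.8x, time saved per cold start: 106s"
```

Because the inputs are rounded medians from a single tested model size, the ratio is best read as a rough indication rather than a guaranteed improvement.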

Google
Sep 29th, 2025
Modal Labs Raises $80M in Funding

Modal Labs, an AI infrastructure startup, has raised $80 million in funding, as reported by Bloomberg.com.

Breakit
Sep 29th, 2025
Modal raises over 800 million SEK

Modal, a Swedish-founded company, has raised over $80 million in a funding round led by Lux Capital, with participation from existing investors. The company aims to provide AI firms with access to vast computing power through a global, serverless platform, eliminating the need for companies to manage expensive GPUs. The new funds will be used to expand their product offerings. Modal, founded four years ago, has now raised a total of $111 million. Earlier this year, Modal acquired the data startup Twirl.

Investors Hangout
Jul 2nd, 2025
Modal Secures Major IOS Financing Deal

Modal, a London-based firm specializing in Industrial Outdoor Storage (IOS), has secured major financing from Apollo-managed credit funds to expand its U.K. IOS portfolio. Partnering with Centerbridge Partners, this deal marks a significant milestone in the IOS sector, highlighting its growth potential. The collaboration aims to optimize facilities near transport hubs and urban centers, addressing increased demand for storage solutions.