Full-Time

Research Scientist

Diffusion

Genmo

11-50 employees

AI-powered image and video animation platform

No salary listed

San Francisco, CA, USA

In Person

Bay Area-based role; relocation support available.

Category
AI & Machine Learning
Required Skills
Python
TensorFlow
Neural Networks
PyTorch
Requirements
  • Ph.D. in Computer Science, Artificial Intelligence, Machine Learning, or a closely related field
  • Strong publication record in top-tier conferences (e.g., CVPR, ICCV, NeurIPS, ICML) with a focus on generative models, particularly diffusion models
  • Extensive experience implementing and optimizing large-scale generative models for image or video tasks
  • Deep understanding of state-of-the-art techniques in text-to-image and text-to-video generation
  • Proficiency in Python and deep learning frameworks such as PyTorch or TensorFlow
  • Excellent communication skills with the ability to explain complex technical concepts to diverse audiences
  • Proven ability to work collaboratively in a team environment
  • Experience leading research initiatives in advanced diffusion models for text-to-video generation (implied by responsibilities)
  • Ability to design and conduct rigorous experiments to validate new ideas and evaluate model performance (implied by responsibilities)
Responsibilities
  • Lead research initiatives in advanced diffusion models for text-to-video generation, focusing on improving visual quality, temporal consistency, and semantic fidelity
  • Develop and implement state-of-the-art algorithms for translating textual descriptions into dynamic video content
  • Design and conduct rigorous experiments to validate new ideas and evaluate model performance
  • Collaborate with cross-functional teams to integrate research breakthroughs into the production pipeline
  • Stay at the cutting edge of the field by regularly reviewing academic literature and attending top-tier conferences
  • Contribute to the research community through high-quality publications and open-source contributions
  • Mentor junior researchers and foster a culture of innovation within the research team
  • Work closely with product teams to align research directions with user needs and market opportunities
Desired Qualifications
  • Postdoctoral or industrial research experience in generative AI for video
  • Hands-on experience with text-to-video generation projects
  • Expertise in other generative model architectures (e.g., GANs, VAEs) and their applications to video
  • Experience working with large-scale datasets and distributed computing environments
  • Track record of successful collaboration with product teams on technology transfers
  • Familiarity with video codecs, compression techniques, and perceptual quality metrics
  • Contributions to open-source projects in the field of generative AI

Genmo provides AI-powered tools for creating and editing multimedia content, including animated images, generated and edited movies, scripts, trailers, and presentations. Its platform lets users upload an image and animate parts of it, or generate full movies with scene planning, transitions, and overlays. It uses a subscription model with tiered access and may offer one-time purchases, serving individuals and businesses. Genmo differentiates itself by offering end-to-end AI-assisted creation across image animation, video production, scripting, and presentation design in one platform, with precise edit controls and an understanding of user intent that help users produce high-quality visuals faster.

Company Size

11-50

Company Stage

Early VC

Total Funding

$30M

Headquarters

San Francisco, California

Founded

2022

Simplify's Take

What believers are saying

  • Enterprise video generation demand expanding beyond creators into marketing, training, and communications.
  • Subscription tier expansion potential with only 40,000 users against $58.4M total funding.
  • Multi-investor syndicate including NEA, Gold House Ventures, WndrCo signals strong market validation.

What critics are saying

  • Open-source Apache 2.0 license enables Runway and Kling to integrate and improve Mochi 1 directly.
  • October 2024 server crashes hours after launch exposed inadequate compute scaling and infrastructure.
  • Meta's Llama integration of video capabilities absorbs open-source improvements into free ecosystem-scale model.

What makes Genmo unique

  • Mochi 1 achieves cinematic quality with minimal human intervention from prompt to output.
  • Open-source Apache 2.0 model enables community contributions and third-party integrations at scale.
  • Efficient resource utilization delivers comparable results to competitors with significantly fewer resources deployed.

Benefits

Relocation Assistance

Company News

Upward Dynamism
Nov 1st, 2024
October 2024: 6 Innovative AI Trends You Shouldn't Miss

Video AI startup Genmo has just launched Mochi 1, an open-source model (Apache 2.0) designed to compete with industry leaders like Runway and Kling.

ClubNation
Oct 23rd, 2024
Genmo launches Mochi 1 Text-to-Video Generation Model, But Server Crashes Within Hours

On October 22, Genmo, the AI-based video generation platform, released Mochi 1, a new state-of-the-art open-source text-to-video generation model.

Analytics India Magazine
Oct 23rd, 2024
Genmo launches Mochi 1 Text-to-Video Generation Model, But Server Crashes Within Hours

Genmo launches Mochi 1 text-to-video generation model, but server crashes within hours.

VentureBeat
Oct 22nd, 2024
AI video startup Genmo launches Mochi 1, an open source rival to Runway, Kling, and others

In tandem with the Mochi 1 preview, Genmo also announced it has raised a $28.4 million Series A funding round, led by NEA, with additional participation from The House Fund, Gold House Ventures, WndrCo, Eastlink Capital Partners, and Essence VC.

Finextra Research
Nov 26th, 2023
Beyond Imagination: The Rise And Evolution Of Generative AI Tools

Generative AI has revolutionized the way we create and interact with digital content. Since the launch of Dall-E in July 2022 and ChatGPT in November 2022, the field has seen unprecedented growth. This technology, initially popularized by OpenAI’s ChatGPT, has now been embraced by major tech players like Microsoft and Google, as well as a plethora of innovative startups. These advancements offer solutions for generating a diverse range of outputs including text, images, video, audio, and other media from simple prompts.

The consumer now has a vast array of options based on their specific output needs and use cases. From generic, large-scale, multi-modal models like OpenAI’s ChatGPT and Google’s Bard to specialized solutions tailored for specific use cases and sectors like finance and legal advice, the choices are vast and varied. For instance, in the financial sector, tools like BloombergGPT (https://www.bloomberg.com/), FinGPT (https://fin-gpt.org/), StockGPT (https://www.askstockgpt.com/) or BeeBee.AI (https://www.beebee.ai/), or in the domain of legal advice, tools like Law Chat GPT (https://lawchatgpt.com/) or LegalFly (https://www.legalfly.ai/), offer niche solutions with heightened accuracy.

To give an idea of what is available on the market, here is a short overview of some notable solutions:

General Chatting and Writing Assistance:
  • Bard (Google): https://bardai.io/
  • ChatFlash (Neuroflash AI): https://neuroflash.com/chatflash/
  • ChatGPT (OpenAI): https://chat.openai.com/chat
  • Claude (Anthropic): https://claude.ai/
  • Copy AI: https://www.copy.ai
  • Easy Peasy AI Chat: https://easy-peasy.ai
  • Fireflies: https://www.unite.ai/goto/fireflies
  • GrowthBar: https://growthbarseo.com
  • Jasper AI: https://www.jasper.ai
  • Llama 2 (Meta): https://ai.meta.com/llama/
  • Notion AI: https://www.notion.so/
  • Perplexity: https://www.perplexity.ai
  • Quillbot: https://quillbot.com
  • Rytr Chat: https://rytr.me
  • Rytr: https://rytr.me/
  • Wordtune: https://www.wordtune.com/
  • Writesonic: https://writesonic.com

Art and Picture generation: this can be generating new pictures from a prompt, but also adapting existing pictures (like enhancing, removing parts of a picture, inserting objects into a picture…):
  • Adobe Firefly: https://www.adobe.com/sensei/generative-ai/firefly.htm
  • Artbreeder: https://www.artbreeder.com/
  • Auto Draw: https://autodraw.com
  • BRIA: https://bria.ai/
  • Canva: https://www.canva.com/ai-image-generator/
  • Craiyon: https://www.craiyon.com/
  • Dall-E 2 (OpenAI): https://openai.com/product/dall-e-2
  • Deep Dream Generator: https://deepdreamgenerator.com/
  • DeepAI: https://deepai.org/machine-learning-model/text2img
  • Deepswap: https://www.deepswap.ai
  • Flair AI: https://flairai.io
  • Fotor: https://www.fotor.com/features/ai-image-generator/
  • Images.ai (unite.ai): https://images.ai/
  • Jasper Art: https://www.jasper.ai/art
  • Leap AI: https://www.tryleap.ai/
  • Lensa AI: https://prisma-ai.com/lensa
  • MidJourney: https://www.midjourney.com/home/
  • Neural Doodle: https://neuraldoodle.com/
  • NightCafe: https://creator.nightcafe.studio
  • PhotoSonic (Writesonic): https://writesonic.com/photosonic-ai-art-generator
  • Pictory: https://pictory.io
  • Pikazo: https://www.pikazoapp.com/
  • Pixray: https://replicate.com/pixray/text2image
  • Prodia: https://app.prodia.com/
  • Remini: https://remini.ai/
  • RocketAI: https://www.rocketai.io/
  • Simplified: https://simplified.com/
  • Stable Diffusion (DreamStudio): https://stablediffusionweb.com/
  • Stablecog: https://stablecog.com/

Video generation: this can be based on "prompts" resulting in video, taking a picture and identifying which part should be moving, or providing a set of pictures and asking the solution to interpolate the movement between 2 consecutive pictures (thus transforming a set of pictures into a movie)