Full-Time

Technical Account Manager

Together AI

201-500 employees

Decentralized cloud services for AI development

Compensation Overview

$150k - $220k/yr

+ Equity + Benefits

San Francisco, CA, USA

Hybrid

Hybrid position requiring in-office presence in San Francisco, CA.

Category
🤝 Sales & Account Management
Required Skills
Kubernetes
Python
JavaScript
Machine Learning
Docker
Ansible
Requirements
  • 5+ years of experience in a customer-facing technical role with at least 2 years in a post-sales function
  • Strong organizational skills and ability to manage dozens of customer implementations at once
  • Strong technical background, with knowledge of AI, ML, GPU technologies and their integration into high-performance computing (HPC) environments
  • Strong understanding of training, fine-tuning and inference in the context of open source LLMs
  • Proficiency in Python and JavaScript, with experience building and delivering prototypes on API platforms (see the sketch after this list)
  • Familiarity with infrastructure services (e.g., Kubernetes, SLURM), infrastructure as code solutions (e.g., Ansible), container infrastructure (Docker)
  • Strong sense of ownership and willingness to learn new skills to ensure both team and customer success
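
As a rough illustration of the prototype skills listed above, the sketch below sends a single chat request to an OpenAI-compatible completions endpoint in Python. The endpoint URL, model name, and environment variable are illustrative assumptions rather than details from this posting; the platform's API reference has the real values.

```python
import os
import requests

# Assumed endpoint and model name, shown only to illustrate the shape of an
# OpenAI-compatible chat completions request; consult the platform's API
# reference for the real values.
API_URL = "https://api.together.xyz/v1/chat/completions"
MODEL = "meta-llama/Llama-3-8b-chat-hf"


def ask(prompt: str, max_tokens: int = 256) -> str:
    """Send a single chat turn and return the model's reply text."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"},
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(ask("Summarize the difference between fine-tuning and inference."))
```

A JavaScript prototype would follow the same request shape, issuing the identical JSON payload with fetch.
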
Responsibilities
  • Act as a technical advisor to our most strategic customers, deeply embedding with them to support the ideation and development of training, fine-tuning, and inference solutions on Together
  • Build educational content and tooling for both internal and external use around Together’s solutions (e.g., playbooks, blogs, demos)
  • Manage customer state at all times, from active support tickets to current implementation success to identifying potential new technical avenues of growth
  • Build and maintain strong relationships with technical stakeholders within accounts, ensuring the successful deployment and scaling of their applications
  • Deliver high-value feedback to our Product, Engineering, and Research teams, ensuring our platform continues to evolve to meet customer needs

Together AI focuses on enhancing artificial intelligence through open-source contributions. The company offers decentralized cloud services that allow developers and researchers to train, fine-tune, and deploy generative AI models, and its platform supports a wide range of clients, from small startups to large enterprises and academic institutions. Unlike many competitors, Together AI emphasizes open and transparent AI systems, an approach intended to foster innovation and produce beneficial outcomes for society. The company's goal is to advance the field of AI while ensuring accessibility and collaboration among users.

Company Size

201-500

Company Stage

Series B

Total Funding

$533.5M

Headquarters

Menlo Park, California

Founded

2022

Simplify's Take

What believers are saying

  • Together AI secured $305M for expansion, enhancing its AI Acceleration Cloud.
  • The rise of open-source AI models aligns with Together AI's focus and strategy.
  • New methods for AI-generated code accuracy can enhance Together AI's offerings.

What critics are saying

  • Increased competition from new entrants like Deep Cogito challenges Together AI's market position.
  • Snowflake's 'AI Hub' may divert talent and resources from Together AI.
  • Rapid advancements in AI coding models require Together AI to continuously innovate.

What makes Together AI unique

  • Together AI focuses on open-source contributions, fostering transparency and innovation.
  • The company offers decentralized cloud services for AI model training and deployment.
  • Together AI's acquisition of Refuel.ai enhances data management for AI applications.

Benefits

Health Insurance

Company Equity

Growth & Insights and Company News

Headcount

6 month growth

5%

1 year growth

11%

2 year growth

9%
Together AI
May 16th, 2025
Together AI acquires Refuel.ai to unlock data for developers and businesses building production-grade AI applications

If you speak with the largest enterprises today, they will tell you about the hundreds of AI applications they would like to deploy across their organization. However, they run into a consistent challenge – their data is an unstructured mess and they lack the tooling to easily clean and transform this data – and without the right data, AI applications cannot hope to meet production-level quality.

VentureBeat
Apr 23rd, 2025
More Accurate Coding: Researchers Adapt Sequential Monte Carlo for AI-Generated Code

Coding with the help of AI models continues to gain popularity, but many have highlighted issues that arise when developers rely on coding assistants. However, researchers from MIT, McGill University, ETH Zurich, Johns Hopkins University, Yale and the Mila-Quebec Artificial Intelligence Institute have developed a new method for ensuring that AI-generated code is more accurate and useful. The method spans various programming languages and instructs the large language model (LLM) to adhere to the rules of each language.

The group found that by adapting new sampling methods, AI models can be guided to follow programming language rules, and can even lift the performance of small language models (SLMs), which are typically used for code generation, beyond that of large language models. In the paper, the researchers used Sequential Monte Carlo (SMC) to “tackle a number of challenging semantic parsing problems, guiding generation with incremental static and dynamic analysis.” Sequential Monte Carlo refers to a family of algorithms that help figure out solutions to filtering problems.

João Loula, co-lead author of the paper, said in an interview with MIT’s campus paper that the method “could improve programming assistants, AI-powered data analysis and scientific discovery tools.” It can also cut compute costs and be more efficient than reranking methods. The researchers noted that AI-generated code can be powerful, but it often disregards the semantic rules of programming languages, and other methods to prevent this can distort models or are too time-consuming. Their method makes the LLM adhere to programming language rules by discarding code outputs that are unlikely to work early in the process and allocating effort toward outputs that are most likely to be valid and accurate.

Adapting SMC to code generation

The researchers developed an architecture that brings SMC to code generation “under diverse syntactic and semantic constraints.” “Unlike many previous frameworks for constrained decoding, our algorithm can integrate constraints that cannot be incrementally evaluated over the entire token vocabulary, as well as constraints that can only be evaluated at irregular intervals during generation,” the researchers said in the paper. Key features of adapting SMC sampling to model generation include a proposal distribution in which token-by-token sampling is guided by cheap constraints, importance weights that correct for biases, and resampling that reallocates compute effort toward promising partial generations.

The researchers noted that while SMC can guide models toward more correct and useful code, the method has limitations. “While importance sampling addresses several shortcomings of local decoding, it too suffers from a major weakness: weight corrections and expensive potentials are not integrated until after a complete sequence has been generated from the proposal. This is even though critical information about whether a sequence can satisfy a constraint is often available much earlier and can be used to avoid large amounts of unnecessary computation,” they said.

Model testing

To prove their theory, Loula and his team ran experiments to see whether using SMC to engineer more accurate code works. These experiments were:
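
As an aside, the SMC recipe summarized above (a cheap token-by-token proposal, importance weights that correct for biases, and resampling toward promising partial programs) can be sketched in a few lines of Python. This is an illustrative sketch, not the researchers' implementation; propose_token, cheap_constraint_ok, and expensive_weight are placeholder callables standing in for a model's proposal distribution, an incremental syntax or semantics check, and a costlier potential evaluated at irregular intervals.

```python
import random


def smc_generate(propose_token, cheap_constraint_ok, expensive_weight,
                 num_particles=16, max_len=64):
    """Sketch of SMC decoding: propose, check cheap constraints, weight, resample."""
    particles = [[] for _ in range(num_particles)]   # partial token sequences
    weights = [1.0] * num_particles                  # importance weights

    for step in range(max_len):
        # Proposal: extend each partial program one token at a time, guided by
        # cheap incremental constraints (e.g. the language's grammar).
        for i, seq in enumerate(particles):
            seq.append(propose_token(seq))
            if not cheap_constraint_ok(seq):
                weights[i] = 0.0                     # discard bad prefixes early

        # Expensive potentials can only be evaluated at irregular intervals.
        if step % 8 == 7:
            weights = [w * expensive_weight(seq)
                       for w, seq in zip(weights, particles)]

        total = sum(weights)
        if total == 0:
            break

        # Resampling: reallocate compute toward promising partial generations.
        probs = [w / total for w in weights]
        chosen = random.choices(range(num_particles), weights=probs, k=num_particles)
        particles = [list(particles[i]) for i in chosen]
        weights = [1.0] * num_particles

    return particles
```

Particles whose prefixes violate a constraint receive zero weight and are never resampled, which is what lets this style of decoding abandon unpromising code early instead of reranking complete outputs.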

VentureBeat
Apr 10th, 2025
DeepCoder Delivers Top Coding Performance in Efficient 14B Open Model

Researchers at Together AI and Agentica have released DeepCoder-14B, a new coding model that delivers impressive performance comparable to leading proprietary models like OpenAI’s o3-mini. Built on top of DeepSeek-R1, this model gives more flexibility to integrate high-performance code generation and reasoning capabilities into real-world applications. Importantly, the teams have fully open-sourced the model, its training data, code, logs and system optimizations, which can help researchers improve their work and accelerate progress.

Competitive coding capabilities in a smaller package

The research team’s experiments show that DeepCoder-14B performs strongly across several challenging coding benchmarks, including LiveCodeBench (LCB), Codeforces and HumanEval+. “Our model demonstrates strong performance across all coding benchmarks… comparable to the performance of o3-mini (low) and o1,” the researchers write in a blog post that describes the model.

Interestingly, despite being trained primarily on coding tasks, the model shows improved mathematical reasoning, scoring 73.8% on the AIME 2024 benchmark, a 4.1% improvement over its base model (DeepSeek-R1-Distill-Qwen-14B). This suggests that the reasoning skills developed through RL on code can be generalized effectively to other domains. [Chart credit: Together AI]

The most striking aspect is achieving this level of performance with only 14 billion parameters. This makes DeepCoder significantly smaller and potentially more efficient to run than many frontier models.

Innovations driving DeepCoder’s performance

While developing the model, the researchers solved some of the key challenges in training coding models using reinforcement learning (RL). The first challenge was curating the training data
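
Since the release is described as fully open-sourced, one plausible way to try the model is to load the checkpoint with Hugging Face transformers, as in the hedged sketch below. The repository ID is an assumption made for illustration, not something stated in the article; the DeepCoder release post has the actual name.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The repository ID below is an assumption for illustration; check the
# DeepCoder release post for the actual Hugging Face repo name.
model_id = "agentica-org/DeepCoder-14B-Preview"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```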

VentureBeat
Apr 8th, 2025
New Open-Source AI Company Deep Cogito Releases First Models and They're Already Topping the Charts

Deep Cogito, a new AI research startup based in San Francisco, officially emerged from stealth today with Cogito v1, a new line of open-source large language models (LLMs) fine-tuned from Meta’s Llama 3.2 and equipped with hybrid reasoning capabilities — the ability to answer quickly and immediately, or “self-reflect” like OpenAI’s “o” series and DeepSeek R1.

The company aims to push the boundaries of AI beyond current human-overseer limitations by enabling models to iteratively refine and internalize their own improved reasoning strategies. It is ultimately on a quest toward developing superintelligence — AI smarter than all humans in all domains — yet the company says that “All models we create will be open sourced.”

Deep Cogito’s CEO and co-founder Drishan Arora — a former Senior Software Engineer at Google who says he led the large language model (LLM) modeling for Google’s generative search product — also said in a post on X that they are “the strongest open models at their scale – including those from LLaMA, DeepSeek, and Qwen.”

The initial model lineup includes five base sizes: 3 billion, 8 billion, 14 billion, 32 billion, and 70 billion parameters, available now on the AI code-sharing community Hugging Face, on Ollama, and through application programming interfaces (APIs) on Fireworks and Together AI. They are available under the Llama licensing terms, which allow for commercial usage — so third-party enterprises could put them to work in paid products — up to 700 million monthly users, at which point a paid license from Meta is required. The company plans to release even larger models — up to 671 billion parameters — in the coming months.

Arora describes the company’s training approach, iterated distillation and amplification (IDA), as a novel alternative to traditional reinforcement learning from human feedback (RLHF) or teacher-model distillation. The core idea behind IDA is to allocate more compute for a model to generate improved solutions, then distill the improved reasoning process into the model’s own parameters — effectively creating a feedback loop for capability growth. Arora likens this approach to Google AlphaGo’s self-play strategy, applied to natural language.

The Cogito models are open-source and available for download via Hugging Face and Ollama, or through APIs provided by Fireworks AI and Together AI. Each model supports both a standard mode for direct answers and a reasoning mode, where the model reflects internally before responding.

Benchmarks and evaluations

The company shared a broad set of evaluation results comparing Cogito models to open-source peers across general knowledge, mathematical reasoning, and multilingual tasks. Highlights include:
  • Cogito 3B (Standard) outperforms LLaMA 3.2 3B on MMLU by 6.7 percentage points (65.4% vs
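
The IDA description above (spend extra inference compute to produce better solutions, then distill those solutions back into the model's parameters, and repeat) reads like a training loop. The sketch below is one possible interpretation under those assumptions, with solve, score, and finetune as placeholder callables; it is not Deep Cogito's actual training code.

```python
def ida_loop(model, tasks, solve, score, finetune, rounds=3, samples=8):
    """One reading of IDA: amplify with extra inference compute, then distill."""
    for _ in range(rounds):
        amplified = []
        for task in tasks:
            # Amplification: spend more compute per task, e.g. sample several
            # long reasoning traces and keep the best-scoring answer.
            candidates = [solve(model, task, reasoning=True) for _ in range(samples)]
            amplified.append(max(candidates, key=lambda answer: score(task, answer)))

        # Distillation: fine-tune the model to produce the amplified answers
        # directly, internalizing the improved reasoning for the next round.
        model = finetune(model, list(zip(tasks, amplified)))
    return model
```

Each round raises the quality of the distillation targets using the previous round's model, which is the feedback loop for capability growth the article describes.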

Startup by Doc
Apr 2nd, 2025
Composio Secures $24 Million in Series A Funding: A Leap Forward for Agentic AI

In a significant boost to the rapidly evolving field of artificial intelligence, Composio, an emerging leader in Agentic AI, has successfully raised $24