Full-Time
AI-powered platform autonomously resolves production incidents
No salary listed
H1B Sponsorship Available
San Francisco, CA, USA
In Person
Must be based in San Francisco; on-site presence required.
Resolve provides an AI-powered platform that functions as an automated production engineer to troubleshoot and fix software issues after they are deployed. The system works by integrating with tools like AWS and GitHub to analyze telemetry data and source code, using multiple AI agents to identify root causes and suggest fixes through natural language. Unlike traditional monitoring tools that only alert humans to problems, Resolve is designed to autonomously manage and resolve 80% of production alerts without manual intervention. The company’s goal is to reduce the time spent on manual operations by creating a system that can independently maintain software reliability.
Company Size
51-200
Company Stage
Series A
Total Funding
$160M
Headquarters
San Francisco, California
Founded
2024
Enterprise MCP adoption is outpacing security controls
February 27, 2026

AI agents now carry more access and more connections to enterprise systems than any other software in the environment. That makes them a bigger attack surface than anything security teams have had to govern before, and the industry doesn't yet have a framework for it.

"If that attack vector gets utilized, it can result in a data breach, or even worse," said Spiros Xanthos, founder and CEO of Resolve AI, speaking at a recent VentureBeat AI Impact Series event.

Traditional security frameworks are built around human interactions. There is not yet an agreed-upon construct for AI agents that have personas and can work autonomously, noted Jon Aniano, SVP of product and CRM applications at Zendesk, at the same event. Agentic AI is moving faster than enterprises can build guardrails, and Model Context Protocol (MCP), while it reduces integration complexity, is making the problem worse.

"Right now it's an unsolved problem because it's the wild, wild West," Aniano said. "We don't even have a defined technical agent-to-agent protocol that all companies agree on. How do you balance user expectations versus what keeps your platform safe?"

MCP still "extremely permissive"

Enterprises are increasingly hooking into MCP servers because they simplify integration between agents, tools and data. However, MCP servers tend to be "extremely permissive," he said. They are "actually probably worse than an API," he contended, because APIs at least impose more controls on the agents that call them.

Today's agents act on behalf of humans based on explicit permissions, which preserves human accountability. "But you might have tens, hundreds of agents in the future with their own identity, their own access," said Xanthos. "It becomes a very complex matrix."
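The "complex matrix" of agent identities and access that Xanthos describes can be sketched as a default-deny, per-agent tool allowlist. This is an illustrative assumption, not any vendor's implementation; all names (AgentPolicy, PolicyRegistry, check_access) are hypothetical.

```python
# Hypothetical sketch: a per-agent tool allowlist, i.e. the matrix of
# agent identities vs. access discussed above. Default-deny throughout.
from dataclasses import dataclass, field

@dataclass
class AgentPolicy:
    agent_id: str
    allowed_tools: set = field(default_factory=set)  # tools this agent may call
    read_only: bool = True                           # deny mutating actions by default

class PolicyRegistry:
    def __init__(self):
        self._policies = {}

    def register(self, policy: AgentPolicy):
        self._policies[policy.agent_id] = policy

    def check_access(self, agent_id: str, tool: str, mutating: bool = False) -> bool:
        # Unknown agent, unlisted tool, or a mutating call against a
        # read-only policy: all refused.
        policy = self._policies.get(agent_id)
        if policy is None or tool not in policy.allowed_tools:
            return False
        if mutating and policy.read_only:
            return False
        return True

registry = PolicyRegistry()
registry.register(AgentPolicy("sre-agent-1", {"query_logs", "read_metrics"}))

print(registry.check_access("sre-agent-1", "query_logs"))            # True
print(registry.check_access("sre-agent-1", "restart_service", True)) # False
```

The point of the sketch is the default-deny posture: an MCP server that exposes every discovered tool to every agent inverts this, which is what makes it "worse than an API."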
Even as his startup develops autonomous AI agents for site reliability engineering (SRE) and system management, he acknowledged that the industry "completely lacks the framework" for autonomous agents. "It's completely on us and to anybody who builds agents to figure out what restrictions to give them," he said. And customers must be able to trust those decisions.

Some existing security tools do offer fine-grained access - Splunk, for instance, developed a method to provide access to certain indexes in underlying data stores, he noted - but most are broader and human-oriented. "We're trying to figure this out with existing tools," he said. "But I don't think they're sufficient for the era of agents."

Who's accountable when an AI mis-authenticates a user?

At Zendesk and other customer relationship management (CRM) platform providers, AI is involved in a number of user interactions, Aniano noted - in fact, it is now at a "volume and a scale that we haven't contemplated as businesses and as a society." Things get tricky when AI is helping out human agents; the audit trail can become a labyrinth. "So now you've got a human talking to a human that's talking to an AI," Aniano noted. "The human tells the AI to take action. Who's at fault if it's the wrong action?" This becomes even more complicated when there are "multiple pieces of AI and multiple humans" in the mix.

To prevent agents from going off the rails, Zendesk tends to be "very strict" about access and scope; however, customers can define their own guardrails based on their needs. In most cases, AI can access knowledge sources, but it is not writing code or running commands on servers, Aniano said. If an AI does call an API, the call is "declaratively designed" and sanctioned, and its actions are specifically called out. However, customer demand is flooding these scenarios and "we're kind of holding the gates right now," he said. The industry must develop concrete standards for agent interactions.
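The pattern Aniano describes - only declaratively sanctioned actions may run, and every call carries the human-to-AI chain so the audit trail answers "who told the AI to act?" - can be sketched in a few lines. This is a minimal illustration of the idea, not Zendesk's code; all names (SANCTIONED_ACTIONS, invoke, audit_log) are invented.

```python
# Hypothetical sketch of declaratively sanctioned actions plus an
# accountability trail. Only actions registered up front may execute.
SANCTIONED_ACTIONS = {
    "search_knowledge_base": {"mutating": False},
    "draft_reply":           {"mutating": False},
}

audit_log = []  # each entry records the human who asked and the agent that acted

def invoke(action: str, requested_by: str, agent: str):
    spec = SANCTIONED_ACTIONS.get(action)
    if spec is None:
        # Anything outside the declared set is refused outright.
        raise PermissionError(f"{action!r} is not a sanctioned action")
    # Record the chain of accountability before anything executes.
    audit_log.append({"action": action, "human": requested_by, "agent": agent})
    return f"{agent} ran {action} for {requested_by}"

invoke("search_knowledge_base", requested_by="support@example.com", agent="helper-ai")
# invoke("run_shell_command", ...) would raise PermissionError.
```

The audit entry answers Aniano's "who's at fault?" question structurally: every action is traceable to both the agent that performed it and the human who requested it.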
"We're entering a world where, with things like MCP that can auto-discover tools, we're going to have to create new methods of safety for deciding what tools these bots can interact with," said Aniano. When it comes to security, enterprises are rightly concerned when AI takes over authentication tasks, such as sending out and processing one-time passwords (OTP), SMS codes, or other two-step verification methods, he said. What happens if an AI mis-authenticates or misidentifies someone? This can lead to sensitive data leakage or open the door for attackers. "There's a spectrum now, and the end of that spectrum today is a human," Aniano said. However, "the end of that spectrum tomorrow might be a specialized agent designed to do the same kind of gut feeling or human-level interaction." Customers themselves are on a spectrum of adoption and comfort. In certain companies - particularly financial services or other highly-regulated environments - humans still must be involved in authentication, Aniano noted. In other cases, legacy companies or old guards only trust humans to authenticate other humans. He noted that Zendesk is experimenting with new AI agents that are "a little more connected to systems," and working with a select group of customers around guardrailing. Standing authorization is coming. In some future, agents may actually be more trusted than humans to do some tasks, and granted permissions "way beyond" what humans have today, Xanthos said. But we're a long way from that, and, for the most part, the fear of something going wrong is what's holding enterprises back. "Which is a good fear, right? I'm not saying that it is a bad thing," he said. Many enterprises simply aren't yet comfortable with an agent doing all steps of a workflow or fully closing the loop by itself. They still want human review. 
Resolve AI is on the cusp of giving agents standing authorization in a few cases that are "generally safe," such as coding; from there it will move to more open-ended scenarios that are not especially risky, Xanthos explained. But he acknowledged that there will always be very risky situations where AI mistakes could "mutate the state of the production system," as he put it. Ultimately, though: "There's no going back, obviously; this is moving faster than maybe even mobile did. So the question is what do we do about it?"

What security teams can do now

Both speakers pointed to interim measures available within existing tooling. Xanthos noted that some tools - Splunk among them - already offer fine-grained, index-level access controls that can be applied to agents. Aniano described Zendesk's approach as a practical starting point: declaratively designed API calls with explicitly sanctioned actions, strict access and scope limits, and human review before expanding agent permissions. The underlying principle, as Aniano put it: "We're always checking those gates and seeing how we can widen the aperture" - meaning don't grant standing authorization until you've validated each expansion.
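One way to read "widen the aperture" is as a staged promotion: an agent starts with human review on every action in a risk category and earns standing authorization only after a run of validated approvals, losing it again on any rejection. The threshold and category names below are assumptions for illustration, not Resolve's or Zendesk's actual policy.

```python
# Minimal sketch of staged standing authorization ("widening the
# aperture"). Categories start human-gated; a streak of approved,
# human-reviewed runs promotes a category to standing authorization,
# and a single rejection revokes it. Threshold is an assumption.
PROMOTION_THRESHOLD = 20  # consecutive validated runs before auto-approval

class AuthorizationGate:
    def __init__(self):
        self._validated = {}    # category -> consecutive approved runs
        self._standing = set()  # categories with standing authorization

    def needs_human_review(self, category: str) -> bool:
        return category not in self._standing

    def record_review(self, category: str, approved: bool):
        if approved:
            count = self._validated.get(category, 0) + 1
            self._validated[category] = count
            if count >= PROMOTION_THRESHOLD:
                self._standing.add(category)  # aperture widened
        else:
            # One rejected action resets the streak and revokes trust.
            self._validated[category] = 0
            self._standing.discard(category)

gate = AuthorizationGate()
for _ in range(PROMOTION_THRESHOLD):
    gate.record_review("code_change", approved=True)
print(gate.needs_human_review("code_change"))       # False: standing auth granted
print(gate.needs_human_review("mutate_production")) # True: still human-gated
```

Keeping promotion per-category matches the article's distinction between "generally safe" work like coding and actions that could mutate production state.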
Resolve AI confirms $125M raise, unicorn valuation

Resolve AI, a startup automating the work of site reliability engineering (SRE) - that is, troubleshooting system failures - has announced a $125 million Series A at a $1 billion valuation. The round was led by Lightspeed Venture Partners, with participation from existing investors including Greylock Partners, Unusual Ventures, Artisanal Ventures, and A*.

The announcement confirms TechCrunch's December report that the startup was raising at a billion-dollar valuation led by Lightspeed. Sources told TechCrunch at the time that the round may have consisted of multiple tranches at different prices, which could have put the company's actual blended valuation below $1 billion. A spokesperson for Resolve denied that there were multiple tranches in the round, saying that 100% of the equity was purchased at a valuation of $1 billion. As TechCrunch previously reported, a multi-tranche structure allows certain investors, often the lead, to purchase a significant portion of equity at a lower price.

Resolve was co-founded in early 2024 by two former Splunk executives, Spiros Xanthos and Mayank Agarwal. Their previous startup, Omnition, was acquired by Splunk in 2019. Another startup applying AI to identify and resolve system outages is the Sequoia-backed Traversal. The emerging category is known as AI SRE.
The two-year-old startup confirms that it closed a Series A led by Lightspeed at a $1 billion valuation.
Resolve AI raises $125M in Series A funding at $1B valuation

Resolve AI, a San Francisco, CA-based SRE and engineering startup, raised $125M in Series A funding at a $1B valuation. The round was led by Lightspeed Venture Partners, with existing investors Greylock Partners, Unusual Ventures, Artisanal Ventures, and A*. The company intends to use the funds to accelerate product development, expand the engineering and go-to-market teams, and support growing enterprise adoption.

Founded by observability pioneers Spiros Xanthos and Mayank Agarwal (co-creators of OpenTelemetry; prior exits to Splunk and VMware), Resolve AI focuses on autonomous Site Reliability Engineering (SRE), acting as an "AI Production Engineer" to help software teams manage complex cloud environments. It autonomously investigates production incidents, identifies root causes, and suggests (or executes) remediations. Resolve AI builds a dynamic "knowledge graph" of a company's infrastructure (AWS, Kubernetes, etc.) and uses agentic AI to troubleshoot issues in minutes instead of hours. Customers include Coinbase, DoorDash, Salesforce, Zscaler, MongoDB, and MSCI.
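A toy illustration of why an infrastructure "knowledge graph" helps with root-cause analysis: once service dependencies are modeled as edges, an agent can walk from a failing service down its dependency chain to the deepest unhealthy component. The services, health states, and traversal below are invented for illustration and say nothing about Resolve's actual implementation.

```python
# Toy dependency graph and root-cause walk (illustrative only).
# Heuristic: an unhealthy dependency is a better root-cause suspect
# than the service that merely sits above it in the graph.
DEPENDS_ON = {
    "checkout-api": ["payments-svc", "inventory-svc"],
    "payments-svc": ["postgres"],
    "inventory-svc": [],
    "postgres": [],
}
UNHEALTHY = {"checkout-api", "payments-svc", "postgres"}

def root_cause(service: str) -> str:
    # Recurse into the first unhealthy dependency; stop at a component
    # whose dependencies are all healthy.
    for dep in DEPENDS_ON.get(service, []):
        if dep in UNHEALTHY:
            return root_cause(dep)
    return service

print(root_cause("checkout-api"))  # postgres
```

Here an alert on checkout-api resolves to postgres, since the failures above it are downstream symptoms; real systems would layer telemetry, change history, and confidence scoring on top of this skeleton.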
Lightspeed said to lead Series A of US AI startup at $1B valuation

Resolve AI, a startup developing an autonomous site reliability engineering (SRE) tool, saw some equity in its Series A round, led by Lightspeed Venture Partners, sold at a US$1 billion valuation, according to three people familiar with the deal. The rest of the round was acquired at a lower price.

Founded less than two years ago by former Splunk executive Spiros Xanthos and former Splunk chief architect Mayank Agarwal, the company automates the process of identifying and resolving software system issues. Resolve AI's annual recurring revenue is about US$4 million, according to two people with knowledge of the matter. The startup previously raised US$35 million in seed funding in October 2024 from Greylock and others.

Food for thought

$1B headline hides split-tier pricing and cautious bets on AI site reliability engineering (SRE).

* Resolve AI's Series A used split pricing: some shares cleared at the $1B headline, others sold cheaper. That signals investors playing it safe on AI SRE adoption (software that automates reliability and incident-response tasks) despite the hype.
* With about $4M in annual recurring revenue (ARR), the implied 250x multiple looks extreme even for AI infrastructure. The $1B figure likely priced only a slice of the round, not all new capital.
* The two-tier setup echoes late-stage private deals: headline numbers help recruit and win customers, while the real dilution and cash follow tighter terms, which gives institutions downside protection.

Observability vendors feel integration pressure as enterprises pilot auto-remediation in 2025.

* Resolve AI and Traversal raised over $80M for AI SRE. Many enterprises will trial auto-remediation this year - software that diagnoses and fixes production incidents. That creates demand for guardrails and integration layers that curb cascading failures.
* Third-party developers can build incident-response automations in the Datadog Marketplace (an app store for Datadog, a cloud monitoring platform). It already lists workflow integrations such as Blink and InsightFinder that tie observability data to remediation.
* IT consultancies can focus on financial services and healthcare, where compliance requires human-in-the-loop controls (mandatory human approvals for automated changes). Pitch integration work that links AI SRE tools with existing incident management frameworks before auto-remediation becomes standard.
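The split-tier arithmetic discussed above can be made concrete. Only the $1B headline valuation and the roughly $4M ARR come from the reporting; the tranche sizes and lower tier price below are invented purely to show how a blended valuation lands under the headline number.

```python
# Worked arithmetic for the split-tier round structure (illustrative
# tranche figures; only $1B and ~$4M ARR come from the article).
headline_valuation = 1_000_000_000
arr = 4_000_000
print(headline_valuation / arr)  # 250.0 -> the "250x multiple"

# Hypothetically: $50M of a $125M round priced at a $1B post-money
# valuation, and the remaining $75M priced at $600M. The blended
# valuation is total capital divided by total ownership sold.
ownership = 50e6 / 1e9 + 75e6 / 600e6   # 0.05 + 0.125 = 0.175
blended = 125e6 / ownership             # well below the $1B headline
print(round(blended / 1e6))             # ~714 (in $M)
```

Under these made-up tranche terms, the blended valuation is roughly $714M, which is the mechanism the sources describe: a slice at the headline price, the bulk cheaper.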