Full-Time
Platform for improving machine learning models
$26 - $28/hr
Junior
Albany, NY, USA
Galileo offers a platform that helps machine learning teams enhance their models and lower annotation costs by using data-centric algorithms for Natural Language Processing. It allows teams to quickly identify and fix data issues that affect model performance and provides a collaborative space to manage models from raw data to production. Unlike competitors, Galileo integrates easily with existing tools and focuses on actionability, security, and privacy, while also streamlining the data labeling process. The company's goal is to equip machine learning teams with efficient tools to improve their models, charging a subscription fee for its services.
Company Size
51-200
Company Stage
Series B
Total Funding
$68.1M
Headquarters
San Francisco, California
Founded
2021
Health Insurance
Dental Insurance
Vision Insurance
Disability Insurance
Parental Leave
Flexible Work Hours
401(k) Retirement Plan
401(k) Company Match
AI agents have a safety and reliability problem. Agents would allow enterprises to automate more steps in their workflows, but they can take unintended actions while executing a task, are not very flexible, and are difficult to control. Organizations have already sounded the alarm about unreliable agents, worried that once deployed, agents might forget to follow instructions. OpenAI even admitted that ensuring agent reliability would involve working with outside developers, so it opened up its Agents SDK to help solve this issue. But researchers from Singapore Management University (SMU) have developed a new approach to solving agent reliability: AgentSpec, a domain-specific framework that lets users "define structured rules that incorporate triggers, predicates and enforcement mechanisms." The researchers said AgentSpec will make agents work only within the parameters that users want.

Guiding LLM-based agents with a new approach

AgentSpec is not a new LLM but rather an approach to guiding LLM-based AI agents. The researchers believe AgentSpec can be used not only for agents in enterprise settings but also for self-driving applications. The first AgentSpec tests were integrated with the LangChain framework, but the researchers said they designed it to be framework-agnostic, meaning it can also run in ecosystems such as AutoGen and Apollo.
Experiments showed AgentSpec prevented "over 90% of unsafe code executions, ensures full compliance in autonomous driving law-violation scenarios, eliminates hazardous actions in embodied agent tasks, and operates with millisecond-level overhead." LLM-generated AgentSpec rules, produced with OpenAI's o1, also performed strongly, blocking 87% of risky code executions and preventing "law-breaking in 5 out of 8 scenarios."

Current methods are a little lacking

AgentSpec is not the only method for bringing more control and reliability to agents.
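The trigger/predicate/enforcement structure the researchers describe can be sketched as a simple rule engine. This is a minimal illustration of the idea, not AgentSpec's actual API; the `Rule` class, `apply_rules` function, and the shell-command example are all hypothetical names invented here.

```python
# Hypothetical sketch of an AgentSpec-style rule: a trigger (which event it
# watches), a predicate (a condition on the event), and an enforcement action.
# None of these names come from the real AgentSpec framework.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    trigger: str                        # event type that activates the rule
    predicate: Callable[[dict], bool]   # condition checked on the event payload
    enforce: Callable[[dict], dict]     # action applied when the predicate holds

def apply_rules(event: str, payload: dict, rules: list[Rule]) -> dict:
    """Run every rule whose trigger and predicate match; enforcement may
    block or rewrite the agent's intended action."""
    for rule in rules:
        if rule.trigger == event and rule.predicate(payload):
            payload = rule.enforce(payload)
    return payload

# Example rule: block shell commands that delete files.
block_rm = Rule(
    trigger="tool_call",
    predicate=lambda p: p.get("tool") == "shell" and "rm " in p.get("command", ""),
    enforce=lambda p: {**p, "blocked": True},
)

result = apply_rules(
    "tool_call",
    {"tool": "shell", "command": "rm -rf /tmp/scratch"},
    [block_rm],
)
```

The point of the design is that rules are declarative and checked outside the LLM, so enforcement does not depend on the model remembering its instructions.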
Platform Powers End-to-End Continuous Improvement of Agentic Applications

SAN FRANCISCO, March 18, 2025 /PRNewswire/ -- Galileo, the AI Evaluation company, today announced an integration with NVIDIA NeMo™, enabling customers to continuously improve their custom generative AI models. Customers can now evaluate models comprehensively across the development lifecycle, curating feedback into datasets that power additional fine-tuning. As a result, customers ship GenAI apps that are more reliable, trusted, and cost-effective.

Data Flywheel for AI

The majority of enterprises are developing GenAI applications, including agents and RAG-based chatbots, but shipping and scaling these applications is challenging because of the non-deterministic outputs of Large Language Models (LLMs). There is even more complexity when AI teams want to test new LLMs, which are constantly evolving in capability and price. The solution is to build an AI data flywheel: continuous testing and refinement that collects data about user interactions for subsequent improvement. When AI teams use data to improve outcomes, whether by fine-tuning, prompt engineering, or in-context learning, they gain a competitive advantage. Galileo and NVIDIA accelerate the data flywheel by collecting and curating better data about an AI application's interactions.
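The evaluate-curate-fine-tune loop described above can be sketched as one flywheel iteration. This is an illustrative toy, assuming placeholder functions for each stage; it uses no Galileo or NVIDIA NeMo APIs, and all names are invented for the example.

```python
# Minimal sketch of one turn of an AI data flywheel: score interaction traces,
# keep the high-signal ones, and feed them back into the model. The "model"
# here is a toy dict, and evaluate/curate/fine_tune are stand-in functions.
def flywheel_iteration(model, interactions, evaluate, curate, fine_tune):
    """Evaluate each trace, curate a dataset from the results, improve the model."""
    scores = [evaluate(model, x) for x in interactions]   # evaluation stage
    dataset = curate(interactions, scores)                # curation stage
    return fine_tune(model, dataset)                      # improvement stage

# Toy stages: the model "knows" a set of queries; evaluation flags the ones it
# fails on, curation keeps those failures, and fine-tuning adds them back.
model = {"known": set()}
interactions = ["refund policy?", "reset password?"]

def evaluate(m, x):
    return 1.0 if x in m["known"] else 0.0   # 1.0 = handled, 0.0 = failed

def curate(xs, scores):
    return [x for x, s in zip(xs, scores) if s < 0.5]  # keep failing traces

def fine_tune(m, data):
    m["known"].update(data)
    return m

model = flywheel_iteration(model, interactions, evaluate, curate, fine_tune)
```

In a real deployment each stage would be far heavier (LLM-based evaluation, human review in curation, actual parameter updates in fine-tuning), but the loop shape is the same: production data flows back into model improvement on every turn.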
One goal for an agentic future is for AI agents from different organizations to talk to one another freely and seamlessly. Getting to that point requires interoperability, yet these agents may have been built with different LLMs, data frameworks, and code. To achieve interoperability, the developers of these agents must agree on how they communicate with each other, which is a challenging task. A group of companies, including Cisco, LangChain, LlamaIndex, Galileo, and Glean, has now created AGNTCY, an open-source collective with the goal of creating an industry-standard agent interoperability language. AGNTCY aims to make it easy for any AI agent to communicate and exchange data with any other.

Uniting AI Agents

"Just like when the cloud and the internet came about and accelerated applications and all social interactions at a global scale, we want to build the Internet of Agents that accelerate all of human work at a global scale," said Vijoy Pandey, head of Outshift by Cisco, Cisco's incubation arm, in an interview with VentureBeat. Pandey likened AGNTCY to the advent of the Transmission Control Protocol/Internet Protocol (TCP/IP) and the Domain Name System (DNS), which helped organize the internet and allowed different computer systems to interconnect. "The way we are thinking about this problem is that the original internet allowed for humans and servers and web farms to all come together," he said.
When deploying generative artificial intelligence (AI), one of the most fundamental decisions businesses face is whether to choose open-source or proprietary AI models, or a hybrid of the two. "This basic choice between the open source ecosystem and a proprietary setting impacts countless business and technical decisions, making it 'the AI developer's dilemma,'" according to an Intel Labs blog post. The choice is critical because it affects a company's AI development, accessibility, security, and innovation. Businesses must navigate these options carefully to maximize benefits while mitigating risks.
Galileo, a San Francisco-based startup, is betting that the future of artificial intelligence depends on trust. Today the company launched a new product, Agentic Evaluations, to address a growing challenge in AI: making sure the increasingly complex systems known as AI agents actually work as intended. AI agents, autonomous systems that perform multi-step tasks like generating reports or analyzing customer data, are gaining traction across industries. But their rapid adoption raises a crucial question: how can companies verify that these systems remain reliable after deployment? Galileo's CEO, Vikram Chatterji, believes his company has found the answer. "Over the last six to eight months, we started to see some of our customers trying to adopt agentic systems," Chatterji said in an interview. "Now LLMs can be used as a smart router to pick and choose the right API calls towards actually completing a task."