Industries: Data & Analytics, AI & Machine Learning
Company Size: 51-200
Company Stage: Series B
Total Funding: $58.7M
Headquarters: New York City, New York
Founded: 2018
Arthur.ai offers a flexible platform for deploying and managing machine learning (ML) models and large language models (LLMs) that works with any model on a variety of platforms. Its main product is a monitoring platform that lets businesses track model performance, access real-time metrics, and receive alerts when performance thresholds are crossed. The company focuses on responsible machine learning practices and supports team collaboration through customizable permissions. Serving clients from small businesses to large enterprises, Arthur.ai operates in the machine learning operations (MLOps) market with subscription or usage-based pricing.
Total Funding: $58.7M (above industry average), funded over 3 rounds
Industry standards
Health Insurance
Dental Insurance
Vision Insurance
401(k) Retirement Plan
401(k) Company Match
Professional Development Budget
Wellness Program
Unlimited Paid Time Off
Hybrid Work Options
Generally available today, Recommender System Support vastly improves AI-driven recommender systems, resulting in elevated customer satisfaction levels and increased revenue growth for online businesses

NEW YORK, Jan. 23, 2024 /PRNewswire/ -- Arthur, an AI performance platform trusted by some of the largest organizations in the world to ensure that their AI systems are well-managed and safely deployed, today introduced a powerful addition to its suite of AI monitoring tools: Recommender System Support. This new technology is set to revolutionize the way online businesses utilize recommender systems in the digital economy, enabling them to drive customer satisfaction levels and increase revenue growth.

A vast portion of the modern internet economy is driven by AI-based recommender systems. For example, recommender systems are the engine behind the songs that play on Spotify, the movies that are suggested on Netflix, and what products are recommended on the Amazon homepage. Every advertising email delivered to an inbox, every social media post in a feed, and even which news articles are featured on a homepage are impacted by a recommender system. These systems, which analyze extensive data to predict and offer tailored product recommendations, can significantly boost customer satisfaction and revenue growth for e-commerce platforms, as well as engagement for streaming services and content providers.

A major issue that exists for companies that rely on recommender systems without a good monitoring solution in place is that these systems are prone to performance problems as well as an incredible amount of data drift
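Arthur, a New York-based artificial intelligence (AI) startup, has released Arthur Bench, an open-source tool for evaluating and comparing the performance of large language models (LLMs) such as OpenAI's GPT-3.5 Turbo and Meta's LLaMA 2. Adam Wenchel, Arthur's CEO and co-founder, said in a statement: "With Bench, we have created an open-source tool to help teams deeply understand the differences between LLM providers, between prompting and augmentation strategies, and between custom training regimes."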
Founded in 2019, Arthur has secured over $60M in funding from several firms, including Acrew, Greycroft, Index Ventures, BAM Elevate, Work-Bench, and Plexo Capital.
Arthur Bench introduces an extensive suite of evaluation features.
Arthur also unveiled The Generative Assessment Project (GAP), a research initiative tracking the strengths and weaknesses of language model offerings of OpenAI, Anthropic, Meta, and others as they evolve over time

NEW YORK, Aug. 17, 2023 /PRNewswire/ -- Arthur, an AI performance platform trusted by some of the largest organizations in the world to ensure that their AI systems are well-managed and deployed in a responsible manner, today introduced Arthur Bench, an open-source evaluation tool for comparing large language models (LLMs), prompts, and hyperparameters for generative text models. This open-source tool will enable businesses to evaluate how different LLMs will perform in real-world scenarios so they can make informed, data-driven decisions when integrating the latest AI technologies into their operations.

In conjunction with Arthur Bench, Arthur also unveiled The Generative Assessment Project (GAP), a research initiative ranking the strengths and weaknesses of language model offerings from industry leaders like OpenAI, Anthropic, and Meta. Notably, Arthur's research suggests that Anthropic may be gaining a slight competitive edge against OpenAI's GPT-4 on measures of "reliability" within specific domains. For example, while GPT-4 was the most successful when answering math questions, Anthropic's Claude-2 model was stronger at avoiding hallucinated factual mistakes and answering "I don't know" at appropriate times when answering history questions. Through GAP, Arthur will continue to share discoveries about behavior differences and best practices with the public in its journey to make LLMs work for everyone.

"As our GAP research clearly shows, understanding the differences in performance between LLMs can have an incredible amount of nuance