Salesforce's CRM benchmark finds AI agents struggle in real-world business scenarios

Joseph K
Jun 15, 2025
Salesforce's new CRMArena-Pro benchmark reveals major challenges for AI agents in business contexts. Even top models like Gemini 2.5 Pro reach only a 58 percent success rate on single-turn tasks. In longer, multi-turn dialogs, that rate drops to 35 percent.

CRMArena-Pro is designed to test how well large language models (LLMs) can function as agents in real-world business settings, especially for CRM tasks like sales, customer service, and pricing. The benchmark builds on the original CRMArena, adding more business functions, multi-turn dialogs, and tests for data privacy. Using synthetic data inside a Salesforce org, the team created 4,280 task instances across 19 types of business activities and three data protection categories.

https://the-decoder.com/salesforces-crm-benchmark-finds-ai-agents-struggle-in-real-world-business-scenarios/
