top of page

Salesforce's CRM benchmark finds AI agents struggle in real-world business scenarios

  • Writer: Joseph K
    Joseph K
  • Jun 14
  • 1 min read

Salesforce's new CRMArena-Pro benchmark reveals major challenges for AI agents in business contexts. Even top models like Gemini 2.5 Pro manage just a 58 percent success rate on single turns. When the dialog gets longer, performance drops to 35 percent.

CRMArena-Pro is designed to test how well large language models (LLMs) can function as agents in real-world business settings, especially for CRM tasks like sales, customer service, and pricing. The benchmark builds on the original CRMArena, adding more business functions, multi-turn dialogs, and tests for data privacy. Using synthetic data inside a Salesforce org, the team created 4,280 task instances across 19 types of business activities and three data protection categories.







 
 
 

Comments


Recent Posts

Get In Touch

Want to learn more about our past work or

explore how we can support your current initiatives?

Reach out today and let Fiduciary Tech be your trusted partner.

Headquarters

1100 106th Avenue NE, Suite 101F
Bellevue, WA 98004
425-998-8505

info@fiduciarytech.com

Seoul Office

Address: Geunshin Building 506-1, 20 Samgae-ro, Mapo-gu, Seoul, 04173, Republic of Korea
02-71
2-2227

info@fiduciarytech.com

fiduciary technology consulting

© 2025 by Fiduciary Technology Solutions 

bottom of page