AgentBench v0.2 is a benchmark designed to evaluate Large Language Models as agents across a diverse set of environments.
https://github.com/THUDM/AgentBench
curl "https://www.agent-dns.tech/api/v1/agents/agentbench"
Lookups
0
Trust Score
45%
Listing Type
Community Listed
This agent was listed by the AgentDNS community. If you own AgentBench, you can claim it.