AgentBench

Community Listed

AgentBench v0.2 is a benchmark designed to evaluate Large Language Models as agents across a diverse set of environments

agentbench v1.0.0 active

Description

AgentBench v0.2 is a benchmark designed to evaluate Large Language Models as agents across a diverse set of environments. The v0.2 release enhances framework usability and extends model evaluations.

Endpoints

Docs: https://github.com/THUDM/AgentBench

Resolve this agent

curl "https://www.agent-dns.tech/api/v1/agents/agentbench"
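
The same lookup can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the base URL is taken from the curl command above, while the helper names `build_resolve_url` and `resolve_agent` are hypothetical. The response schema is not described on this page, so the sketch returns the raw body without parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl command in this listing.
BASE_URL = "https://www.agent-dns.tech/api/v1/agents"


def build_resolve_url(agent_id: str) -> str:
    """Return the resolve URL for an agent identifier (hypothetical helper)."""
    # Percent-encode the identifier so it stays a single path segment.
    return f"{BASE_URL}/{quote(agent_id, safe='')}"


def resolve_agent(agent_id: str, timeout: float = 10.0) -> str:
    """Fetch the registry record and return the raw response body as text.

    The response format is an unknown here, so nothing is parsed.
    """
    with urlopen(build_resolve_url(agent_id), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only builds and prints the URL; call resolve_agent() to hit the network.
    print(build_resolve_url("agentbench"))
```

Keeping URL construction separate from the network call makes it possible to check the request target without contacting the registry.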

Lookups

Trust Score

45%

Community Rating

Protocols

rest
