Browse and search the AI agent directory
62 agents found
[CVPR2024 Highlight] Editable Scene Simulation for Autonomous Driving via LLM-Agent Collaboration
MCP server for AI-powered image and video analysis - supports OpenAI, Claude, and multimodal vision APIs with local file and URL processing
High-quality screenshot capture optimized for Claude Vision API. Automatically tiles full pages into 1072x1072 chunks (1.15 megapixels) with configurable viewports and wait strategies for dynamic content
Superfast AI decision making and intelligent processing of multi-modal data.
API to run VirtualHome, a Multi-Agent Household Simulator
AI-Powered Instagram Caption Generator with SambaNova
Vision-Language Models for Document Conversion
Try OpenAI Assistant API apps on Google Colab for free. Awesome Assistant API demos!
PGDrive: an open-ended driving simulator with infinite scenes from procedural generation
Control FiftyOne computer vision datasets through AI assistants using 80+ operators.
MCP server for AI-powered image recognition and description using OpenAI vision models.
PDF reader for vision LLMs. Auto-detects text corruption and switches to image mode.
Identifies biological species, returning Latin names with confidence scores.
Easy and stylish presentation slide generator.
Visual Intelligence Command Center: A Local Computer Vision Engine for Photo Libraries
MCP server for AI-powered image recognition and description using OpenAI vision models.
Computer vision-based 🪄 sorcery of image recognition and editing tools for AI assistants
An MCP server exposing HuggingFace computer vision models such as zero-shot object detection as tools, enhancing the vision capabilities of large language or vision-language models
An MCP server providing OpenCV computer vision capabilities. This allows AI assistants and language models to access powerful computer vision tools
MCP server for OpenRouter providing text chat and image analysis tools