---
title: MCP Progressive Disclosure - Protocol Extension
emoji: 🎯
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.0.0
app_file: app.py
pinned: false
tags:
  - building-mcp-track-enterprise
  - building-mcp-track-customer
  - building-mcp-track-creative
---
# MCP Progressive Disclosure 🎯

**Track 1: Building MCP - Protocol Extension**
*MCP 1st Birthday Hackathon Submission*

📺 **Social Post & Demo Video**: https://x.com/AppRushAI/status/1995274123279536330

## 🎯 The Problem
Standard MCP servers send ALL tool descriptions to the LLM at connection time. For enterprise servers with 50-100+ tools (AWS, Jira, Kubernetes, Salesforce), this results in:
- 30,000-50,000 tokens loaded before the user even asks a question
- Wasted context window space on tools that may never be used
- Reduced space for actual conversation and reasoning
- Poor scalability as servers add more tools

**Example**: An AWS MCP server with 100 tools × 400 tokens each = 40,000 tokens of pure overhead.
## 💡 The Solution: Progressive Disclosure
We've created a protocol extension for MCP that implements lazy-loading of tool descriptions through a two-stage discovery process:
### Stage 1: Selection (Minimal Descriptions)

The server sends ultra-minimal, one-sentence descriptions via `tools/list`:

```json
{
  "name": "aws_ec2_launch_instance",
  "description": "Launches a new AWS EC2 instance with specified configuration.",
  "inputSchema": {"type": "object", "properties": {}}
}
```
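
To make the mechanism concrete, here is a minimal sketch of how a dual-mode server could build its `tools/list` payload from an in-memory registry. The names `TOOL_REGISTRY` and `list_tools`, and the example description text, are illustrative assumptions, not the actual `app.py` API.

```python
# Illustrative sketch only: a dual-mode tools/list payload builder.
# TOOL_REGISTRY and list_tools are hypothetical names, not the real app.py API.
TOOL_REGISTRY = {
    "aws_ec2_launch_instance": {
        "summary": "Launches a new AWS EC2 instance with specified configuration.",
        "full_description": (
            "Launches an EC2 instance. Parameters: ami_id, instance_type, key_name, "
            "security groups, tags... (the real entry runs to 400+ tokens with "
            "examples and error-handling notes)"
        ),
        "input_schema": {
            "type": "object",
            "properties": {"instance_type": {"type": "string"}},
            "required": ["instance_type"],
        },
    },
}

def list_tools(mode: str) -> list[dict]:
    """Build the tools/list payload; progressive mode sends one sentence per tool."""
    tools = []
    for name, spec in TOOL_REGISTRY.items():
        if mode == "progressive":
            tools.append({
                "name": name,
                "description": spec["summary"],                       # Stage 1 only
                "inputSchema": {"type": "object", "properties": {}},  # stub schema
            })
        else:  # standard mode pays the full cost up front
            tools.append({
                "name": name,
                "description": spec["full_description"],
                "inputSchema": spec["input_schema"],
            })
    return tools
```
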
### Stage 2: Usage (Full Descriptions On-Demand)

The agent fetches full descriptions only when needed via the `tool_descriptions` resource:

```
resource:///tool_descriptions?tools=aws_ec2_launch_instance
```

The response contains the complete schema, usage examples, and error-handling details, and it authorizes the tool for the session.
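
A rough sketch of what a Stage 2 resource handler could look like, assuming a simple in-memory session store; `SESSIONS`, `read_tool_descriptions`, and the registry shape are assumptions for illustration, not the server's actual code.

```python
# Sketch of a Stage 2 handler: parse the tools= query, return full descriptions,
# and record the authorization for this session. All names are illustrative.
from urllib.parse import parse_qs, urlparse

TOOL_REGISTRY: dict[str, dict] = {}   # same shape as the registry sketched above
SESSIONS: dict[str, set[str]] = {}    # session_id -> tools authorized so far

def read_tool_descriptions(session_id: str, resource_uri: str) -> dict[str, str]:
    """Handle resource:///tool_descriptions?tools=a,b and authorize those tools."""
    query = parse_qs(urlparse(resource_uri).query)
    requested = query.get("tools", [""])[0].split(",")
    authorized = SESSIONS.setdefault(session_id, set())
    descriptions = {}
    for name in requested:
        spec = TOOL_REGISTRY.get(name)
        if spec is None:
            continue                                   # unknown tool: skip
        descriptions[name] = spec["full_description"]  # schema, examples, errors
        authorized.add(name)                           # tool is now usable this session
    return descriptions
```
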
### Result: 80-90% Token Reduction

- **Standard Mode**: 40,000 tokens loaded upfront
- **Progressive Mode**: 500 tokens initial + ~400 per tool fetched = **1,700 tokens** for 3 tools used
- **Savings**: 96% reduction in typical workflows!
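
The arithmetic behind those numbers, as a back-of-the-envelope check using the estimates quoted above (not measured values):

```python
# Back-of-the-envelope check of the savings claim; all figures are the README's
# estimates (400 tokens per full description, 500-token minimal listing).
NUM_TOOLS = 100
TOKENS_PER_FULL_DESCRIPTION = 400
MINIMAL_LISTING_TOKENS = 500
TOOLS_ACTUALLY_USED = 3

standard = NUM_TOOLS * TOKENS_PER_FULL_DESCRIPTION                # 40,000 tokens
progressive = (MINIMAL_LISTING_TOKENS
               + TOOLS_ACTUALLY_USED * TOKENS_PER_FULL_DESCRIPTION)  # 1,700 tokens
savings = 1 - progressive / standard                              # ~0.96

print(f"Standard: {standard:,} | Progressive: {progressive:,} | Savings: {savings:.0%}")
```
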
## 🎬 Live Demo

This Space lets you experience the difference:

1. **Choose Mode**: Standard (load all) vs. Progressive (lazy load)
2. **Start Server**: Click "Start/Restart Server"
3. **Observe Metrics**: Watch the "Initial Token Load" difference
4. **Ask Questions**: Try queries like "What time is it?" or "Deploy a Kubernetes pod"
5. **Compare Logs**:
   - Standard: Immediate tool execution
   - Progressive: Fetches description → Executes tool

### Sample Queries (Try These!)
- "What time is it?"
- "Calculate 125 multiplied by 8"
- "Generate a random number between 1 and 100"
- "Reverse the string 'hackathon'"
- "Encode 'Hello World' to base64"

**Note**: Complex enterprise tools (AWS, Kubernetes, Jira, etc.) are simulation placeholders that demonstrate how the protocol handles verbose tool descriptions with realistic token counts.
## Protocol Documentation

The full specification is available in this repository:

- **Protocol Spec v2.0** - Complete protocol definition
- **Implementation Guide** - Practical implementation advice

### Key Features

- ✅ **Standards-Compliant**: Uses only standard MCP primitives (no new protocol methods)
- ✅ **Session-Based Auth**: Ensures descriptions are fetched before tool use
- ✅ **Backward Compatible**: Servers remain fully MCP-compliant
- ✅ **Production-Ready**: Includes error handling, security considerations, and best practices
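
As an illustration of the session-based auth feature, a server could gate execution on the Stage 2 fetch roughly as below. `ToolNotAuthorizedError`, the session store, and the dispatcher are assumed names for the sketch; the real server's error shape may differ.

```python
# Sketch: refuse tool calls until the session has fetched the full description.
# ToolNotAuthorizedError and the in-memory session store are illustrative only.
class ToolNotAuthorizedError(Exception):
    """Raised when a tool is called before its description was fetched."""

SESSIONS: dict[str, set[str]] = {}   # session_id -> tools authorized via Stage 2

def call_tool(session_id: str, name: str, arguments: dict) -> str:
    if name not in SESSIONS.get(session_id, set()):
        # Steer the agent back to Stage 2 instead of executing blindly.
        raise ToolNotAuthorizedError(
            f"Fetch resource:///tool_descriptions?tools={name} before calling {name}."
        )
    return execute(name, arguments)

def execute(name: str, arguments: dict) -> str:
    return f"(simulated) {name} called with {arguments}"   # stand-in dispatcher
```
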
## 🛠️ Technical Implementation

### Server-Side
- Dual-mode MCP server (Standard vs. Progressive)
- 20 tools: 9 working (time, math, encoding) + 11 enterprise simulations (AWS, K8s, Jira)
- Session-based authorization tracking
- Resource endpoint for lazy-loading tool descriptions

### Client-Side
- OpenAI GPT-4o agent with tool calling
- Real-time token counting and metrics
- Progressive disclosure workflow detection and enforcement
- Gradio 6 UI for interactive comparison
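
On the metrics side, one plausible way to compute the "Initial Token Load" figure is to tokenize the serialized tool listing. The sketch below uses `tiktoken` with the `cl100k_base` encoding as an approximation; it is not necessarily the demo's exact counting code.

```python
# Rough token accounting for a tool listing; an approximation, not the demo's
# exact metrics code. Requires: pip install tiktoken
import json
import tiktoken

ENC = tiktoken.get_encoding("cl100k_base")

def count_listing_tokens(tools: list[dict]) -> int:
    """Tokens an LLM would spend just reading the serialized tool listing."""
    return len(ENC.encode(json.dumps(tools)))

minimal_listing = [{
    "name": "calculate",
    "description": "Performs arithmetic operations.",
    "inputSchema": {"type": "object", "properties": {}},
}]
print(count_listing_tokens(minimal_listing))   # tens of tokens instead of hundreds
```
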
### Architecture

```
User Query → Agent (with system prompt) → MCP Client
                                              ↓
                                    [Progressive Mode]
                                    1. List tools (minimal)
                                    2. Select tool
                                    3. Fetch full description via resource
                                    4. Authorize tool
                                    5. Execute tool
```
## 🎯 Real-World Impact
This extension enables:
- Enterprise MCP Servers with 100+ tools without context bloat
- Multi-Domain Assistants that only load relevant tool details
- Cost Optimization by reducing input tokens per request
- Better Context Utilization leaving more room for reasoning and conversation

### Use Cases
- AWS/Cloud infrastructure management (50-100 tools)
- Enterprise SaaS integrations (Salesforce, Jira, Slack, etc.)
- Developer tooling (Git, CI/CD, monitoring)
- Multi-domain knowledge systems
## Demo Tools Included

**Fully Working Tools (9):**

- `get_current_time` - Returns current server time
- `calculate` - Performs arithmetic operations
- `generate_random` - Generates random numbers
- `echo_text`, `reverse_string` - Text manipulation
- `base64_encode`, `base64_decode` - Encoding utilities
- `count_words`, `convert_units` - Analysis and conversion

**Enterprise Tool Simulations (11):** These tools have realistic, verbose descriptions (400+ tokens each) to demonstrate token savings, but return simulation responses:

`aws_ec2_launch_instance`, `aws_s3_create_bucket`, `db_postgres_query`, `db_mongo_find`, `jira_create_issue`, `github_create_pull_request`, `slack_send_message`, `stripe_create_charge`, `k8s_deploy_pod`, `salesforce_update_lead`, `datadog_query_metrics`

**Purpose:** The combination of working + simulated tools shows how Progressive Disclosure scales to enterprise scenarios with 50-100+ tools.
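
For reference, a simulation placeholder can be as simple as a verbose description paired with a canned handler. The description text and function signature below are abbreviated, hypothetical stand-ins, not the server's real entries.

```python
# Sketch of an enterprise simulation tool: the description drives realistic
# token counts, while the handler never contacts a real cluster.
K8S_DEPLOY_POD_DESCRIPTION = (
    "Deploys a pod to a Kubernetes cluster. Parameters: namespace, image, replicas, "
    "resource limits, labels, node selectors... (the real entry continues for 400+ "
    "tokens with examples and error-handling guidance)"
)

def k8s_deploy_pod(namespace: str, image: str, replicas: int = 1) -> dict:
    """Simulation only: returns a placeholder response, no cluster is contacted."""
    return {
        "simulated": True,
        "message": f"Would deploy {replicas}x '{image}' to namespace '{namespace}'.",
    }
```
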
## Hackathon Submission

- **Track**: Track 1 - Building MCP (Protocol Extension)
- **Category**: Protocol Extension / Optimization
- **Author**: Michael Martin
- **Social Post**: https://x.com/AppRushAI/status/1995274123279536330

## Getting Started Locally

```bash
# Install dependencies
pip install -r requirements.txt

# Set your OpenAI API key
export OPENAI_API_KEY=sk-your-key-here

# Run the demo
python3 app.py
```
## License
MIT License - See LICENSE file for details.
## Acknowledgments
Built for the MCP 1st Birthday Hackathon hosted by Anthropic and Gradio on Hugging Face.
Special thanks to the MCP community for feedback and the Anthropic team for creating the Model Context Protocol.