---
title: MCP Progressive Disclosure - Protocol Extension
emoji: 🎯
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.0.0
app_file: app.py
pinned: false
tags:
  - building-mcp-track-enterprise
  - building-mcp-track-customer
  - building-mcp-track-creative
---
# MCP Progressive Disclosure 🎯

**Track 1: Building MCP - Protocol Extension**
*MCP 1st Birthday Hackathon Submission*

📺 **Social Post & Demo Video**: https://x.com/AppRushAI/status/1995274123279536330

## 🎯 The Problem
Standard MCP servers send ALL tool descriptions to the LLM at connection time. For enterprise servers with 50-100+ tools (AWS, Jira, Kubernetes, Salesforce), this results in:
- 30,000-50,000 tokens loaded before the user even asks a question
- Wasted context window space on tools that may never be used
- Reduced space for actual conversation and reasoning
- Poor scalability as servers add more tools

**Example**: An AWS MCP server with 100 tools × 400 tokens each = 40,000 tokens of pure overhead.
## 💡 The Solution: Progressive Disclosure
We've created a protocol extension for MCP that implements lazy-loading of tool descriptions through a two-stage discovery process:
### Stage 1: Selection (Minimal Descriptions)

The server sends ultra-minimal, one-sentence descriptions via `tools/list`:

```json
{
  "name": "aws_ec2_launch_instance",
  "description": "Launches a new AWS EC2 instance with specified configuration.",
  "inputSchema": {"type": "object", "properties": {}}
}
```
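
To make the mechanism concrete, here is a minimal sketch of how a dual-mode server could build its `tools/list` payload from an in-memory registry. The names `TOOL_REGISTRY` and `list_tools`, and the example description text, are illustrative assumptions, not the actual `app.py` API.

```python
# Illustrative sketch only: a dual-mode tools/list payload builder.
# TOOL_REGISTRY and list_tools are hypothetical names, not the real app.py API.
TOOL_REGISTRY = {
    "aws_ec2_launch_instance": {
        "summary": "Launches a new AWS EC2 instance with specified configuration.",
        "full_description": (
            "Launches an EC2 instance. Parameters: ami_id, instance_type, key_name, "
            "security groups, tags... (the real entry runs to 400+ tokens with "
            "examples and error-handling notes)"
        ),
        "input_schema": {
            "type": "object",
            "properties": {"instance_type": {"type": "string"}},
            "required": ["instance_type"],
        },
    },
}

def list_tools(mode: str) -> list[dict]:
    """Build the tools/list payload; progressive mode sends one sentence per tool."""
    tools = []
    for name, spec in TOOL_REGISTRY.items():
        if mode == "progressive":
            tools.append({
                "name": name,
                "description": spec["summary"],                       # Stage 1 only
                "inputSchema": {"type": "object", "properties": {}},  # stub schema
            })
        else:  # standard mode pays the full cost up front
            tools.append({
                "name": name,
                "description": spec["full_description"],
                "inputSchema": spec["input_schema"],
            })
    return tools
```
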
### Stage 2: Usage (Full Descriptions On-Demand)

The agent fetches full descriptions only when needed via the `tool_descriptions` resource:

```
resource:///tool_descriptions?tools=aws_ec2_launch_instance
```

The response contains the complete schema, usage examples, and error-handling details, and it authorizes the tool for the session.
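
A rough sketch of what a Stage 2 resource handler could look like, assuming a simple in-memory session store; `SESSIONS`, `read_tool_descriptions`, and the registry shape are assumptions for illustration, not the server's actual code.

```python
# Sketch of a Stage 2 handler: parse the tools= query, return full descriptions,
# and record the authorization for this session. All names are illustrative.
from urllib.parse import parse_qs, urlparse

TOOL_REGISTRY: dict[str, dict] = {}   # same shape as the registry sketched above
SESSIONS: dict[str, set[str]] = {}    # session_id -> tools authorized so far

def read_tool_descriptions(session_id: str, resource_uri: str) -> dict[str, str]:
    """Handle resource:///tool_descriptions?tools=a,b and authorize those tools."""
    query = parse_qs(urlparse(resource_uri).query)
    requested = query.get("tools", [""])[0].split(",")
    authorized = SESSIONS.setdefault(session_id, set())
    descriptions = {}
    for name in requested:
        spec = TOOL_REGISTRY.get(name)
        if spec is None:
            continue                                   # unknown tool: skip
        descriptions[name] = spec["full_description"]  # schema, examples, errors
        authorized.add(name)                           # tool is now usable this session
    return descriptions
```
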
### Result: 80-90% Token Reduction

- **Standard Mode**: 40,000 tokens loaded upfront
- **Progressive Mode**: 500 tokens initial + ~400 per tool fetched = **1,700 tokens** for 3 tools used
- **Savings**: 96% reduction in typical workflows!
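
The arithmetic behind those numbers, as a back-of-the-envelope check using the estimates quoted above (not measured values):

```python
# Back-of-the-envelope check of the savings claim; all figures are the README's
# estimates (400 tokens per full description, 500-token minimal listing).
NUM_TOOLS = 100
TOKENS_PER_FULL_DESCRIPTION = 400
MINIMAL_LISTING_TOKENS = 500
TOOLS_ACTUALLY_USED = 3

standard = NUM_TOOLS * TOKENS_PER_FULL_DESCRIPTION                # 40,000 tokens
progressive = (MINIMAL_LISTING_TOKENS
               + TOOLS_ACTUALLY_USED * TOKENS_PER_FULL_DESCRIPTION)  # 1,700 tokens
savings = 1 - progressive / standard                              # ~0.96

print(f"Standard: {standard:,} | Progressive: {progressive:,} | Savings: {savings:.0%}")
```
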
## 🎬 Live Demo

This Space lets you experience the difference:

1. **Choose Mode**: Standard (load all) vs. Progressive (lazy load)
2. **Start Server**: Click "Start/Restart Server"
3. **Observe Metrics**: Watch the "Initial Token Load" difference
4. **Ask Questions**: Try queries like "What time is it?" or "Deploy a Kubernetes pod"
5. **Compare Logs**:
   - Standard: Immediate tool execution
   - Progressive: Fetches description → Executes tool

### Sample Queries (Try These!)
- "What time is it?"
- "Calculate 125 multiplied by 8"
- "Generate a random number between 1 and 100"
- "Reverse the string 'hackathon'"
- "Encode 'Hello World' to base64"

**Note**: Complex enterprise tools (AWS, Kubernetes, Jira, etc.) are simulation placeholders that demonstrate how the protocol handles verbose tool descriptions with realistic token counts.
## Protocol Documentation

The full specification is available in this repository:

- **Protocol Spec v2.0** - Complete protocol definition
- **Implementation Guide** - Practical implementation advice

### Key Features

- ✅ **Standards-Compliant**: Uses only standard MCP primitives (no new protocol methods)
- ✅ **Session-Based Auth**: Ensures descriptions are fetched before tool use
- ✅ **Backward Compatible**: Servers remain fully MCP-compliant
- ✅ **Production-Ready**: Includes error handling, security considerations, and best practices
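
As an illustration of the session-based auth feature, a server could gate execution on the Stage 2 fetch roughly as below. `ToolNotAuthorizedError`, the session store, and the dispatcher are assumed names for the sketch; the real server's error shape may differ.

```python
# Sketch: refuse tool calls until the session has fetched the full description.
# ToolNotAuthorizedError and the in-memory session store are illustrative only.
class ToolNotAuthorizedError(Exception):
    """Raised when a tool is called before its description was fetched."""

SESSIONS: dict[str, set[str]] = {}   # session_id -> tools authorized via Stage 2

def call_tool(session_id: str, name: str, arguments: dict) -> str:
    if name not in SESSIONS.get(session_id, set()):
        # Steer the agent back to Stage 2 instead of executing blindly.
        raise ToolNotAuthorizedError(
            f"Fetch resource:///tool_descriptions?tools={name} before calling {name}."
        )
    return execute(name, arguments)

def execute(name: str, arguments: dict) -> str:
    return f"(simulated) {name} called with {arguments}"   # stand-in dispatcher
```
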
## 🛠️ Technical Implementation

### Server-Side
- Dual-mode MCP server (Standard vs. Progressive)
- 20 tools: 9 working (time, math, encoding) + 11 enterprise simulations (AWS, K8s, Jira)
- Session-based authorization tracking
- Resource endpoint for lazy-loading tool descriptions

### Client-Side
- OpenAI GPT-4o agent with tool calling
- Real-time token counting and metrics
- Progressive disclosure workflow detection and enforcement
- Gradio 6 UI for interactive comparison
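
On the metrics side, one plausible way to compute the "Initial Token Load" figure is to tokenize the serialized tool listing. The sketch below uses `tiktoken` with the `cl100k_base` encoding as an approximation; it is not necessarily the demo's exact counting code.

```python
# Rough token accounting for a tool listing; an approximation, not the demo's
# exact metrics code. Requires: pip install tiktoken
import json
import tiktoken

ENC = tiktoken.get_encoding("cl100k_base")

def count_listing_tokens(tools: list[dict]) -> int:
    """Tokens an LLM would spend just reading the serialized tool listing."""
    return len(ENC.encode(json.dumps(tools)))

minimal_listing = [{
    "name": "calculate",
    "description": "Performs arithmetic operations.",
    "inputSchema": {"type": "object", "properties": {}},
}]
print(count_listing_tokens(minimal_listing))   # tens of tokens instead of hundreds
```
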
### Architecture

```
User Query → Agent (with system prompt) → MCP Client
                                              ↓
                                    [Progressive Mode]
                                    1. List tools (minimal)
                                    2. Select tool
                                    3. Fetch full description via resource
                                    4. Authorize tool
                                    5. Execute tool
```
## 🎯 Real-World Impact
This extension enables:
- Enterprise MCP Servers with 100+ tools without context bloat
- Multi-Domain Assistants that only load relevant tool details
- Cost Optimization by reducing input tokens per request
- Better Context Utilization leaving more room for reasoning and conversation

### Use Cases
- AWS/Cloud infrastructure management (50-100 tools)
- Enterprise SaaS integrations (Salesforce, Jira, Slack, etc.)
- Developer tooling (Git, CI/CD, monitoring)
- Multi-domain knowledge systems
## Demo Tools Included

**Fully Working Tools (9):**

- `get_current_time` - Returns current server time
- `calculate` - Performs arithmetic operations
- `generate_random` - Generates random numbers
- `echo_text`, `reverse_string` - Text manipulation
- `base64_encode`, `base64_decode` - Encoding utilities
- `count_words`, `convert_units` - Analysis and conversion

**Enterprise Tool Simulations (11):** These tools have realistic, verbose descriptions (400+ tokens each) to demonstrate token savings, but return simulation responses:

`aws_ec2_launch_instance`, `aws_s3_create_bucket`, `db_postgres_query`, `db_mongo_find`, `jira_create_issue`, `github_create_pull_request`, `slack_send_message`, `stripe_create_charge`, `k8s_deploy_pod`, `salesforce_update_lead`, `datadog_query_metrics`

**Purpose:** The combination of working + simulated tools shows how Progressive Disclosure scales to enterprise scenarios with 50-100+ tools.
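
For reference, a simulation placeholder can be as simple as a verbose description paired with a canned handler. The description text and function signature below are abbreviated, hypothetical stand-ins, not the server's real entries.

```python
# Sketch of an enterprise simulation tool: the description drives realistic
# token counts, while the handler never contacts a real cluster.
K8S_DEPLOY_POD_DESCRIPTION = (
    "Deploys a pod to a Kubernetes cluster. Parameters: namespace, image, replicas, "
    "resource limits, labels, node selectors... (the real entry continues for 400+ "
    "tokens with examples and error-handling guidance)"
)

def k8s_deploy_pod(namespace: str, image: str, replicas: int = 1) -> dict:
    """Simulation only: returns a placeholder response, no cluster is contacted."""
    return {
        "simulated": True,
        "message": f"Would deploy {replicas}x '{image}' to namespace '{namespace}'.",
    }
```
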
## Hackathon Submission

- **Track**: Track 1 - Building MCP (Protocol Extension)
- **Category**: Protocol Extension / Optimization
- **Author**: Michael Martin
- **Social Post**: https://x.com/AppRushAI/status/1995274123279536330

## Getting Started Locally

```bash
# Install dependencies
pip install -r requirements.txt

# Set your OpenAI API key
export OPENAI_API_KEY=sk-your-key-here

# Run the demo
python3 app.py
```
## License
MIT License - See LICENSE file for details.
## Acknowledgments
Built for the MCP 1st Birthday Hackathon hosted by Anthropic and Gradio on Hugging Face.
Special thanks to the MCP community for feedback and the Anthropic team for creating the Model Context Protocol.