NACC: Proving That AI Can Orchestrate Real Infrastructure at Scale

Community Article Published November 30, 2025

How I built a multi-node agentic system that turned cloud platform constraints into innovation—and what this means for the future of infrastructure

🚀 The Problem Nobody's Solving

Every day, engineers running infrastructure at scale face the same nightmare:

Developers managing 50+ microservices across multiple clouds
DevOps teams SSH-ing between servers, manually running commands
Security teams manually coordinating vulnerability scans across environments
MLOps engineers deploying models to 100+ edge devices via scripts
Result: Bottlenecks, errors, and massive operational overhead

We've automated everything EXCEPT orchestrating and reasoning about multiple machines at once.

That's the problem NACC solves: Making AI agents the orchestration layer for distributed infrastructure.

💡 The Core Innovation: AI as Infrastructure Brain

Instead of building another monitoring dashboard or automation script, NACC asks: What if you could just talk to all your infrastructure in plain English?

User: "Check all prod servers for the xz vulnerability and generate a report"

NACC:
1. Identifies 47 production nodes
2. Plans vulnerability scan workflow
3. Coordinates execution across all nodes
4. Aggregates results
5. Generates actionable report

All autonomously. Using natural language.

This isn't a dashboard. This isn't a script. This is AI as your infrastructure co-pilot.
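
A minimal sketch of the fan-out and aggregate steps in that workflow. Everything here is illustrative: `scan_node`, `scan_fleet`, `build_report`, and the inventory format are hypothetical names, and the check assumes the request refers to the xz backdoor (CVE-2024-3094), which affected xz versions 5.6.0 and 5.6.1. A real node would run the check remotely instead of reading a version from a dict.

```python
import asyncio
from dataclasses import dataclass

# Assumption: "the xz vulnerability" means CVE-2024-3094 (xz 5.6.0 and 5.6.1).
VULNERABLE_XZ_VERSIONS = {"5.6.0", "5.6.1"}


@dataclass
class ScanResult:
    node_id: str
    vulnerable: bool
    detail: str


async def scan_node(node_id: str, installed_version: str) -> ScanResult:
    """Hypothetical per-node check; a real node would run a command remotely."""
    await asyncio.sleep(0)  # stand-in for the network round trip
    vulnerable = installed_version in VULNERABLE_XZ_VERSIONS
    return ScanResult(node_id, vulnerable, f"xz {installed_version}")


async def scan_fleet(inventory: dict[str, str]) -> list[ScanResult]:
    """Fan out one scan per node concurrently, then collect results in order."""
    tasks = [scan_node(node_id, version) for node_id, version in inventory.items()]
    return list(await asyncio.gather(*tasks))


def build_report(results: list[ScanResult]) -> str:
    """Aggregate per-node results into a short plain-text report."""
    flagged = [r for r in results if r.vulnerable]
    lines = [f"{len(flagged)} of {len(results)} nodes flagged"]
    lines += [f"- {r.node_id}: {r.detail}" for r in flagged]
    return "\n".join(lines)
```

Running `asyncio.run(scan_fleet(inventory))` and passing the result to `build_report` covers steps 3 to 5. Identifying the nodes and planning the workflow are left to the agent.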

🏗️ Why This Matters (Business Perspective)

The TAM (Total Addressable Market)

DevOps/Infrastructure: $50B+ market
Enterprise Automation: $200B+ market
Cloud Management: Growing 25%+ YoY
Security Operations: $60B+ market

NACC's Opportunity: Become the conversational interface for ALL infrastructure management.

Instead of Terraform (code), Ansible (YAML), or Kubernetes (manifests)—just talk to your infrastructure.

Competitive Advantages

Approach	Current State	With NACC
DevOps Tasks	Manual SSH or scripts	Natural language commands
Deployment Coordination	Scheduled jobs + manual oversight	AI-driven autonomous orchestration
Incident Response	Page on-call engineers	AI automatically diagnoses + fixes
Multi-Cloud Management	Separate tools per cloud	Unified agentic interface

🛠️ What I Built (Technical Excellence)

The Two-Space Architecture

Faced with the constraint that HuggingFace Spaces run in isolated containers, I split the system into two Spaces that communicate via MCP over HTTP.

Main Space: Orchestrator + AI brain (using Blaxel for <25ms LLM inference)
VM Space: Simulated production node with command execution + file management
Communication: Custom JSON-RPC implementation over HTTPS

Result: Fully functional distributed system demo on a free platform.
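
The article does not show the wire format, so the following is only a sketch of what a message between the two Spaces could look like. It assumes the standard MCP shape, where a tool invocation is a JSON-RPC 2.0 request with method `tools/call`. The helper names `build_tool_call` and `parse_response` are hypothetical, and the HTTPS transport itself is omitted.

```python
import itertools
import json

# Monotonic request ids so responses can be matched to requests.
_request_ids = itertools.count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


def parse_response(raw: str):
    """Return the `result` of a JSON-RPC response, or raise on an `error`."""
    message = json.loads(raw)
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
    return message["result"]
```

The Main Space would POST the string from `build_tool_call` to the VM Space and pass the response body to `parse_response`.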

Custom MCP Implementation

Built the entire MCP stack from scratch:

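class MCPServer:
    def __init__(self):
        # Tool registry: tool name -> metadata plus its async handler
        self.tools = {}

    def register_tool(self, name, schema, handler):
        """Register an MCP-compliant tool under `name`."""
        self.tools[name] = {
            "name": name,
            "description": schema["description"],
            "inputSchema": schema["parameters"],
            "handler": handler,
        }

    async def handle_call_tool(self, name, arguments):
        """Look up a registered tool and await its handler with the given arguments."""
        if name not in self.tools:
            raise KeyError(f"Unknown tool: {name}")
        return await self.tools[name]["handler"](**arguments)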

Why from scratch? Security, performance, and a deeper understanding of the protocol. No black boxes on critical infrastructure.
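
To show how a registry like this might sit behind a JSON-RPC endpoint, here is a self-contained sketch of a dispatcher for the two MCP tool methods, `tools/list` and `tools/call`. The module-level `TOOLS` dict and the `dispatch` function are assumptions for illustration, not the project's code. The result and error shapes follow the public MCP specification as I understand it.

```python
import asyncio

TOOLS = {}  # tool name -> {"description", "inputSchema", "handler"}


def register_tool(name, schema, handler):
    """Store a tool's metadata and async handler (mirrors the registry above)."""
    TOOLS[name] = {
        "description": schema["description"],
        "inputSchema": schema["parameters"],
        "handler": handler,
    }


def _error(request_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}


async def dispatch(request: dict) -> dict:
    """Route one JSON-RPC request to the tool registry and build the response."""
    request_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": [
            {"name": name, "description": t["description"],
             "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]}
    elif method == "tools/call":
        params = request.get("params", {})
        entry = TOOLS.get(params.get("name"))
        if entry is None:
            return _error(request_id, -32602, f"Unknown tool: {params.get('name')}")
        output = await entry["handler"](**params.get("arguments", {}))
        # MCP tool results carry a list of content items; text is the simplest.
        result = {"content": [{"type": "text", "text": str(output)}],
                  "isError": False}
    else:
        return _error(request_id, -32601, f"Method not found: {method}")
    return {"jsonrpc": "2.0", "id": request_id, "result": result}
```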

Cross-Space Node Management

The innovation: Abstract multiple HF Spaces as "nodes" in a distributed system.

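class NodeManager:
    def __init__(self):
        # LocalNode and RemoteNode are the project's node classes (not shown here);
        # each exposes an async execute(command) method.
        self.nodes = {
            "hf-space-local": LocalNode(),
            "vm-node-01": RemoteNode("https://huggingface.co/spaces/.../NACC-VM"),
        }

    async def route_command(self, node_id, command):
        """Execute a command on the node registered under `node_id`."""
        if node_id not in self.nodes:
            raise KeyError(f"Unknown node: {node_id}")
        return await self.nodes[node_id].execute(command)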

What this proves: You can build real distributed system demos without expensive infrastructure.
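
The node classes themselves are not shown in the article. One way they could share a common `execute` interface is sketched below. The `Node` protocol, the argv-list command format, and the injected `send` transport are all assumptions, and the remote call is left abstract so the sketch runs without a network.

```python
import asyncio
import sys
from typing import Awaitable, Callable, Protocol


class Node(Protocol):
    """Anything the manager can route a command to."""

    async def execute(self, command: list[str]) -> str: ...


class LocalNode:
    """Runs the command as a subprocess on the current machine."""

    async def execute(self, command: list[str]) -> str:
        proc = await asyncio.create_subprocess_exec(
            *command,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,  # merge stderr into stdout
        )
        output, _ = await proc.communicate()
        return output.decode()


class RemoteNode:
    """Forwards the command through an injected async transport (hypothetical)."""

    def __init__(self, url: str, send: Callable[[str, dict], Awaitable[str]]):
        self.url = url
        self._send = send

    async def execute(self, command: list[str]) -> str:
        return await self._send(self.url, {"command": command})


if __name__ == "__main__":
    # Usage: run the current Python interpreter as a portable test command.
    print(asyncio.run(LocalNode().execute([sys.executable, "-c", "print('ok')"])))
```

Because both classes satisfy the same protocol, the routing code never needs to know whether a node is local or remote.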

🎯 Real-World Applications

For Enterprises

Multi-cloud deployment orchestration: "Deploy frontend to AWS, backend to GCP, run tests"
Security automation: "Scan all prod systems, compile vulnerability report"
Incident response: "Find root cause, collect logs, trigger alerts"

For DevOps Teams

Kubernetes orchestration: Manage clusters through conversation
Database migrations: Coordinate across multiple database instances
Infrastructure provisioning: "Set up 3-tier app on AWS, configure auto-scaling"

For AI/ML Engineers

Model deployment: "Deploy v2.0 to 50 edge devices, rollback if accuracy < 90%"
Distributed training: Orchestrate multi-node training jobs
A/B testing: "Run experiment on 25% of prod, monitor metrics"

📊 The Technical Challenges (What I Learned)

Challenge 1: State Management Across Boundaries

Problem: Each HF Space is stateless. How do you maintain context when commands span multiple spaces?

Solution: Session-based state persistence with context injection

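class SessionManager:
    def __init__(self):
        # session_id -> context dict
        self.sessions = {}

    def get_context(self, session_id):
        """Return the stored context for a session, creating example defaults on first use."""
        return self.sessions.setdefault(session_id, {
            "current_node": "vm-node-01",  # Remember which node user is on
            "current_path": "/app/src",    # Remember directory
            "last_command": "ls",          # Remember history
        })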

Impact: Users can seamlessly switch between nodes—the system remembers context.
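
"Context injection" is not spelled out in the article. A plausible reading is that the stored session context fills in whatever the user left implicit before a tool call is dispatched. The sketch below assumes that reading. `inject_context` and the `node_id` and `path` argument names are hypothetical.

```python
import posixpath


def inject_context(context: dict, arguments: dict) -> dict:
    """Fill in the active node and resolve relative paths against the session."""
    resolved = dict(arguments)  # never mutate the caller's arguments
    resolved.setdefault("node_id", context["current_node"])
    if "path" in resolved:
        # posixpath.join keeps an absolute path as-is and anchors a relative one
        # at the session's current directory; normpath then collapses "..".
        joined = posixpath.join(context["current_path"], resolved["path"])
        resolved["path"] = posixpath.normpath(joined)
    return resolved
```

With the context from `get_context`, a bare `read file demo.txt` becomes a call against `/app/src/demo.txt` on `vm-node-01` without the user repeating either.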

Challenge 2: Security Without Sacrificing Functionality

Problem: Can't give full root access in a public demo, but also can't cripple the system.

Solution: Command whitelisting + path restrictions + execution timeouts

ALLOWED_COMMANDS = {"ls", "cat", "python3", "find", "grep", "wc"}
ROOT_DIR = "/app"  # All operations sandboxed
TIMEOUT = 30       # Prevent runaway processes

Result: Safe for public use, still demonstrates core capabilities.
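
How these constants are enforced is not shown, so here is one possible validation step, under stated assumptions: commands arrive as a single string, are split with `shlex`, and are later executed without a shell. `validate_command` and `CommandRejected` are hypothetical names. The path check is lexical only (it does not resolve symlinks) and deliberately conservative, treating every non-flag argument as a path.

```python
import posixpath
import shlex

ALLOWED_COMMANDS = {"ls", "cat", "python3", "find", "grep", "wc"}
ROOT_DIR = "/app"  # All operations sandboxed
TIMEOUT = 30       # Seconds; would be passed to the executor


class CommandRejected(ValueError):
    """Raised when a command fails the whitelist or path check."""


def validate_command(command_line: str, cwd: str = ROOT_DIR) -> list[str]:
    """Return the argv list if the command is allowed, else raise CommandRejected."""
    argv = shlex.split(command_line)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise CommandRejected(f"command not allowed: {command_line!r}")
    for arg in argv[1:]:
        if arg.startswith("-"):
            continue  # flags are not paths
        # Resolve the argument as if it were a path relative to cwd.
        resolved = posixpath.normpath(posixpath.join(cwd, arg))
        if resolved != ROOT_DIR and not resolved.startswith(ROOT_DIR + "/"):
            raise CommandRejected(f"path escapes {ROOT_DIR}: {arg!r}")
    return argv
```

The returned list can then go to `subprocess.run(argv, cwd=cwd, timeout=TIMEOUT)`, which runs without a shell and raises `subprocess.TimeoutExpired` if the process runs too long.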

Challenge 3: MCP Protocol Compliance at Scale

Problem: Implementing MCP for multi-node orchestration wasn't in any documentation.

Solution: Custom tool definitions that scale horizontally

tools.register("execute_command", execute_on_node)
tools.register("read_file", read_from_node)
tools.register("switch_node", change_active_node)
tools.register("sync_files", multi_node_sync)  # Custom innovation

Learning: Standard protocols need custom extensions for novel use cases. That's okay—it shows deep understanding.
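
The `sync_files` tool is only named in the article, not shown. As a rough illustration of what a multi-node sync could involve, the sketch below reads a file from one node and writes it to several targets concurrently, reporting a status per target. `InMemoryNode` is a toy stand-in for a real Space, and the `read_file` and `write_file` method names are assumptions.

```python
import asyncio


class InMemoryNode:
    """Toy node that keeps files in a dict; stands in for a real Space."""

    def __init__(self):
        self.files: dict[str, str] = {}

    async def read_file(self, path: str) -> str:
        return self.files[path]

    async def write_file(self, path: str, content: str) -> None:
        self.files[path] = content


async def sync_files(nodes: dict, source_id: str, target_ids: list[str], path: str) -> dict:
    """Copy `path` from the source node to every target; return status per target."""
    content = await nodes[source_id].read_file(path)

    async def push(target_id: str):
        try:
            await nodes[target_id].write_file(path, content)
            return target_id, "ok"
        except Exception as exc:  # one failing node should not abort the rest
            return target_id, f"failed: {exc!r}"

    return dict(await asyncio.gather(*(push(t) for t in target_ids)))
```

Returning a per-target status instead of raising on the first failure lets the agent report partial success and retry only the nodes that failed.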

🎓 Why This Matters for Hiring/Collaboration

What NACC Demonstrates

✅ Full-Stack Capability: Architected, implemented, deployed, and documented a complex system
✅ Problem-Solving Under Constraints: Turned platform limitations into innovation
✅ Deep Protocol Understanding: Built MCP from scratch to prove comprehension
✅ Production Thinking: Security, scalability, state management—not just "does it work"
✅ Communication: Explained complex tech to multiple audiences
✅ Grit: Balanced hackathon coding with semester exams and still shipped

For Potential Collaborators

NACC is open for contributions. If you're interested in:

Kubernetes integration
Real-time monitoring dashboard
Security hardening
Blockchain node orchestration
Plugin system for custom tools

Let's build this together. This is just the v1 demo.

🚀 The Vision: Generational Shift

In 5 years, I believe:

Infrastructure code becomes infrastructure conversation
- No more Terraform for beginners—just ask AI agents
- Democratizes DevOps for junior engineers
AI agents become first-class infrastructure citizens
- Like Docker changed deployment, agentic interfaces change orchestration
- Every company runs autonomous AI orchestration
MCP becomes the standard infrastructure protocol
- HTTP standardized the web
- MCP will standardize AI-to-infrastructure communication

NACC is my bet that this shift is coming, and it starts with proving it's possible.

🔗 Get Involved

Try NACC

Main Space: Live Demo
GitHub: Open Source
Quick Commands: list nodes → switch to vm-node-01 → read file demo.txt

Collaborate

Interested in working on NACC together?

Internship opportunities: I'm open to summer/full-time roles in infrastructure/DevOps/AI
Collaboration: Let's build the orchestration future
Discussion: Questions about agentic systems, MCP, or distributed infrastructure?

Connect on LinkedIn: @vasanthadithya-mundrathi
Code on GitHub: @Vasanthadithya-mundrathi

📝 The Bottom Line

NACC isn't just a hackathon project. It's proof that AI can be the reasoning layer for infrastructure orchestration.

For recruiters: I ship complex systems. I learn fast. I think deeply about problems.
For collaborators: Let's build the future of infrastructure together.
For the industry: This is coming. Will you be ready?

Built by: Vasanthadithya Mundrathi (3rd year CS student, CBIT Hyderabad)
For: MCP 1st Birthday Hackathon
With: Blaxel, HuggingFace, Anthropic, Gradio
Goal: Prove that conversational AI is the future of infrastructure
