NACC: Proving That AI Can Orchestrate Real Infrastructure at Scale
How I built a multi-node agentic system that turned cloud platform constraints into innovation, and what this means for the future of infrastructure
The Problem Nobody's Solving
Every day, engineers at scale face the same nightmare:
- Developers managing 50+ microservices across multiple clouds
- DevOps teams SSH-ing between servers, manually running commands
- Security teams manually coordinating vulnerability scans across environments
- MLOps engineers deploying models to 100+ edge devices via scripts
- Result: bottlenecks, errors, and massive operational overhead
We've automated everything EXCEPT orchestrating and reasoning about multiple machines at once.
That's the problem NACC solves: Making AI agents the orchestration layer for distributed infrastructure.
The Core Innovation: AI as Infrastructure Brain
Instead of building another monitoring dashboard or automation script, NACC asks: What if you could just talk to all your infrastructure in plain English?
User: "Check all prod servers for the XZ vulnerability and generate a report"
NACC:
1. Identifies 47 production nodes
2. Plans vulnerability scan workflow
3. Coordinates execution across all nodes
4. Aggregates results
5. Generates actionable report
All autonomously. Using natural language.
This isn't a dashboard. This isn't a script. This is AI as your infrastructure co-pilot.
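The five steps above follow a fan-out and aggregate pattern. Here is a minimal sketch of that pattern, not NACC's actual planner: the inventory, the node names, and the `scan_node` stand-in (which fakes a scan result) are all hypothetical.

```python
import asyncio

# Hypothetical inventory: node id -> environment tag (not taken from NACC).
INVENTORY = {"web-01": "prod", "web-02": "prod", "ci-01": "staging"}


async def scan_node(node_id: str) -> dict:
    """Stand-in for a remote scan; a real node would run a command here."""
    await asyncio.sleep(0)  # yield control, as a network call would
    # Fabricated result for illustration: pretend only web-02 is affected.
    return {"node": node_id, "vulnerable": node_id.endswith("02")}


async def scan_environment(env: str) -> dict:
    # Step 1: identify the target nodes.
    targets = sorted(n for n, e in INVENTORY.items() if e == env)
    # Steps 2-3: plan one scan per node and run them concurrently.
    results = await asyncio.gather(*(scan_node(n) for n in targets))
    # Steps 4-5: aggregate the results into a small report.
    flagged = [r["node"] for r in results if r["vulnerable"]]
    return {"scanned": len(targets), "flagged": flagged}


report = asyncio.run(scan_environment("prod"))
print(report)
```

In a real deployment, the LLM would produce the plan (which nodes, which command) and the orchestrator would execute it roughly as above.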
Why This Matters (Business Perspective)
The TAM (Total Addressable Market)
- DevOps/Infrastructure: $50B+ market
- Enterprise Automation: $200B+ market
- Cloud Management: Growing 25%+ YoY
- Security Operations: $60B+ market
NACC's Opportunity: Become the conversational interface for ALL infrastructure management.
Instead of Terraform (code), Ansible (YAML), or Kubernetes (manifests), just talk to your infrastructure.
Competitive Advantages
| Approach | Current State | With NACC |
|---|---|---|
| DevOps Tasks | Manual SSH or scripts | Natural language commands |
| Deployment Coordination | Scheduled jobs + manual oversight | AI-driven autonomous orchestration |
| Incident Response | Page on-call engineers | AI automatically diagnoses and fixes |
| Multi-Cloud Management | Separate tools per cloud | Unified agentic interface |
What I Built (Technical Excellence)
The Two-Space Architecture
Because HuggingFace Spaces run in isolated containers, I split the system into two Spaces that talk to each other using MCP over HTTPS:
- Main Space: Orchestrator + AI brain (using Blaxel for <25ms LLM inference)
- VM Space: Simulated production node with command execution + file management
- Communication: Custom JSON-RPC implementation over HTTPS
Result: Fully functional distributed system demo on a free platform.
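To make the cross-Space communication concrete, here is a hedged sketch of what one message might look like on the wire. MCP frames tool invocations as JSON-RPC 2.0 requests with the method `tools/call`. The helper names `build_tool_call` and `parse_response`, and the `{"command": "ls"}` argument shape, are assumptions for illustration rather than NACC's actual code.

```python
import json
from itertools import count

_ids = count(1)  # each request on a connection gets a fresh id


def build_tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


def parse_response(raw: str) -> dict:
    """Return the `result` of a JSON-RPC response, or raise on `error`."""
    message = json.loads(raw)
    if "error" in message:
        raise RuntimeError(message["error"].get("message", "unknown error"))
    return message["result"]


# The Main Space would POST this body to the VM Space over HTTPS.
print(build_tool_call("execute_command", {"command": "ls"}))
```

Keeping serialization separate from transport means the same two functions work whether the peer is reached over HTTPS or called in-process during tests.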
Custom MCP Implementation
Built the entire MCP stack from scratch:

    class MCPServer:
        def __init__(self):
            # Tool registry: name -> metadata plus the async handler
            self.tools = {}

        def register_tool(self, name, schema, handler):
            """Register MCP-compliant tools"""
            self.tools[name] = {
                "name": name,
                "description": schema["description"],
                "inputSchema": schema["parameters"],
                "handler": handler,
            }

        async def handle_call_tool(self, name, arguments):
            """Execute tools via MCP protocol"""
            return await self.tools[name]["handler"](**arguments)

Why from scratch? Security, performance, and a deeper understanding of the protocol. No black boxes on critical infrastructure.
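For readers who want to see the registry idea end to end, here is a self-contained sketch under stated assumptions. `MiniMCPServer`, `list_tools`, and the `echo` tool are hypothetical names invented for this example. It shows metadata being listed without exposing handlers, and a call being dispatched to an async handler.

```python
import asyncio


class MiniMCPServer:
    """Illustrative only: a hypothetical, trimmed-down tool registry."""

    def __init__(self):
        self.tools = {}

    def register_tool(self, name, schema, handler):
        self.tools[name] = {"schema": schema, "handler": handler}

    def list_tools(self):
        # Expose metadata only; handlers stay on the server side.
        return [
            {
                "name": name,
                "description": tool["schema"]["description"],
                "inputSchema": tool["schema"]["parameters"],
            }
            for name, tool in self.tools.items()
        ]

    async def call_tool(self, name, arguments):
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return await self.tools[name]["handler"](**arguments)


async def echo(text: str) -> str:  # hypothetical demo tool
    return text.upper()


server = MiniMCPServer()
server.register_tool(
    "echo",
    {
        "description": "Upper-case the input",
        "parameters": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
    echo,
)
print(server.list_tools()[0]["name"])
print(asyncio.run(server.call_tool("echo", {"text": "hi"})))
```

Rejecting unknown tool names explicitly gives the caller a clear error instead of a bare lookup failure deep inside the dispatcher.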
Cross-Space Node Management
The innovation: Abstract multiple HF Spaces as "nodes" in a distributed system.

    class NodeManager:
        # LocalNode and RemoteNode are not shown in this excerpt
        nodes = {
            "hf-space-local": LocalNode(),
            "vm-node-01": RemoteNode("https://huggingface.co/spaces/.../NACC-VM"),
        }

        async def route_command(self, node_id, command):
            """Seamlessly execute commands on any node"""
            node = self.nodes[node_id]
            return await node.execute(command)

What this proves: You can build real distributed system demos without expensive infrastructure.
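As a rough illustration of the node abstraction, the sketch below pairs a router with one runnable local node. It is written under assumptions: the `Router` class, the argv-list interface, and this `LocalNode` body are hypothetical and are not NACC's implementation. A remote node would expose the same `execute` method but forward the call over HTTPS.

```python
import asyncio
import sys


class LocalNode:
    """Runs an argv list on the current machine (hypothetical stand-in)."""

    async def execute(self, argv: list) -> str:
        proc = await asyncio.create_subprocess_exec(
            *argv,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,
        )
        out, _ = await proc.communicate()
        return out.decode().strip()


class Router:
    def __init__(self, nodes: dict):
        self.nodes = nodes

    async def route_command(self, node_id: str, argv: list) -> str:
        try:
            node = self.nodes[node_id]
        except KeyError:
            raise ValueError(f"unknown node: {node_id}") from None
        return await node.execute(argv)


router = Router({"hf-space-local": LocalNode()})
# Use the current interpreter so the example does not depend on shell tools.
output = asyncio.run(
    router.route_command("hf-space-local", [sys.executable, "-c", "print('ok')"])
)
print(output)
```

Because every node only has to provide `execute`, adding a new kind of node does not change the routing code.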
Real-World Applications
For Enterprises
- Multi-cloud deployment orchestration: "Deploy frontend to AWS, backend to GCP, run tests"
- Security automation: "Scan all prod systems, compile vulnerability report"
- Incident response: "Find root cause, collect logs, trigger alerts"
For DevOps Teams
- Kubernetes orchestration: Manage clusters through conversation
- Database migrations: Coordinate across multiple database instances
- Infrastructure provisioning: "Set up 3-tier app on AWS, configure auto-scaling"
For AI/ML Engineers
- Model deployment: "Deploy v2.0 to 50 edge devices, rollback if accuracy < 90%"
- Distributed training: Orchestrate multi-node training jobs
- A/B testing: "Run experiment on 25% of prod, monitor metrics"
The Technical Challenges (What I Learned)
Challenge 1: State Management Across Boundaries
Problem: Each HF Space is stateless. How do you maintain context when commands span multiple spaces?
Solution: Session-based state persistence with context injection

    class SessionManager:
        sessions = {}

        def get_context(self, session_id):
            # Look up the stored context; the values below illustrate its shape
            return self.sessions.get(session_id, {
                "current_node": "vm-node-01",  # Remember which node user is on
                "current_path": "/app/src",    # Remember directory
                "last_command": "ls",          # Remember history
            })

Impact: Users can seamlessly switch between nodes, and the system remembers context.
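One way to picture "session-based state persistence with context injection" is the sketch below. It assumes an in-memory store, which would be lost whenever the Space restarts. `SessionContext`, `SessionStore`, and `inject_context` are hypothetical names, and the bracketed prefix format is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SessionContext:
    current_node: str = "hf-space-local"
    current_path: str = "/app"
    history: list = field(default_factory=list)


class SessionStore:
    """In-memory store; hypothetical, and lost when the process restarts."""

    def __init__(self):
        self._sessions = {}

    def get(self, session_id: str) -> SessionContext:
        # Create a fresh context the first time a session id is seen.
        return self._sessions.setdefault(session_id, SessionContext())

    def record(self, session_id: str, command: str, node=None, path=None):
        ctx = self.get(session_id)
        ctx.history.append(command)
        if node is not None:
            ctx.current_node = node
        if path is not None:
            ctx.current_path = path
        return ctx


def inject_context(ctx: SessionContext, user_message: str) -> str:
    """Prefix the user's message with the remembered state."""
    last = ctx.history[-1] if ctx.history else "none"
    return (
        f"[node={ctx.current_node} path={ctx.current_path} last={last}]\n"
        f"{user_message}"
    )


store = SessionStore()
store.record("s1", "switch to vm-node-01", node="vm-node-01")
print(inject_context(store.get("s1"), "read file demo.txt"))
```

The prefix gives the model the current node and path on every turn, so a follow-up like "read file demo.txt" can be resolved without the user repeating where they are.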
Challenge 2: Security Without Sacrificing Functionality
Problem: Can't give full root access in a public demo, but also can't cripple the system.
Solution: Intelligent whitelisting + path restrictions

    ALLOWED_COMMANDS = {"ls", "cat", "python3", "find", "grep", "wc"}
    ROOT_DIR = "/app"  # All operations sandboxed
    TIMEOUT = 30       # Prevent runaway processes

Result: Safe for public use, still demonstrates core capabilities.
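Here is a sketch of how those constants could be enforced before anything runs. The `validate` function is hypothetical, and the path check is purely lexical, so symlinks would still need a real-path check on the node itself. It also treats every non-flag argument as a path, which is deliberately conservative.

```python
import posixpath
import shlex

ALLOWED_COMMANDS = {"ls", "cat", "python3", "find", "grep", "wc"}
ROOT_DIR = "/app"


def validate(command: str, cwd: str = ROOT_DIR) -> list:
    """Return the argv for `command`, or raise ValueError if it is not allowed."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {command!r}")
    for arg in argv[1:]:
        if arg.startswith("-"):
            continue  # flags are passed through unchecked in this sketch
        # Normalize "..", "." and absolute paths, then require ROOT_DIR as prefix.
        resolved = posixpath.normpath(posixpath.join(cwd, arg))
        if resolved != ROOT_DIR and not resolved.startswith(ROOT_DIR + "/"):
            raise ValueError(f"path escapes {ROOT_DIR}: {arg!r}")
    return argv


print(validate("ls src"))
```

The returned argv list can then be passed to `subprocess.run(argv, timeout=TIMEOUT)` without a shell, so characters such as `;` or `|` are never interpreted. An allowlist on the executable name alone does not limit what `python3` or `find -exec` can do, so those two would need extra restrictions in anything beyond a demo.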
Challenge 3: MCP Protocol Compliance at Scale
Problem: Implementing MCP for multi-node orchestration wasn't in any documentation.
Solution: Custom tool definitions that scale horizontally

    tools.register("execute_command", execute_on_node)
    tools.register("read_file", read_from_node)
    tools.register("switch_node", change_active_node)
    tools.register("sync_files", multi_node_sync)  # Custom innovation

Learning: Standard protocols need custom extensions for novel use cases. That's okay, because it shows deep understanding.
Why This Matters for Hiring/Collaboration
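The post does not show the body of `multi_node_sync`, so the following is only a guess at its shape. It copies one file from a source node to several targets concurrently and reports a per-node outcome. `MemoryNode` and the `read_file`/`write_file` methods are hypothetical stand-ins for real nodes.

```python
import asyncio


class MemoryNode:
    """Hypothetical in-memory node used only to exercise the sync logic."""

    def __init__(self, files=None, writable=True):
        self.files = dict(files or {})
        self.writable = writable

    async def read_file(self, path):
        return self.files[path]

    async def write_file(self, path, content):
        if not self.writable:
            raise PermissionError("node is read-only")
        self.files[path] = content


async def multi_node_sync(nodes, source, targets, path):
    """Copy `path` from `source` to each target; report the outcome per node."""
    content = await nodes[source].read_file(path)
    outcomes = await asyncio.gather(
        *(nodes[t].write_file(path, content) for t in targets),
        return_exceptions=True,  # one failing node should not abort the rest
    )
    return {
        t: f"failed: {o}" if isinstance(o, Exception) else "ok"
        for t, o in zip(targets, outcomes)
    }


nodes = {
    "a": MemoryNode({"demo.txt": "hi"}),
    "b": MemoryNode(),
    "c": MemoryNode(writable=False),
}
result = asyncio.run(multi_node_sync(nodes, "a", ["b", "c"], "demo.txt"))
print(result)
```

Returning a status per node rather than raising on the first error lets the agent tell the user exactly which nodes are out of sync.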
What NACC Demonstrates
- Full-Stack Capability: Architected, implemented, deployed, and documented a complex system
- Problem-Solving Under Constraints: Turned platform limitations into innovation
- Deep Protocol Understanding: Built MCP from scratch to prove comprehension
- Production Thinking: Security, scalability, state management, not just "does it work"
- Communication: Explained complex tech to multiple audiences
- Grit: Balanced hackathon coding with semester exams and still shipped
For Potential Collaborators
NACC is open for contributions. If you're interested in:
- Kubernetes integration
- Real-time monitoring dashboard
- Security hardening
- Blockchain node orchestration
- Plugin system for custom tools
Let's build this together. This is just the v1 demo.
The Vision: Generational Shift
In 5 years, I believe:
Infrastructure code becomes infrastructure conversation
- No more Terraform for beginners; just ask AI agents
- Democratizes DevOps for junior engineers
AI agents become first-class infrastructure citizens
- Like Docker changed deployment, agentic interfaces change orchestration
- Every company runs autonomous AI orchestration
MCP becomes the standard infrastructure protocol
- HTTP standardized the web
- MCP will standardize AI-to-infrastructure communication
NACC is my bet that this shift is coming, and it starts with proving it's possible.
Get Involved
Try NACC
- Main Space: Live Demo
- GitHub: Open Source
- Quick Commands:
list nodes → switch to vm-node-01 → read file demo.txt
Collaborate
Interested in working on NACC together?
- Internship opportunities: I'm open to summer/full-time roles in infrastructure/DevOps/AI
- Collaboration: Let's build the orchestration future
- Discussion: Questions about agentic systems, MCP, or distributed infrastructure?
Connect on LinkedIn: @vasanthadithya-mundrathi
Code on GitHub: @Vasanthadithya-mundrathi
The Bottom Line
NACC isn't just a hackathon project. It's proof that AI can be the reasoning layer for infrastructure orchestration.
For recruiters: I ship complex systems. I learn fast. I think deeply about problems.
For collaborators: Let's build the future of infrastructure together.
For the industry: This is coming. Will you be ready?
Built by: Vasanthadithya Mundrathi (3rd year CS student, CBIT Hyderabad)
For: MCP 1st Birthday Hackathon
With: Blaxel, HuggingFace, Anthropic, Gradio
Goal: Prove that conversational AI is the future of infrastructure