Features:
- Automated datacenter documentation generation
- MCP integration for device connectivity
- Auto-remediation engine with safety checks
- Multi-factor reliability scoring (0-100%)
- Human feedback learning loop
- Pattern recognition and continuous improvement
- Agentic chat support with AI
- API for ticket resolution
- Frontend React with Material-UI
- CI/CD pipelines (GitLab + Gitea)
- Docker & Kubernetes deployment
- Complete documentation and guides
v2.0 Highlights:
- Auto-remediation with write operations (disabled by default)
- Reliability calculator with 4-factor scoring
- Human feedback system for continuous learning
- Pattern-based progressive automation
- Approval workflow for critical actions
- Full audit trail and rollback capability
🚀 Datacenter Documentation System - Complete Integration
A complete system for automated management of datacenter documentation, with:
- ✅ MCP Integration - Device connectivity via the Model Context Protocol
- ✅ REST API - Automatic ticket resolution
- ✅ Agentic Chat - AI-powered technical support
- ✅ CI/CD Pipelines - GitLab and Gitea
- ✅ Container Ready - Docker and Kubernetes
- ✅ Production Ready - Monitoring, logging, scalability
📐 System Architecture
┌─────────────────────────────────────────────────────────────┐
│                      External Systems                       │
│  Ticket Systems │ Monitoring │ Users │ Chat Interface       │
└─────────────────┬───────────────────────┬───────────────────┘
                  │                       │
         ┌────────▼────────┐     ┌────────▼────────┐
         │   API Service   │     │  Chat Service   │
         │    (FastAPI)    │     │   (WebSocket)   │
         └────────┬────────┘     └────────┬────────┘
                  │                       │
           ┌──────▼───────────────────────▼──────┐
           │  Documentation Agent (AI)           │
           │  - Vector Search (ChromaDB)         │
           │  - Claude Sonnet 4.5                │
           │  - Autonomous Doc Retrieval         │
           └──────┬──────────────────────────────┘
                  │
         ┌────────▼────────┐
         │   MCP Client    │
         └────────┬────────┘
                  │
    ┌─────────────▼──────────────┐
    │         MCP Server         │
    │   (Device Connectivity)    │
    └────┬────┬────┬────┬────┬───┘
         │    │    │    │    │
    ┌────▼┐ ┌─▼──┐ ┌▼─┐ ┌▼──┐ ┌▼───┐
    │VMware│ │K8s │ │OS│ │Net│ │Stor│
    └─────┘ └────┘ └──┘ └───┘ └────┘
🎯 Key Features
1️⃣ Ticket Resolution API
# Submit a ticket automatically
curl -X POST https://docs.company.local/api/v1/tickets \
  -H "Content-Type: application/json" \
  -d '{
    "ticket_id": "INC-12345",
    "title": "Network connectivity issue",
    "description": "Cannot ping 10.0.20.5 from VLAN 100",
    "priority": "high",
    "category": "network"
  }'
# Response
{
  "ticket_id": "INC-12345",
  "status": "resolved",
  "resolution": "Check switch port configuration...",
  "suggested_actions": [
    "Verify VLAN 100 configuration on core switch",
    "Check inter-VLAN routing",
    "Verify ACLs on firewall"
  ],
  "confidence_score": 0.92,
  "related_docs": [...]
}
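A caller will usually want to decide whether a resolution can be applied automatically or should go to a human. The sketch below gates on the `status` and `confidence_score` fields shown in the response above. The `triage` helper and the 0.85 threshold are assumptions made for illustration, not part of the API.

```python
from typing import Any

# Hypothetical threshold: the API does not define one, tune it for your environment.
AUTO_ACCEPT_THRESHOLD = 0.85


def triage(response: dict[str, Any], threshold: float = AUTO_ACCEPT_THRESHOLD) -> str:
    """Return 'auto' when the resolution looks safe to apply, otherwise 'review'."""
    resolved = response.get("status") == "resolved"
    confident = float(response.get("confidence_score", 0.0)) >= threshold
    return "auto" if resolved and confident else "review"


example = {
    "ticket_id": "INC-12345",
    "status": "resolved",
    "confidence_score": 0.92,
}
print(triage(example))
```

Anything that falls below the threshold, or that is missing a score entirely, is routed to review, so a malformed response never triggers automation.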
2️⃣ Agentic Chat
// WebSocket connection
const ws = new WebSocket('wss://docs.company.local/chat');
// Send only once the connection is open: send() throws while the socket is still connecting
ws.onopen = () => {
  ws.send(JSON.stringify({
    type: 'message',
    content: 'How do I check UPS battery status?'
  }));
};
// AI searches documentation autonomously and responds
ws.onmessage = (event) => {
  const response = JSON.parse(event.data);
  // {
  //   message: "To check UPS battery status...",
  //   related_docs: [...],
  //   confidence: 0.95
  // }
};
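For non-browser clients, the same frames can be built and decoded in Python. This is a minimal sketch of the payload handling only, based on the field names in the example above. The `ChatReply` class and both helper functions are hypothetical, and the real protocol may carry additional fields.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ChatReply:
    """Fields taken from the example reply above; others may exist."""
    message: str
    confidence: float = 0.0
    related_docs: list = field(default_factory=list)


def build_chat_message(content: str) -> str:
    """Serialize a user question in the shape used by the WebSocket example."""
    return json.dumps({"type": "message", "content": content})


def parse_chat_reply(raw: str) -> ChatReply:
    """Decode a server frame, tolerating missing optional fields."""
    data = json.loads(raw)
    return ChatReply(
        message=data["message"],
        confidence=float(data.get("confidence", 0.0)),
        related_docs=list(data.get("related_docs", [])),
    )


reply = parse_chat_reply('{"message": "To check UPS battery status...", "confidence": 0.95}')
print(reply.confidence)
```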
3️⃣ MCP Integration
import asyncio

from datacenter_docs.mcp.client import MCPClient, MCPCollector


async def main():
    async with MCPClient(
        server_url="https://mcp.company.local",
        api_key="your-api-key"
    ) as mcp:
        # Query VMware
        vms = await mcp.query_vmware("vcenter-01", "list_vms")
        # Query Kubernetes
        pods = await mcp.query_kubernetes("prod-cluster", "all", "pods")
        # Execute network commands
        output = await mcp.exec_network_command(
            "core-sw-01",
            ["show vlan brief"]
        )


# `async with` is only valid inside a coroutine, so run it through an event loop
asyncio.run(main())
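Because the client methods are coroutines, independent queries can be issued concurrently rather than one after another. The sketch below shows the pattern with `asyncio.gather`. `FakeMCP` is a hypothetical stand-in that only mimics the method names above so the example runs without a server. Its return values are made up, and the real client's return types may differ.

```python
import asyncio


class FakeMCP:
    """Offline stand-in for MCPClient (hypothetical data, same method names)."""

    async def query_vmware(self, vcenter: str, query: str) -> list[str]:
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return ["vm-01", "vm-02"]

    async def query_kubernetes(self, cluster: str, namespace: str, resource: str) -> list[str]:
        await asyncio.sleep(0)
        return ["pod-a"]


async def collect_inventory(mcp) -> dict[str, list[str]]:
    """Run both queries concurrently and merge the results into one dict."""
    vms, pods = await asyncio.gather(
        mcp.query_vmware("vcenter-01", "list_vms"),
        mcp.query_kubernetes("prod-cluster", "all", "pods"),
    )
    return {"vms": vms, "pods": pods}


inventory = asyncio.run(collect_inventory(FakeMCP()))
print(inventory)
```

`asyncio.gather` returns results in the order the awaitables were passed, which is why the tuple unpacking is safe.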
🛠️ Setup and Deployment
Prerequisites
- Python 3.10+
- Poetry 1.7+
- Docker & Docker Compose
- Kubernetes cluster (for production)
- MCP Server running
- Anthropic API key
1. Local Development
# Clone repository
git clone https://git.company.local/infrastructure/datacenter-docs.git
cd datacenter-docs
# Install dependencies with Poetry
poetry install
# Configuration
cp .env.example .env
# Edit .env with your credentials
# Start the database and Redis
docker-compose up -d postgres redis
# Run migrations
poetry run alembic upgrade head
# Index documentation
poetry run datacenter-docs index-docs --path ./output
# Start API
poetry run uvicorn datacenter_docs.api.main:app --reload
# Start Chat (in another terminal)
poetry run python -m datacenter_docs.chat.server
# Start Worker (in another terminal)
poetry run celery -A datacenter_docs.workers.celery_app worker --loglevel=info
2. Docker Compose (All-in-one)
# Build and start all services
docker-compose up -d
# Check logs
docker-compose logs -f api chat worker
# Access services
# API: http://localhost:8000
# Chat: http://localhost:8001
# Frontend: http://localhost
# Flower (Celery monitoring): http://localhost:5555
3. Kubernetes Production
# Apply manifests
kubectl apply -f deploy/kubernetes/namespace.yaml
kubectl apply -f deploy/kubernetes/secrets.yaml # Create this first
kubectl apply -f deploy/kubernetes/configmap.yaml
kubectl apply -f deploy/kubernetes/deployment.yaml
kubectl apply -f deploy/kubernetes/service.yaml
kubectl apply -f deploy/kubernetes/ingress.yaml
# Check status
kubectl get pods -n datacenter-docs
kubectl logs -n datacenter-docs deployment/api
# Scale
kubectl scale deployment api --replicas=5 -n datacenter-docs
🔄 CI/CD Pipelines
GitLab CI
# .gitlab-ci.yml
stages: [lint, test, build, deploy]
# Automatic on push to main:
# - Lint code
# - Run tests
# - Build Docker images
# - Deploy to staging
# - Manual deploy to production
Gitea Actions
# .gitea/workflows/ci.yml
# Triggers:
# - Push to main/develop
# - Pull requests
# - Schedule (every 6 hours for docs generation)
# Actions:
# - Lint, test, security scan
# - Build multi-arch images
# - Deploy to K8s
# - Generate documentation
📡 API Endpoints
Ticket Management
POST /api/v1/tickets Create & process ticket
GET /api/v1/tickets/{ticket_id} Get ticket status
GET /api/v1/stats/tickets Get statistics
Documentation
POST /api/v1/documentation/search Search docs
POST /api/v1/documentation/generate/{sec} Generate section
GET /api/v1/documentation/sections List sections
Health & Monitoring
GET /health Health check
GET /metrics Prometheus metrics
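A client that submits a ticket and then checks on it later can poll `GET /api/v1/tickets/{ticket_id}`. The sketch below is one possible shape for that loop. The `"processing"` status value, the helper names, and the retry defaults are assumptions, since the response schema for this endpoint is not documented here. The HTTP call is injected as `fetch` so the logic can be exercised without a live server.

```python
import time
from typing import Callable
from urllib.parse import quote

BASE_URL = "https://docs.company.local"  # host used in the examples in this README


def ticket_url(ticket_id: str) -> str:
    """Build the status URL for GET /api/v1/tickets/{ticket_id}."""
    return f"{BASE_URL}/api/v1/tickets/{quote(ticket_id, safe='')}"


def wait_for_ticket(
    ticket_id: str,
    fetch: Callable[[str], dict],
    attempts: int = 5,
    delay: float = 1.0,
) -> dict:
    """Poll until the ticket leaves the assumed 'processing' state."""
    url = ticket_url(ticket_id)
    for attempt in range(attempts):
        body = fetch(url)
        if body.get("status") != "processing":
            return body
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before the next poll
    raise TimeoutError(f"{ticket_id} still processing after {attempts} attempts")
```

In real use, `fetch` could be something like `lambda url: requests.get(url, timeout=10).json()`, plus whatever authentication your deployment requires.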
🤖 Chat Interface Usage
Web Chat
Open https://docs.company.local/chat
Features:
- 💬 Real-time chat with AI
- 📚 Autonomous documentation search
- 🎯 Contextual suggestions
- 📎 File/ticket upload
- 💾 Conversation history
Integration with External Systems
# Python example
import requests
response = requests.post(
    'https://docs.company.local/api/v1/tickets',
    json={
        'ticket_id': 'EXT-12345',
        'title': 'Storage issue',
        'description': 'Datastore running out of space',
        'category': 'storage'
    }
)
resolution = response.json()
print(resolution['resolution'])
print(resolution['suggested_actions'])
// JavaScript example
const response = await fetch('https://docs.company.local/api/v1/tickets', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    ticket_id: 'EXT-12345',
    title: 'Storage issue',
    description: 'Datastore running out of space',
    category: 'storage'
  })
});
const resolution = await response.json();
🔐 Security
Authentication
- API Key based authentication
- JWT tokens for chat sessions
- MCP server credentials secured in vault
Secrets Management
# Kubernetes secrets
kubectl create secret generic datacenter-secrets \
--from-literal=database-url='postgresql://...' \
--from-literal=redis-url='redis://...' \
--from-literal=mcp-api-key='...' \
--from-literal=anthropic-api-key='...' \
-n datacenter-docs
# Docker secrets
docker secret create mcp_api_key ./mcp_key.txt
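On the application side, a secret created this way has to be read back at startup. Docker mounts swarm secrets as files under `/run/secrets/<name>`, while Kubernetes secrets are commonly exposed as environment variables. The sketch below tries the file first and falls back to the environment. The `read_secret` helper and the name-to-variable mapping (`mcp-api-key` to `MCP_API_KEY`) are assumptions, not the project's actual loader.

```python
import os
from pathlib import Path


def read_secret(name: str, secrets_dir: str = "/run/secrets") -> str:
    """Return a secret from a mounted file, falling back to an environment variable."""
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        return secret_file.read_text(encoding="utf-8").strip()
    # Assumed convention: 'mcp-api-key' or 'mcp_api_key' maps to MCP_API_KEY.
    env_name = name.upper().replace("-", "_")
    value = os.environ.get(env_name)
    if value is None:
        raise KeyError(f"secret {name!r} not found in {secrets_dir} or ${env_name}")
    return value
```

Failing loudly when a secret is missing is deliberate: a service that starts with an empty API key tends to fail later in less obvious ways.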
Network Security
- All communications over TLS
- Network policies in Kubernetes
- Rate limiting enabled
- CORS properly configured
📊 Monitoring & Observability
Metrics (Prometheus)
# Exposed at /metrics
datacenter_docs_tickets_total
datacenter_docs_tickets_resolved_total
datacenter_docs_resolution_confidence_score
datacenter_docs_processing_time_seconds
datacenter_docs_api_requests_total
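The two ticket counters can be combined into a resolution rate. The sketch below parses Prometheus text exposition output with the standard library and divides resolved by total. It assumes samples carry no trailing timestamp, and the `category` label in the sample is hypothetical. In production this ratio would more typically be computed in a Prometheus query.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Sum sample values per metric name, ignoring labels and comment lines."""
    totals: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, value = line.rsplit(None, 1)  # value is the last token (no timestamps assumed)
        name = series.split("{", 1)[0].strip()
        totals[name] = totals.get(name, 0.0) + float(value)
    return totals


def resolution_rate(text: str) -> float:
    """Fraction of tickets resolved, or 0.0 when no tickets were counted."""
    metrics = parse_metrics(text)
    total = metrics.get("datacenter_docs_tickets_total", 0.0)
    resolved = metrics.get("datacenter_docs_tickets_resolved_total", 0.0)
    return resolved / total if total else 0.0


SAMPLE = """\
# TYPE datacenter_docs_tickets_total counter
datacenter_docs_tickets_total{category="network"} 40
datacenter_docs_tickets_total{category="storage"} 10
datacenter_docs_tickets_resolved_total{category="network"} 30
datacenter_docs_tickets_resolved_total{category="storage"} 5
"""
print(resolution_rate(SAMPLE))
```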
Logging
# Structured logging in JSON
{
  "timestamp": "2025-01-15T10:30:00Z",
  "level": "INFO",
  "service": "api",
  "event": "ticket_resolved",
  "ticket_id": "INC-12345",
  "confidence": 0.92,
  "processing_time": 2.3
}
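Log lines in this shape can be produced with the standard `logging` module and a custom formatter. This is a minimal sketch, not necessarily how the project emits its logs. `JsonFormatter` and the `extra={"fields": ...}` convention are assumptions introduced for the example.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "service": self.service,
            "event": record.getMessage(),
        }
        # Extra fields arrive via extra={"fields": {...}} (a convention of this sketch).
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter(service="api"))
logger = logging.getLogger("datacenter_docs.example")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("ticket_resolved", extra={"fields": {"ticket_id": "INC-12345", "confidence": 0.92}})
```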
Tracing
- OpenTelemetry integration
- Distributed tracing across services
- Jaeger UI for visualization
🧪 Testing
# Unit tests
poetry run pytest tests/unit -v --cov
# Integration tests
poetry run pytest tests/integration -v
# E2E tests
poetry run pytest tests/e2e -v
# Load testing
poetry run locust -f tests/load/locustfile.py
🔧 Configuration
Environment Variables
# Core
DATABASE_URL=postgresql://user:pass@host:5432/db
REDIS_URL=redis://:pass@host:6379/0
# MCP
MCP_SERVER_URL=https://mcp.company.local
MCP_API_KEY=your_mcp_key
# AI
ANTHROPIC_API_KEY=your_anthropic_key
# Optional
LOG_LEVEL=INFO
DEBUG=false
WORKERS=4
MAX_TOKENS=4096
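One way to consume these variables is to load them once at startup into a typed object and fail early if a required one is missing. The sketch below is illustrative: the `Settings` class and `load_settings` function are hypothetical, and treating the values listed under "Optional" as defaults is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = (
    "DATABASE_URL",
    "REDIS_URL",
    "MCP_SERVER_URL",
    "MCP_API_KEY",
    "ANTHROPIC_API_KEY",
)


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    mcp_server_url: str
    mcp_api_key: str
    anthropic_api_key: str
    log_level: str = "INFO"
    debug: bool = False
    workers: int = 4
    max_tokens: int = 4096


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Validate required variables, then parse optional ones with assumed defaults."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        mcp_server_url=env["MCP_SERVER_URL"],
        mcp_api_key=env["MCP_API_KEY"],
        anthropic_api_key=env["ANTHROPIC_API_KEY"],
        log_level=env.get("LOG_LEVEL", "INFO"),
        debug=env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
        workers=int(env.get("WORKERS", "4")),
        max_tokens=int(env.get("MAX_TOKENS", "4096")),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to unit test.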
📚 Documentation
- /docs - API documentation (Swagger/OpenAPI)
- /redoc - Alternative API documentation
- QUICK_START.md - Quick start guide
- ARCHITECTURE.md - System architecture
- DEPLOYMENT.md - Deployment guide
🤝 Contributing
- Create feature branch: git checkout -b feature/amazing-feature
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open Pull Request
- CI/CD runs automatically
- Merge after approval
📝 License
MIT License - see LICENSE file
🆘 Support
- Email: automation-team@company.local
- Slack: #datacenter-automation
- Issues: https://git.company.local/infrastructure/datacenter-docs/issues
🎯 Roadmap
- MCP Integration
- Ticket resolution API
- Agentic chat
- CI/CD pipelines
- Docker & Kubernetes
- Multi-language support
- Advanced analytics dashboard
- Mobile app
- Voice interface
- Automated remediation
Powered by Claude Sonnet 4.5 & MCP 🚀