Initial commit: LLM Automation Docs & Remediation Engine v2.0

Features:
- Automated datacenter documentation generation
- MCP integration for device connectivity
- Auto-remediation engine with safety checks
- Multi-factor reliability scoring (0-100%)
- Human feedback learning loop
- Pattern recognition and continuous improvement
- Agentic chat support with AI
- API for ticket resolution
- Frontend React with Material-UI
- CI/CD pipelines (GitLab + Gitea)
- Docker & Kubernetes deployment
- Complete documentation and guides

v2.0 Highlights:
- Auto-remediation with write operations (disabled by default)
- Reliability calculator with 4-factor scoring
- Human feedback system for continuous learning
- Pattern-based progressive automation
- Approval workflow for critical actions
- Full audit trail and rollback capability
464
README_COMPLETE_SYSTEM.md
Normal file
@@ -0,0 +1,464 @@
# 🚀 Datacenter Documentation System - Complete Integration

A complete system for automated management of datacenter documentation, with:
- ✅ **MCP Integration** - Device connectivity via the Model Context Protocol
- ✅ **REST API** - Automatic ticket resolution
- ✅ **Agentic Chat** - AI-powered technical support
- ✅ **CI/CD Pipelines** - GitLab and Gitea
- ✅ **Container Ready** - Docker and Kubernetes
- ✅ **Production Ready** - Monitoring, logging, scalability

---

## 📐 System Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                      External Systems                       │
│  Ticket Systems │ Monitoring │ Users │ Chat Interface       │
└─────────────────┬───────────────────────┬───────────────────┘
                  │                       │
         ┌────────▼────────┐     ┌────────▼────────┐
         │   API Service   │     │  Chat Service   │
         │    (FastAPI)    │     │   (WebSocket)   │
         └────────┬────────┘     └────────┬────────┘
                  │                       │
           ┌──────▼───────────────────────▼──────┐
           │      Documentation Agent (AI)       │
           │  - Vector Search (ChromaDB)         │
           │  - Claude Sonnet 4.5                │
           │  - Autonomous Doc Retrieval         │
           └──────┬──────────────────────────────┘
                  │
         ┌────────▼────────┐
         │   MCP Client    │
         └────────┬────────┘
                  │
    ┌─────────────▼──────────────┐
    │         MCP Server         │
    │   (Device Connectivity)    │
    └────┬────┬────┬────┬────┬───┘
         │    │    │    │    │
    ┌────▼┐ ┌─▼──┐ ┌▼─┐ ┌▼──┐ ┌▼───┐
    │VMware│ │K8s │ │OS│ │Net│ │Stor│
    └─────┘ └────┘ └──┘ └───┘ └────┘
```

---

## 🎯 Key Features

### 1️⃣ Ticket Resolution API
```bash
# Submit a ticket automatically
curl -X POST https://docs.company.local/api/v1/tickets \
  -H "Content-Type: application/json" \
  -d '{
    "ticket_id": "INC-12345",
    "title": "Network connectivity issue",
    "description": "Cannot ping 10.0.20.5 from VLAN 100",
    "priority": "high",
    "category": "network"
  }'

# Response (shown as comments so the block stays copy-paste safe)
# {
#   "ticket_id": "INC-12345",
#   "status": "resolved",
#   "resolution": "Check switch port configuration...",
#   "suggested_actions": [
#     "Verify VLAN 100 configuration on core switch",
#     "Check inter-VLAN routing",
#     "Verify ACLs on firewall"
#   ],
#   "confidence_score": 0.92,
#   "related_docs": [...]
# }
```
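
A caller will usually want to act differently on a high-confidence resolution than on a doubtful one. The following is a minimal sketch of that decision, not part of the project: the `triage` helper and the `0.8` threshold are assumptions made for illustration, and only the `status` and `confidence_score` field names come from the response above.

```python
from typing import Any

# Hypothetical cut-off; tune it to your own risk tolerance.
DEFAULT_THRESHOLD = 0.8


def triage(response: dict[str, Any], threshold: float = DEFAULT_THRESHOLD) -> str:
    """Return 'auto' when the resolution looks trustworthy, else 'review'."""
    resolved = response.get("status") == "resolved"
    confidence = float(response.get("confidence_score", 0.0))
    return "auto" if resolved and confidence >= threshold else "review"


# Sample payload shaped like the response shown above.
example = {
    "ticket_id": "INC-12345",
    "status": "resolved",
    "confidence_score": 0.92,
    "suggested_actions": ["Verify VLAN 100 configuration on core switch"],
}
decision = triage(example)
```

Anything that is not both resolved and above the threshold falls back to human review, which is the safer default when a field is missing.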

### 2️⃣ Agentic Chat
```javascript
// WebSocket connection
const ws = new WebSocket('wss://docs.company.local/chat');

// Send only once the connection is open; calling send() earlier throws
ws.onopen = () => {
  ws.send(JSON.stringify({
    type: 'message',
    content: 'How do I check UPS battery status?'
  }));
};

// AI searches documentation autonomously and responds
ws.onmessage = (event) => {
  const response = JSON.parse(event.data);
  // {
  //   message: "To check UPS battery status...",
  //   related_docs: [...],
  //   confidence: 0.95
  // }
};
```
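
For a Python client, the same frames can be built and decoded with the standard library. This is a sketch under assumptions: the helper names `build_message` and `parse_reply` and the `ChatReply` class are hypothetical, the field names are taken from the example above, and no socket is opened here.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ChatReply:
    """Decoded reply frame; fields mirror the example payload above."""
    message: str
    confidence: float
    related_docs: list = field(default_factory=list)


def build_message(content: str) -> str:
    """Serialize a user message into the JSON frame sent over the socket."""
    return json.dumps({"type": "message", "content": content})


def parse_reply(raw: str) -> ChatReply:
    """Decode a JSON reply frame, tolerating missing optional fields."""
    data = json.loads(raw)
    return ChatReply(
        message=data["message"],
        confidence=float(data.get("confidence", 0.0)),
        related_docs=list(data.get("related_docs", [])),
    )
```

Keeping the framing separate from the transport makes it easy to unit-test without a running chat service.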

### 3️⃣ MCP Integration
```python
import asyncio

from datacenter_docs.mcp.client import MCPClient, MCPCollector


async def main():
    async with MCPClient(
        server_url="https://mcp.company.local",
        api_key="your-api-key"
    ) as mcp:
        # Query VMware
        vms = await mcp.query_vmware("vcenter-01", "list_vms")

        # Query Kubernetes
        pods = await mcp.query_kubernetes("prod-cluster", "all", "pods")

        # Execute network commands
        output = await mcp.exec_network_command(
            "core-sw-01",
            ["show vlan brief"]
        )


# `async with` is only valid inside a coroutine, so run it via asyncio
asyncio.run(main())
```
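
When no MCP server is available, an in-memory stand-in can be handy for trying out code that consumes these calls. The sketch below is hypothetical throughout: `FakeMCPClient`, `collect_inventory`, and the canned data are invented for illustration, and the method signatures only mirror the calls shown above, so the real `MCPClient` may differ.

```python
import asyncio


class FakeMCPClient:
    """In-memory stand-in that mimics the three calls used above."""

    def __init__(self, canned: dict):
        self._canned = canned
        self.calls: list[tuple] = []  # record of every call, for assertions

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions

    async def query_vmware(self, vcenter: str, action: str):
        self.calls.append(("vmware", vcenter, action))
        return self._canned.get(("vmware", action), [])

    async def query_kubernetes(self, cluster: str, namespace: str, resource: str):
        self.calls.append(("kubernetes", cluster, namespace, resource))
        return self._canned.get(("kubernetes", resource), [])

    async def exec_network_command(self, device: str, commands: list[str]):
        self.calls.append(("network", device, tuple(commands)))
        return {cmd: self._canned.get(("network", cmd), "") for cmd in commands}


async def collect_inventory(mcp) -> dict:
    """Gather VM and pod listings through any client exposing these calls."""
    vms = await mcp.query_vmware("vcenter-01", "list_vms")
    pods = await mcp.query_kubernetes("prod-cluster", "all", "pods")
    return {"vms": vms, "pods": pods}


async def _demo() -> dict:
    canned = {
        ("vmware", "list_vms"): ["vm-a", "vm-b"],
        ("kubernetes", "pods"): ["api-0"],
    }
    async with FakeMCPClient(canned) as mcp:
        return await collect_inventory(mcp)


inventory = asyncio.run(_demo())
```

Because `collect_inventory` only relies on the method names, the same function should work against the real client, provided its signatures match.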

---

## 🛠️ Setup and Deployment

### Prerequisites
- Python 3.10+
- Poetry 1.7+
- Docker & Docker Compose
- Kubernetes cluster (for production)
- MCP Server running
- Anthropic API key

### 1. Local Development

```bash
# Clone repository
git clone https://git.company.local/infrastructure/datacenter-docs.git
cd datacenter-docs

# Set up with Poetry
poetry install

# Configuration
cp .env.example .env
# Edit .env with your credentials

# Start database and Redis
docker-compose up -d postgres redis

# Run migrations
poetry run alembic upgrade head

# Index documentation
poetry run datacenter-docs index-docs --path ./output

# Start API
poetry run uvicorn datacenter_docs.api.main:app --reload

# Start Chat (in another terminal)
poetry run python -m datacenter_docs.chat.server

# Start Worker (in another terminal)
poetry run celery -A datacenter_docs.workers.celery_app worker --loglevel=info
```

### 2. Docker Compose (All-in-one)

```bash
# Build and start all services
docker-compose up -d

# Check logs
docker-compose logs -f api chat worker

# Access services
# API: http://localhost:8000
# Chat: http://localhost:8001
# Frontend: http://localhost
# Flower (Celery monitoring): http://localhost:5555
```

### 3. Kubernetes Production

```bash
# Apply manifests
kubectl apply -f deploy/kubernetes/namespace.yaml
kubectl apply -f deploy/kubernetes/secrets.yaml  # Create this first
kubectl apply -f deploy/kubernetes/configmap.yaml
kubectl apply -f deploy/kubernetes/deployment.yaml
kubectl apply -f deploy/kubernetes/service.yaml
kubectl apply -f deploy/kubernetes/ingress.yaml

# Check status
kubectl get pods -n datacenter-docs
kubectl logs -n datacenter-docs deployment/api

# Scale
kubectl scale deployment api --replicas=5 -n datacenter-docs
```

---

## 🔄 CI/CD Pipelines

### GitLab CI
```yaml
# .gitlab-ci.yml
stages: [lint, test, build, deploy]

# Automatic on push to main:
# - Lint code
# - Run tests
# - Build Docker images
# - Deploy to staging
# - Manual deploy to production
```

### Gitea Actions
```yaml
# .gitea/workflows/ci.yml
# Triggers:
# - Push to main/develop
# - Pull requests
# - Schedule (every 6 hours, for docs generation)

# Actions:
# - Lint, test, security scan
# - Build multi-arch images
# - Deploy to K8s
# - Generate documentation
```

---

## 📡 API Endpoints

### Ticket Management
```
POST   /api/v1/tickets                 Create & process ticket
GET    /api/v1/tickets/{ticket_id}     Get ticket status
GET    /api/v1/stats/tickets           Get statistics
```
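
A client that submits a ticket and later checks on it could poll the status endpoint. The sketch below assumes several things not stated in this README: that a ticket may be unresolved at first, that `"resolved"` is the only terminal status, and that the helper names `ticket_url` and `wait_for_ticket` exist only here. The HTTP call is injected as `fetch` so the loop can be exercised without a network.

```python
import time
from typing import Callable
from urllib.parse import quote

BASE_URL = "https://docs.company.local"  # host used in this README's examples


def ticket_url(ticket_id: str, base_url: str = BASE_URL) -> str:
    """Build the status URL, percent-encoding the ticket ID."""
    return f"{base_url}/api/v1/tickets/{quote(ticket_id, safe='')}"


def wait_for_ticket(
    fetch: Callable[[str], dict],
    ticket_id: str,
    attempts: int = 5,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the ticket reports 'resolved' or attempts run out."""
    url = ticket_url(ticket_id)
    last: dict = {}
    for attempt in range(attempts):
        last = fetch(url)
        if last.get("status") == "resolved":
            break
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next poll, but not after the last
    return last
```

With `requests` installed, `fetch` could be as simple as `lambda url: requests.get(url, timeout=10).json()`.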

### Documentation
```
POST   /api/v1/documentation/search            Search docs
POST   /api/v1/documentation/generate/{sec}    Generate section
GET    /api/v1/documentation/sections          List sections
```

### Health & Monitoring
```
GET    /health      Health check
GET    /metrics     Prometheus metrics
```

---

## 🤖 Chat Interface Usage

### Web Chat
Open `https://docs.company.local/chat`

Features:
- 💬 Real-time chat with the AI
- 📚 Autonomous documentation search
- 🎯 Contextual suggestions
- 📎 File/ticket upload
- 💾 Conversation history

### Integration with External Systems

```python
# Python example
import requests

response = requests.post(
    'https://docs.company.local/api/v1/tickets',
    json={
        'ticket_id': 'EXT-12345',
        'title': 'Storage issue',
        'description': 'Datastore running out of space',
        'category': 'storage'
    },
    timeout=30,  # avoid hanging forever if the service is unreachable
)
response.raise_for_status()  # fail loudly on HTTP errors

resolution = response.json()
print(resolution['resolution'])
print(resolution['suggested_actions'])
```

```javascript
// JavaScript example
const response = await fetch('https://docs.company.local/api/v1/tickets', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    ticket_id: 'EXT-12345',
    title: 'Storage issue',
    description: 'Datastore running out of space',
    category: 'storage'
  })
});

// fetch() does not reject on HTTP errors, so check the status explicitly
if (!response.ok) throw new Error(`Request failed: ${response.status}`);

const resolution = await response.json();
```

---

## 🔐 Security

### Authentication
- API key-based authentication
- JWT tokens for chat sessions
- MCP server credentials secured in vault

### Secrets Management
```bash
# Kubernetes secrets
kubectl create secret generic datacenter-secrets \
  --from-literal=database-url='postgresql://...' \
  --from-literal=redis-url='redis://...' \
  --from-literal=mcp-api-key='...' \
  --from-literal=anthropic-api-key='...' \
  -n datacenter-docs

# Docker secrets
docker secret create mcp_api_key ./mcp_key.txt
```

### Network Security
- All communications over TLS
- Network policies in Kubernetes
- Rate limiting enabled
- CORS properly configured

---

## 📊 Monitoring & Observability

### Metrics (Prometheus)
```
# Exposed at /metrics
datacenter_docs_tickets_total
datacenter_docs_tickets_resolved_total
datacenter_docs_resolution_confidence_score
datacenter_docs_processing_time_seconds
datacenter_docs_api_requests_total
```
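
The two ticket counters can be combined into a resolution rate. The following is a simplified sketch that parses the Prometheus text format by hand and sums samples across label sets. The function names are hypothetical, the `category` label in the usage note is an assumption, and the parser skips comment lines but does not handle every corner of the exposition format.

```python
def parse_samples(text: str) -> dict[str, float]:
    """Sum sample values per metric name from Prometheus text exposition."""
    totals: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comment lines
        if "{" in line:
            # Labeled sample: name{labels} value [timestamp]
            name = line.split("{", 1)[0]
            value = line.rsplit("}", 1)[1].split()[0]
        else:
            # Unlabeled sample: name value [timestamp]
            name, value = line.split()[:2]
        totals[name] = totals.get(name, 0.0) + float(value)
    return totals


def resolution_rate(text: str) -> float:
    """Fraction of tickets resolved, or 0.0 when no tickets were counted."""
    samples = parse_samples(text)
    total = samples.get("datacenter_docs_tickets_total", 0.0)
    resolved = samples.get("datacenter_docs_tickets_resolved_total", 0.0)
    return resolved / total if total else 0.0
```

Feeding it the body of `GET /metrics` yields a value between 0 and 1. A scrape containing `datacenter_docs_tickets_total{category="network"} 6` and `datacenter_docs_tickets_total{category="storage"} 4` would be summed to 10 before dividing.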

### Logging
Logs are structured as JSON:
```json
{
  "timestamp": "2025-01-15T10:30:00Z",
  "level": "INFO",
  "service": "api",
  "event": "ticket_resolved",
  "ticket_id": "INC-12345",
  "confidence": 0.92,
  "processing_time": 2.3
}
```
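
Log lines of this shape can be produced with the standard `logging` module. This is one possible sketch, not the project's actual logger: `JsonFormatter`, `make_logger`, and the convention of passing extra fields under a `fields` key are all assumptions made for the example.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "service": self.service,
            "event": record.getMessage(),
        }
        # Fields passed via extra={"fields": {...}} are merged in as-is.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def make_logger(service: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(f"datacenter_docs.{service}")
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service))
    logger.handlers = [handler]  # replace handlers so repeated calls do not duplicate output
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `make_logger("api").info("ticket_resolved", extra={"fields": {"ticket_id": "INC-12345", "confidence": 0.92}})` would then emit a single JSON line with those keys.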

### Tracing
- OpenTelemetry integration
- Distributed tracing across services
- Jaeger UI for visualization

---

## 🧪 Testing

```bash
# Unit tests
poetry run pytest tests/unit -v --cov

# Integration tests
poetry run pytest tests/integration -v

# E2E tests
poetry run pytest tests/e2e -v

# Load testing
poetry run locust -f tests/load/locustfile.py
```

---

## 🔧 Configuration

### Environment Variables
```bash
# Core
DATABASE_URL=postgresql://user:pass@host:5432/db
REDIS_URL=redis://:pass@host:6379/0

# MCP
MCP_SERVER_URL=https://mcp.company.local
MCP_API_KEY=your_mcp_key

# AI
ANTHROPIC_API_KEY=your_anthropic_key

# Optional
LOG_LEVEL=INFO
DEBUG=false
WORKERS=4
MAX_TOKENS=4096
```
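
It can help to validate these variables once at startup instead of failing later on first use. The sketch below is an assumption about how that might look, not the project's own settings code: `Settings` and `load_settings` are hypothetical names, and using the sample values above as defaults for the optional variables is a guess.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = (
    "DATABASE_URL",
    "REDIS_URL",
    "MCP_SERVER_URL",
    "MCP_API_KEY",
    "ANTHROPIC_API_KEY",
)


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    mcp_server_url: str
    mcp_api_key: str
    anthropic_api_key: str
    log_level: str = "INFO"
    debug: bool = False
    workers: int = 4
    max_tokens: int = 4096


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, failing fast on missing values."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required variables: {', '.join(missing)}")
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        mcp_server_url=env["MCP_SERVER_URL"],
        mcp_api_key=env["MCP_API_KEY"],
        anthropic_api_key=env["ANTHROPIC_API_KEY"],
        log_level=env.get("LOG_LEVEL", "INFO"),
        debug=env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
        workers=int(env.get("WORKERS", "4")),
        max_tokens=int(env.get("MAX_TOKENS", "4096")),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test.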

---

## 📚 Documentation

- `/docs` - API documentation (Swagger/OpenAPI)
- `/redoc` - Alternative API documentation
- `QUICK_START.md` - Quick start guide
- `ARCHITECTURE.md` - System architecture
- `DEPLOYMENT.md` - Deployment guide

---

## 🤝 Contributing

1. Create feature branch: `git checkout -b feature/amazing-feature`
2. Commit changes: `git commit -m 'Add amazing feature'`
3. Push to branch: `git push origin feature/amazing-feature`
4. Open Pull Request
5. CI/CD runs automatically
6. Merge after approval

---

## 📝 License

MIT License - see LICENSE file

---

## 🆘 Support

- **Email**: automation-team@company.local
- **Slack**: #datacenter-automation
- **Issues**: https://git.company.local/infrastructure/datacenter-docs/issues

---

## 🎯 Roadmap

- [x] MCP Integration
- [x] Ticket resolution API
- [x] Agentic chat
- [x] CI/CD pipelines
- [x] Docker & Kubernetes
- [ ] Multi-language support
- [ ] Advanced analytics dashboard
- [ ] Mobile app
- [ ] Voice interface
- [ ] Automated remediation

---

**Powered by Claude Sonnet 4.5 & MCP** 🚀