# 🚀 Deployment Guide - Datacenter Documentation System

## Quick Deploy Options

### Option 1: Docker Compose (Recommended for Development/Small Scale)

```bash
# 1. Clone repository
git clone https://git.company.local/infrastructure/datacenter-docs.git
cd datacenter-docs

# 2. Configure environment
cp .env.example .env
nano .env  # Edit with your credentials

# 3. Start all services
docker-compose up -d

# 4. Check health
curl http://localhost:8000/health

# 5. Access services
# API:      http://localhost:8000/api/docs
# Chat:     http://localhost:8001
# Frontend: http://localhost
# Flower:   http://localhost:5555
```
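
Containers can take a few seconds to become ready after `docker-compose up -d`, so a single `curl` may fail even when nothing is wrong. The sketch below polls the health endpoint instead. It assumes only that `/health` returns HTTP 200 once the API is up; the function names, retry count, and delay are illustrative choices, not part of the project.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 503).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, ...
        return 0


def wait_for_health(url, attempts=10, delay=2.0, fetch=fetch_status, sleep=time.sleep):
    """Poll url until it returns 200; give up after `attempts` tries."""
    for attempt in range(1, attempts + 1):
        if fetch(url) == 200:
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

Calling `wait_for_health("http://localhost:8000/health")` returns `True` as soon as the API answers with 200, and `False` if it never does within the allowed attempts.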

### Option 2: Kubernetes (Production)

```bash
# 1. Create namespace
kubectl apply -f deploy/kubernetes/namespace.yaml

# 2. Create secrets
kubectl create secret generic datacenter-secrets \
  --from-literal=database-url='postgresql://user:pass@host:5432/db' \
  --from-literal=redis-url='redis://:pass@host:6379/0' \
  --from-literal=mcp-api-key='your-mcp-key' \
  --from-literal=anthropic-api-key='your-claude-key' \
  -n datacenter-docs

# 3. Create configmap
kubectl create configmap datacenter-config \
  --from-literal=mcp-server-url='https://mcp.company.local' \
  -n datacenter-docs

# 4. Deploy services
kubectl apply -f deploy/kubernetes/deployment.yaml
kubectl apply -f deploy/kubernetes/service.yaml
kubectl apply -f deploy/kubernetes/ingress.yaml

# 5. Check deployment
kubectl get pods -n datacenter-docs
kubectl logs -n datacenter-docs deployment/api
```

### Option 3: GitLab CI/CD (Automated)

```bash
# 1. Push to GitLab
git push origin main

# 2. Pipeline runs automatically:
#    - Lint & Test
#    - Build Docker images
#    - Deploy to staging (manual approval)
#    - Deploy to production (manual, on tags)

# 3. Monitor pipeline
# Visit: https://gitlab.company.local/infrastructure/datacenter-docs/-/pipelines
```

### Option 4: Gitea Actions (Automated)

```bash
# 1. Push to Gitea
git push origin main

# 2. Workflow triggers:
#    - On push: Build & deploy to staging
#    - On tag: Deploy to production
#    - On schedule: Generate docs every 6h

# 3. Monitor workflow
# Visit: https://gitea.company.local/infrastructure/datacenter-docs/actions
```

---

## Configuration Details

### Environment Variables (.env)

```bash
# Database
DATABASE_URL=postgresql://docs_user:CHANGE_ME@postgres:5432/datacenter_docs

# Redis
REDIS_URL=redis://:CHANGE_ME@redis:6379/0

# MCP Server (CRITICAL - Required for device connectivity)
MCP_SERVER_URL=https://mcp.company.local
MCP_API_KEY=your_mcp_api_key_here

# Anthropic Claude API (CRITICAL - Required for AI)
ANTHROPIC_API_KEY=sk-ant-api03-xxxxx

# CORS (Adjust for your domain)
CORS_ORIGINS=http://localhost:3000,https://docs.company.local

# Optional
LOG_LEVEL=INFO
DEBUG=false
WORKERS=4
MAX_TOKENS=4096
```
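
A missing or placeholder value in `.env` is an easy mistake to make, so it can help to check the file before starting the stack. The helper below is a hypothetical sketch, not part of the project. It assumes the database, Redis, MCP, and Anthropic variables shown above are the required ones, and it flags values that still contain the sample placeholders.

```python
# Assumption: these are the variables the services cannot start without.
REQUIRED = ("DATABASE_URL", "REDIS_URL", "MCP_SERVER_URL", "MCP_API_KEY", "ANTHROPIC_API_KEY")
# Placeholder fragments taken from the sample .env above.
PLACEHOLDERS = ("CHANGE_ME", "your_mcp_api_key_here", "xxxxx")


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_env(text: str) -> list:
    """Return a list of problems; an empty list means the file looks usable."""
    values = parse_env(text)
    problems = []
    for key in REQUIRED:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing")
        elif any(marker in value for marker in PLACEHOLDERS):
            problems.append(f"{key} still contains a placeholder")
    return problems
```

For example, `check_env(open(".env").read())` returns one message per variable that is absent or still set to a sample value.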

### Kubernetes Secrets (secrets.yaml)

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: datacenter-secrets
  namespace: datacenter-docs
type: Opaque
stringData:
  database-url: "postgresql://user:pass@postgresql.default:5432/datacenter_docs"
  redis-url: "redis://:pass@redis.default:6379/0"
  mcp-api-key: "your-mcp-key"
  anthropic-api-key: "sk-ant-api03-xxxxx"
```
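
The manifest uses `stringData`, which accepts plain text. Kubernetes stores those values base64-encoded under `data`, and that encoded form is what `kubectl get secret -o yaml` prints. Base64 is an encoding, not encryption. As a small illustration (the helper names are made up for this example), the conversion in both directions looks like this:

```python
import base64


def encode_secret_value(plain: str) -> str:
    """Encode a plain-text value the way it appears under a Secret's `data` field."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Decode a value copied from `kubectl get secret ... -o yaml`."""
    return base64.b64decode(encoded).decode("utf-8")
```

Decoding is handy when verifying that a deployed secret holds the value you expect.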

---

## Post-Deployment Steps

### 1. Database Migrations

```bash
# Docker Compose
docker-compose exec api poetry run alembic upgrade head

# Kubernetes
kubectl exec -n datacenter-docs deployment/api -- \
  poetry run alembic upgrade head
```

### 2. Index Initial Documentation

```bash
# Docker Compose
docker-compose exec api poetry run datacenter-docs index-docs \
  --path /app/output

# Kubernetes
kubectl exec -n datacenter-docs deployment/api -- \
  poetry run datacenter-docs index-docs --path /app/output
```

### 3. Generate Documentation

```bash
# Manual trigger
curl -X POST http://localhost:8000/api/v1/documentation/generate/infrastructure

# Or run full generation
docker-compose exec worker poetry run datacenter-docs generate-all
```

### 4. Test API

```bash
# Health check
curl http://localhost:8000/health

# Create test ticket
curl -X POST http://localhost:8000/api/v1/tickets \
  -H "Content-Type: application/json" \
  -d '{
    "ticket_id": "TEST-001",
    "title": "Test ticket",
    "description": "Testing auto-resolution",
    "category": "network"
  }'

# Get ticket status
curl http://localhost:8000/api/v1/tickets/TEST-001

# Search documentation
curl -X POST http://localhost:8000/api/v1/documentation/search \
  -H "Content-Type: application/json" \
  -d '{"query": "UPS battery status", "limit": 5}'
```
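
The ticket request can also be scripted, which is convenient for smoke tests. The sketch below uses only the standard library. The endpoint path and JSON fields come from the `curl` example above, while the function name and the `base_url` default are assumptions for illustration.

```python
import json
import urllib.request


def build_ticket_request(ticket: dict, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/tickets."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/tickets",
        data=json.dumps(ticket).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_ticket_request({
    "ticket_id": "TEST-001",
    "title": "Test ticket",
    "description": "Testing auto-resolution",
    "category": "network",
})

# To send it against a running API:
#   with urllib.request.urlopen(request, timeout=10) as resp:
#       print(resp.status, resp.read().decode())
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a live server.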

---

## Monitoring

### Prometheus Metrics

```bash
# Metrics endpoint
curl http://localhost:8000/metrics

# Example metrics:
# datacenter_docs_tickets_total
# datacenter_docs_tickets_resolved_total
# datacenter_docs_resolution_confidence_score
# datacenter_docs_processing_time_seconds
```
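
Prometheus serves metrics as plain text: one `name{labels} value` sample per line, plus `# HELP` and `# TYPE` comment lines. The rough sketch below pulls this system's metrics out of such a payload. The sample payload and its numbers are invented for illustration, and the parser assumes samples carry no trailing timestamp and sums values across label sets.

```python
def parse_metrics(text: str, prefix: str = "datacenter_docs_") -> dict:
    """Sum sample values per metric name, keeping only names that start with prefix."""
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token; labels may contain spaces.
        name_part, _, value_part = line.rpartition(" ")
        name = name_part.split("{", 1)[0]
        if not name.startswith(prefix):
            continue
        totals[name] = totals.get(name, 0.0) + float(value_part)
    return totals


# Invented sample payload, for illustration only.
SAMPLE = """\
# TYPE datacenter_docs_tickets_total counter
datacenter_docs_tickets_total{category="network"} 12
datacenter_docs_tickets_total{category="storage"} 8
datacenter_docs_tickets_resolved_total 15
process_cpu_seconds_total 3.2
"""

totals = parse_metrics(SAMPLE)
resolution_rate = (
    totals["datacenter_docs_tickets_resolved_total"]
    / totals["datacenter_docs_tickets_total"]
)
```

In practice a Prometheus server and PromQL would compute ratios like this; the snippet is only meant for quick checks against the raw `/metrics` output.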

### Grafana Dashboards

Import dashboard from: `deploy/grafana/dashboard.json`

### Logs

```bash
# Docker Compose
docker-compose logs -f api chat worker

# Kubernetes
kubectl logs -n datacenter-docs deployment/api -f
kubectl logs -n datacenter-docs deployment/chat -f
kubectl logs -n datacenter-docs deployment/worker -f
```

### Celery Flower (Task Monitoring)

Access: http://localhost:5555 (Docker Compose) or https://docs.company.local/flower (K8s)

---

## Scaling

### Horizontal Scaling

```bash
# Docker Compose (increase replicas in docker-compose.yml)
docker-compose up -d --scale worker=5

# Kubernetes
kubectl scale deployment api --replicas=5 -n datacenter-docs
kubectl scale deployment worker --replicas=10 -n datacenter-docs
```

### Vertical Scaling

Edit resource limits in `deploy/kubernetes/deployment.yaml`:

```yaml
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "2000m"
```
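
Kubernetes rejects a container spec whose request exceeds its limit, so edited values are worth a quick sanity check. The sketch below converts a small subset of Kubernetes quantity notation (`m` for millicores, `Ki`/`Mi`/`Gi`/`Ti` for binary byte units) into plain numbers. It is an illustrative helper and does not implement the full quantity grammar.

```python
BINARY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity: str) -> float:
    """Return CPU cores: '500m' -> 0.5, '2' -> 2.0."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Return bytes for values such as '1Gi' or '512Mi'; plain integers pass through."""
    for suffix, factor in BINARY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def requests_fit_limits(resources: dict) -> bool:
    """True when every request is no larger than the matching limit."""
    req, lim = resources["requests"], resources["limits"]
    return (
        parse_cpu(req["cpu"]) <= parse_cpu(lim["cpu"])
        and parse_memory(req["memory"]) <= parse_memory(lim["memory"])
    )


# The values from the YAML snippet above.
resources = {
    "requests": {"memory": "1Gi", "cpu": "500m"},
    "limits": {"memory": "2Gi", "cpu": "2000m"},
}
```

With the values shown, the requests are half a core and 1 GiB against limits of two cores and 2 GiB, so `requests_fit_limits(resources)` holds.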

---

## Troubleshooting

### API not starting

```bash
# Check logs
docker-compose logs api

# Common issues:
# - Database not accessible
# - Missing environment variables
# - MCP server not reachable

# Test database connection
docker-compose exec api python -c "
from datacenter_docs.utils.database import get_db
next(get_db())
print('DB OK')
"
```

### Chat not connecting

```bash
# Check WebSocket connection
# Browser console should show: WebSocket connection established

# Test from curl (a WebSocket handshake also needs the version and key headers)
curl -N -H "Connection: Upgrade" -H "Upgrade: websocket" \
  -H "Sec-WebSocket-Version: 13" \
  -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
  http://localhost:8001/socket.io/
```

### Worker not processing jobs

```bash
# Check Celery status
docker-compose exec worker celery -A datacenter_docs.workers.celery_app status

# Check Redis connection
docker-compose exec worker python -c "
import redis
r = redis.from_url('redis://:pass@redis:6379/0')
print(r.ping())
"
```

### MCP Connection Issues

```bash
# Test MCP connectivity
docker-compose exec api python -c "
import asyncio
from datacenter_docs.mcp.client import MCPClient

async def test():
    async with MCPClient(
        server_url='https://mcp.company.local',
        api_key='your-key'
    ) as client:
        resources = await client.list_resources()
        print(f'Found {len(resources)} resources')

asyncio.run(test())
"
```

---

## Backup & Recovery

### Database Backup

```bash
# Docker Compose (-T disables the pseudo-TTY so the redirected dump is not altered)
docker-compose exec -T postgres pg_dump -U docs_user datacenter_docs > backup.sql

# Kubernetes
kubectl exec -n datacenter-docs postgresql-0 -- \
  pg_dump -U docs_user datacenter_docs > backup.sql
```

### Documentation Backup

```bash
# Backup generated docs
tar -czf docs-backup-$(date +%Y%m%d).tar.gz output/

# Backup vector store
tar -czf vectordb-backup-$(date +%Y%m%d).tar.gz data/chroma_db/
```
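
For scheduled jobs, the same archives can be produced from Python. This is a minimal sketch using the standard `tarfile` module. The function name and the `dest_dir` and `today` parameters are assumptions, and the archive naming mirrors the `tar` commands above.

```python
import tarfile
from datetime import date
from pathlib import Path
from typing import Optional


def backup_directory(source: str, prefix: str, dest_dir: str = ".",
                     today: Optional[date] = None) -> Path:
    """Write <prefix>-YYYYMMDD.tar.gz containing `source` and return the archive path."""
    stamp = (today or date.today()).strftime("%Y%m%d")
    archive = Path(dest_dir) / f"{prefix}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Relative paths are stored as given, matching `tar -czf ... output/`.
        tar.add(source)
    return archive
```

Run from the project root, `backup_directory("output", "docs-backup")` and `backup_directory("data/chroma_db", "vectordb-backup")` correspond to the two shell commands.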

### Restore

```bash
# Database
docker-compose exec -T postgres psql -U docs_user datacenter_docs < backup.sql

# Documentation
tar -xzf docs-backup-20250115.tar.gz
tar -xzf vectordb-backup-20250115.tar.gz
```

---

## Security Checklist

- [ ] All secrets stored in vault/secrets manager
- [ ] TLS enabled for all services
- [ ] API rate limiting configured
- [ ] CORS properly configured
- [ ] Network policies applied (K8s)
- [ ] Regular security scans scheduled
- [ ] Audit logging enabled
- [ ] Backup encryption enabled

---

## Performance Tuning

### API Optimization

```bash
# Increase workers (in .env); a common starting point is 2x CPU cores
WORKERS=8

# Adjust max tokens (in .env); higher values suit complex queries
MAX_TOKENS=8192
```

### Database Optimization

```sql
-- Add indexes
CREATE INDEX idx_tickets_status ON tickets(status);
CREATE INDEX idx_tickets_created_at ON tickets(created_at);
```

### Redis Caching

```python
# Adjust cache TTL (in code)
CACHE_TTL = {
    'documentation': 3600,  # 1 hour
    'metrics': 300,         # 5 minutes
    'tickets': 60           # 1 minute
}
```
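
To illustrate how per-category TTLs like these are typically applied, here is a self-contained sketch of a TTL cache. It is an in-memory stand-in, not the project's Redis code: the class name and the injectable clock are assumptions made so the example runs without a Redis server. With Redis, the equivalent write would store the key with an expiry of `CACHE_TTL[category]` seconds.

```python
import time

CACHE_TTL = {"documentation": 3600, "metrics": 300, "tickets": 60}


class TTLCache:
    """Tiny in-memory cache whose entries expire after a per-category TTL."""

    def __init__(self, ttl_by_category, clock=time.monotonic):
        self._ttl = ttl_by_category
        self._clock = clock
        self._store = {}  # (category, key) -> (expires_at, value)

    def set(self, category, key, value):
        expires_at = self._clock() + self._ttl[category]
        self._store[(category, key)] = (expires_at, value)

    def get(self, category, key, default=None):
        entry = self._store.get((category, key))
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller refetches from the source.
            del self._store[(category, key)]
            return default
        return value
```

Shorter TTLs keep fast-changing data such as ticket status fresh, while longer TTLs reduce load for content that changes rarely, such as generated documentation.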

---

## Maintenance

### Regular Tasks

**Weekly**

- Review and clean old logs
- Check disk usage
- Review failed tickets
- Update dependencies

**Monthly**

- Database vacuum/optimize
- Security patches
- Performance review
- Backup verification

### Scheduled Maintenance

```bash
# Schedule in crontab
0 2 * * 0 /opt/scripts/weekly-maintenance.sh
0 3 1 * * /opt/scripts/monthly-maintenance.sh
```
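
Reading the two schedules: the five fields are minute, hour, day of month, month, and day of week (0 = Sunday), so the first entry runs every Sunday at 02:00 and the second on the 1st of each month at 03:00. The sketch below checks a timestamp against such an expression. It is a simplified matcher written for this guide that understands only `*` and single numbers, not ranges, lists, or steps.

```python
from datetime import datetime

WEEKLY = "0 2 * * 0"
MONTHLY = "0 3 1 * *"


def _field_matches(field: str, value: int) -> bool:
    return field == "*" or int(field) == value


def cron_matches(expression: str, moment: datetime) -> bool:
    """Return True if `moment` falls on a minute described by a 5-field cron expression."""
    minute, hour, dom, month, dow = expression.split()
    cron_weekday = moment.isoweekday() % 7  # cron uses 0 for Sunday
    if not (
        _field_matches(minute, moment.minute)
        and _field_matches(hour, moment.hour)
        and _field_matches(month, moment.month)
    ):
        return False
    dom_ok = _field_matches(dom, moment.day)
    dow_ok = _field_matches(dow, cron_weekday)
    if dom != "*" and dow != "*":
        # Standard cron runs the job when either restricted day field matches.
        return dom_ok or dow_ok
    return dom_ok and dow_ok
```

For instance, `cron_matches(WEEKLY, datetime(2025, 1, 19, 2, 0))` is true because 19 January 2025 is a Sunday.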

---

**For support**: automation-team@company.local