Compare commits

..

9 Commits

Author SHA1 Message Date
8d11321835 Update SCHEME.md
Some checks failed
Build / Code Quality Checks (push) Successful in 17m40s
Build / Build & Push Docker Images (frontend) (push) Failing after 2m22s
Build / Build & Push Docker Images (worker) (push) Successful in 20m15s
Build / Build & Push Docker Images (chat) (push) Failing after 45m14s
Build / Build & Push Docker Images (api) (push) Failing after 45m14s
2025-10-28 11:42:17 +00:00
a43f11a496 Update SCHEME.md
Some checks failed
Build / Build & Push Docker Images (api) (push) Has been cancelled
Build / Build & Push Docker Images (chat) (push) Has been cancelled
Build / Build & Push Docker Images (frontend) (push) Has been cancelled
Build / Build & Push Docker Images (worker) (push) Has been cancelled
Build / Code Quality Checks (push) Has been cancelled
2025-10-28 11:38:33 +00:00
6519a4856b Update scheme.md
Some checks failed
Build / Build & Push Docker Images (api) (push) Has been cancelled
Build / Build & Push Docker Images (chat) (push) Has been cancelled
Build / Build & Push Docker Images (frontend) (push) Has been cancelled
Build / Build & Push Docker Images (worker) (push) Has been cancelled
Build / Code Quality Checks (push) Has been cancelled
2025-10-28 11:38:04 +00:00
40824d991f Update scheme.md
Some checks failed
Build / Build & Push Docker Images (api) (push) Has been cancelled
Build / Build & Push Docker Images (chat) (push) Has been cancelled
Build / Build & Push Docker Images (frontend) (push) Has been cancelled
Build / Build & Push Docker Images (worker) (push) Has been cancelled
Build / Code Quality Checks (push) Has started running
2025-10-28 11:27:44 +00:00
5b94e0a046 Add scheme.md
Some checks failed
Build / Build & Push Docker Images (api) (push) Has been cancelled
Build / Build & Push Docker Images (chat) (push) Has been cancelled
Build / Build & Push Docker Images (frontend) (push) Has been cancelled
Build / Build & Push Docker Images (worker) (push) Has been cancelled
Build / Code Quality Checks (push) Has been cancelled
2025-10-28 11:16:45 +00:00
2719cfff59 Add Helm chart, Docs, and Config conversion script
Some checks failed
Build / Code Quality Checks (push) Successful in 15m11s
Build / Build & Push Docker Images (worker) (push) Successful in 13m44s
Build / Build & Push Docker Images (frontend) (push) Successful in 5m8s
Build / Build & Push Docker Images (chat) (push) Failing after 30m7s
Build / Build & Push Docker Images (api) (push) Failing after 21m39s
2025-10-22 14:35:21 +02:00
ba9900bd57 fix: remove COPY output/ and fix PYTHONPATH in Dockerfile.chat
Some checks failed
Build / Code Quality Checks (push) Successful in 15m41s
Build / Build & Push Docker Images (chat) (push) Failing after 36s
Build / Build & Push Docker Images (frontend) (push) Successful in 7m48s
Build / Build & Push Docker Images (worker) (push) Has been cancelled
Build / Build & Push Docker Images (api) (push) Has been cancelled
Resolve chat service build failure caused by missing output directory:

**Root Cause:**
- output/ directory is in .gitignore and not included in Docker build context
- COPY output/ /app/output/ failed with "not found" error
- PYTHONPATH still had undefined $PYTHONPATH variable

**Solution:**
- Remove COPY output/ line (directory created later with mkdir)
- Fix PYTHONPATH: /app/src:$PYTHONPATH → /app/src
- output/ directory already created at line 51: mkdir -p /app/output

**Errors Fixed:**
- ERROR: "/output": not found
- WARNING: Usage of undefined variable '$PYTHONPATH'

Successfully tested Docker build.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 16:09:08 +02:00
4a8372f0d1 fix: upgrade Poetry to 2.2.1 for poetry.lock compatibility
Some checks failed
Build / Code Quality Checks (push) Successful in 9m31s
Build / Build & Push Docker Images (chat) (push) Failing after 45s
Build / Build & Push Docker Images (frontend) (push) Successful in 1m3s
Build / Build & Push Docker Images (api) (push) Waiting to run
Build / Build & Push Docker Images (worker) (push) Failing after 15m16s
Resolve Docker build failure caused by poetry.lock incompatibility:

**Root Cause:**
- Local Poetry version: 2.2.1
- Dockerfile Poetry version: 1.8.0
- poetry.lock generated with 2.2.1 not compatible with 1.8.0
- Build failed: "Dependency walk failed at triton (==3.5.0)"

**Solution:**
- Upgrade Poetry to 2.2.1 in all Dockerfiles (api, chat, worker)
- Update CI/CD pipeline to match (POETRY_VERSION: 2.2.1)
- Successfully tested Docker build with new version

**Files Modified:**
- deploy/docker/Dockerfile.api
- deploy/docker/Dockerfile.chat
- deploy/docker/Dockerfile.worker
- .gitea/workflows/build.yml

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 15:55:15 +02:00
4d2bf99d12 feat: add comprehensive caching to CI/CD pipeline
Optimize Gitea Actions pipeline with multi-layer caching strategy:

**Lint Job:**
- Cache Poetry installation (~/.local) - avoids reinstalling Poetry
- Cache Poetry dependencies (.venv + ~/.cache/pypoetry) - reuses installed packages
- Cache key based on poetry.lock hash for automatic invalidation on dependency changes

**Build Job:**
- Cache Docker Buildx layers (/tmp/.buildx-cache) - speeds up incremental builds
- Dual cache strategy: local filesystem + container registry
- Cache rotation to prevent unlimited growth
- Per-component cache keys for optimal reuse

**Expected Performance:**
- Lint job: ~2-3x faster after first run (skip Poetry + deps installation)
- Build job: ~3-5x faster on incremental builds (reuse Docker layers)
- First run unchanged, subsequent runs significantly faster

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 15:13:48 +02:00
36 changed files with 5223 additions and 9 deletions

.gitea/workflows/build.yml

```diff
@@ -10,7 +10,7 @@ on:
     branches: [ main ]
 
 env:
-  POETRY_VERSION: 1.8.0
+  POETRY_VERSION: 2.2.1
   PYTHON_VERSION: "3.12"
   REGISTRY: ${{ vars.PACKAGES_REGISTRY }}
   IMAGE_NAME: ${{ gitea.repository }}
@@ -28,13 +28,28 @@ jobs:
         uses: actions/setup-python@v5
         with:
           python-version: ${{ env.PYTHON_VERSION }}
-          cache: 'pip'
+      - name: Cache Poetry installation
+        uses: actions/cache@v4
+        with:
+          path: ~/.local
+          key: poetry-${{ runner.os }}-${{ env.POETRY_VERSION }}
       - name: Install Poetry
         run: |
           curl -sSL https://install.python-poetry.org | python3 -
           echo "$HOME/.local/bin" >> $GITHUB_PATH
+      - name: Cache Poetry dependencies
+        uses: actions/cache@v4
+        with:
+          path: |
+            .venv
+            ~/.cache/pypoetry
+          key: poetry-deps-${{ runner.os }}-${{ env.PYTHON_VERSION }}-${{ hashFiles('**/poetry.lock') }}
+          restore-keys: |
+            poetry-deps-${{ runner.os }}-${{ env.PYTHON_VERSION }}-
       - name: Install dependencies
         run: |
           poetry config virtualenvs.in-project true
@@ -64,6 +79,17 @@ jobs:
       - name: Set up Docker Buildx
         uses: docker/setup-buildx-action@v3
+        with:
+          buildkitd-flags: --debug
+      - name: Cache Docker layers
+        uses: actions/cache@v4
+        with:
+          path: /tmp/.buildx-cache
+          key: buildx-${{ matrix.component }}-${{ github.sha }}
+          restore-keys: |
+            buildx-${{ matrix.component }}-
+            buildx-
       - name: Log in to Container Registry
         uses: docker/login-action@v3
@@ -90,5 +116,14 @@ jobs:
           push: true
           tags: ${{ steps.meta.outputs.tags }}
           labels: ${{ steps.meta.outputs.labels }}
-          cache-from: type=registry,ref=${{ vars.PACKAGES_REGISTRY }}/${{ env.IMAGE_NAME }}/${{ matrix.component }}:buildcache
-          cache-to: type=registry,ref=${{ vars.PACKAGES_REGISTRY }}/${{ env.IMAGE_NAME }}/${{ matrix.component }}:buildcache,mode=max
+          cache-from: |
+            type=local,src=/tmp/.buildx-cache
+            type=registry,ref=${{ vars.PACKAGES_REGISTRY }}/${{ env.IMAGE_NAME }}/${{ matrix.component }}:buildcache
+          cache-to: |
+            type=local,dest=/tmp/.buildx-cache-new,mode=max
+            type=registry,ref=${{ vars.PACKAGES_REGISTRY }}/${{ env.IMAGE_NAME }}/${{ matrix.component }}:buildcache,mode=max
+      - name: Rotate Docker cache
+        run: |
+          rm -rf /tmp/.buildx-cache
+          mv /tmp/.buildx-cache-new /tmp/.buildx-cache || true
```

CONFIGURATION.md (new file, 511 lines)

@@ -0,0 +1,511 @@
# Configuration Guide
This guide explains how to configure the Datacenter Documentation & Remediation Engine using the various configuration files available.
## Configuration Files Overview
The project supports multiple configuration methods to suit different deployment scenarios:
### 1. `.env` File (Docker Compose)
- **Location**: Root of the project
- **Format**: Environment variables
- **Use case**: Local development, Docker Compose deployments
- **Template**: `.env.example`
### 2. `values.yaml` File (Structured Configuration)
- **Location**: Root of the project
- **Format**: YAML
- **Use case**: General configuration, Helm deployments, configuration management
- **Template**: `values.yaml`
### 3. Helm Chart Values (Kubernetes)
- **Location**: `deploy/helm/datacenter-docs/values.yaml`
- **Format**: YAML (Helm-specific)
- **Use case**: Kubernetes deployments via Helm
- **Variants**:
- `values.yaml` - Default configuration
- `values-development.yaml` - Development settings
- `values-production.yaml` - Production example
## Quick Start
### For Docker Compose Development
1. Copy the environment template:
```bash
cp .env.example .env
```
2. Edit `.env` with your configuration:
```bash
nano .env
```
3. Update the following required values:
- `MONGO_ROOT_PASSWORD` - MongoDB password
- `LLM_API_KEY` - Your LLM provider API key
- `LLM_BASE_URL` - LLM provider endpoint
- `MCP_API_KEY` - MCP server API key
4. Start the services:
```bash
cd deploy/docker
docker-compose -f docker-compose.dev.yml up -d
```
### For Kubernetes/Helm Deployment
1. Copy and customize the values file:
```bash
cp values.yaml my-values.yaml
```
2. Edit `my-values.yaml` with your configuration
3. Deploy with Helm:
```bash
helm install my-release deploy/helm/datacenter-docs -f my-values.yaml
```
## Configuration Mapping
Here's how the `.env` variables map to `values.yaml`:
| .env Variable | values.yaml Path | Description |
|---------------|------------------|-------------|
| `MONGO_ROOT_USER` | `mongodb.auth.rootUsername` | MongoDB root username |
| `MONGO_ROOT_PASSWORD` | `mongodb.auth.rootPassword` | MongoDB root password |
| `MONGODB_URL` | `mongodb.url` | MongoDB connection URL |
| `MONGODB_DATABASE` | `mongodb.auth.database` | Database name |
| `REDIS_PASSWORD` | `redis.auth.password` | Redis password |
| `REDIS_URL` | `redis.url` | Redis connection URL |
| `MCP_SERVER_URL` | `mcp.server.url` | MCP server endpoint |
| `MCP_API_KEY` | `mcp.server.apiKey` | MCP API key |
| `PROXMOX_HOST` | `proxmox.host` | Proxmox server hostname |
| `PROXMOX_USER` | `proxmox.auth.user` | Proxmox username |
| `PROXMOX_PASSWORD` | `proxmox.auth.password` | Proxmox password |
| `LLM_BASE_URL` | `llm.baseUrl` | LLM API endpoint |
| `LLM_API_KEY` | `llm.apiKey` | LLM API key |
| `LLM_MODEL` | `llm.model` | LLM model name |
| `LLM_TEMPERATURE` | `llm.generation.temperature` | Generation temperature |
| `LLM_MAX_TOKENS` | `llm.generation.maxTokens` | Max tokens per request |
| `API_HOST` | `api.host` | API server host |
| `API_PORT` | `api.port` | API server port |
| `WORKERS` | `api.workers` | Number of API workers |
| `CORS_ORIGINS` | `cors.origins` | Allowed CORS origins |
| `LOG_LEVEL` | `application.logging.level` | Logging level |
| `DEBUG` | `application.debug` | Debug mode |
| `CELERY_BROKER_URL` | `celery.broker.url` | Celery broker URL |
| `CELERY_RESULT_BACKEND` | `celery.result.backend` | Celery result backend |
| `VECTOR_STORE_PATH` | `vectorStore.chroma.path` | Vector store path |
| `EMBEDDING_MODEL` | `vectorStore.embedding.model` | Embedding model name |
## Configuration Sections
### 1. Database Configuration
#### MongoDB
```yaml
mongodb:
  auth:
    rootUsername: admin
    rootPassword: "your-secure-password"
    database: datacenter_docs
  url: "mongodb://admin:password@mongodb:27017"
```
**Security Note**: Always use strong passwords in production!
#### Redis
```yaml
redis:
  auth:
    password: "your-redis-password"
  url: "redis://redis:6379/0"
```
### 2. LLM Provider Configuration
The system supports multiple LLM providers through OpenAI-compatible APIs:
#### OpenAI
```yaml
llm:
  provider: openai
  baseUrl: "https://api.openai.com/v1"
  apiKey: "sk-your-key"
  model: "gpt-4-turbo-preview"
```
#### Anthropic Claude
```yaml
llm:
  provider: anthropic
  baseUrl: "https://api.anthropic.com/v1"
  apiKey: "sk-ant-your-key"
  model: "claude-sonnet-4-20250514"
```
#### Local (Ollama)
```yaml
llm:
  provider: ollama
  baseUrl: "http://localhost:11434/v1"
  apiKey: "ollama"
  model: "llama3"
```
### 3. Auto-Remediation Configuration
Control how the system handles automated problem resolution:
```yaml
autoRemediation:
  enabled: true
  minReliabilityScore: 85.0
  requireApprovalThreshold: 90.0
  maxActionsPerHour: 100
  dryRun: false  # Set to true for testing
```
**Important**: Start with `dryRun: true` to test without making actual changes!
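One plausible reading of how the two thresholds interact, sketched below: actions scoring under `minReliabilityScore` are skipped, scores between the two thresholds need human approval, and only scores at or above `requireApprovalThreshold` are applied automatically. This decision logic is an illustration, not the engine's actual implementation.

```python
# Illustrative threshold gating for auto-remediation. The field names
# mirror the autoRemediation config above; the decision logic itself is
# an assumption about how the thresholds interact.
def decide(score: float, cfg: dict) -> str:
    """Return what to do with a proposed remediation action."""
    if cfg.get("dryRun", True):
        return "log-only"          # dry run: never touch live systems
    if score < cfg["minReliabilityScore"]:
        return "skip"              # below the minimum confidence bar
    if score < cfg["requireApprovalThreshold"]:
        return "needs-approval"    # confident enough, but a human signs off
    return "auto-apply"            # high confidence: apply automatically

cfg = {"enabled": True, "minReliabilityScore": 85.0,
       "requireApprovalThreshold": 90.0, "dryRun": False}

print(decide(92.0, cfg))  # auto-apply
print(decide(87.0, cfg))  # needs-approval
print(decide(60.0, cfg))  # skip
```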
### 4. Infrastructure Collectors
Enable/disable different infrastructure data collectors:
```yaml
collectors:
  vmware:
    enabled: true
    host: "vcenter.example.com"
  kubernetes:
    enabled: true
  proxmox:
    enabled: true
```
### 5. Security Settings
```yaml
security:
  authentication:
    enabled: true
    method: "jwt"
  rateLimit:
    enabled: true
    requestsPerMinute: 100
```
## Environment-Specific Configuration
### Development
For development, use minimal resources and verbose logging:
```yaml
application:
  logging:
    level: "DEBUG"
  debug: true
  environment: "development"

autoRemediation:
  dryRun: true  # Never make real changes in dev

llm:
  baseUrl: "http://localhost:11434/v1"  # Use local Ollama
```
### Production
For production, use secure settings and proper resource limits:
```yaml
application:
  logging:
    level: "INFO"
  debug: false
  environment: "production"

autoRemediation:
  enabled: true
  minReliabilityScore: 95.0  # Higher threshold
  requireApprovalThreshold: 98.0
  dryRun: false

security:
  authentication:
    enabled: true
  rateLimit:
    enabled: true
```
## Configuration Best Practices
### 1. Secret Management
**Never commit secrets to version control!**
For development:
- Use `.env` (add to `.gitignore`)
- Use default passwords (change in production)
For production:
- Use Kubernetes Secrets
- Use external secret managers (Vault, AWS Secrets Manager, etc.)
- Rotate secrets regularly
Example with Kubernetes Secret:
```bash
kubectl create secret generic datacenter-docs-secrets \
  --from-literal=mongodb-password="$(openssl rand -base64 32)" \
  --from-literal=llm-api-key="your-actual-key"
```
### 2. Resource Limits
Always set appropriate resource limits:
```yaml
resources:
  api:
    requests:
      memory: "512Mi"
      cpu: "250m"
    limits:
      memory: "2Gi"
      cpu: "1000m"
```
### 3. High Availability
For production deployments:
```yaml
api:
  replicaCount: 3  # Multiple replicas

mongodb:
  persistence:
    enabled: true
    size: 50Gi
    storageClass: "fast-ssd"
```
### 4. Monitoring
Enable monitoring and observability:
```yaml
monitoring:
  metrics:
    enabled: true
  health:
    enabled: true
  tracing:
    enabled: true
    provider: "jaeger"
```
### 5. Backup Configuration
Configure regular backups:
```yaml
backup:
  enabled: true
  schedule: "0 2 * * *"  # Daily at 2 AM
  retention:
    daily: 7
    weekly: 4
    monthly: 12
```
## Validation
### Validate .env File
```bash
# Check for required variables
grep -E "^(MONGODB_URL|LLM_API_KEY|MCP_API_KEY)=" .env
```
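The grep check above can be extended into a small script that reports each missing variable individually. A sketch, assuming the required-variable list from the Quick Start section:

```shell
# Check a .env file for required variables and print each missing one.
# The variable list follows the Quick Start section (an assumption).
check_env() {
  file="$1"; missing=0
  for var in MONGO_ROOT_PASSWORD LLM_API_KEY LLM_BASE_URL MCP_API_KEY; do
    grep -q "^${var}=" "$file" 2>/dev/null || { echo "MISSING: $var"; missing=1; }
  done
  [ "$missing" -eq 0 ] && echo "OK: all required variables present"
  return "$missing"
}

# Demo against a throwaway file missing two variables:
printf 'MONGO_ROOT_PASSWORD=x\nLLM_API_KEY=y\n' > /tmp/demo.env
check_env /tmp/demo.env || true
```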
### Validate values.yaml
```bash
# Install yq (YAML processor)
# brew install yq # macOS
# sudo apt install yq # Ubuntu
# Validate YAML syntax
yq eval '.' values.yaml > /dev/null && echo "Valid YAML" || echo "Invalid YAML"
# Check specific values
yq eval '.llm.apiKey' values.yaml
yq eval '.mongodb.auth.rootPassword' values.yaml
```
### Validate Helm Values
```bash
# Lint the Helm chart
helm lint deploy/helm/datacenter-docs -f my-values.yaml
# Dry-run installation
helm install test deploy/helm/datacenter-docs -f my-values.yaml --dry-run --debug
```
## Troubleshooting
### Common Issues
#### 1. MongoDB Connection Failed
Check:
- MongoDB URL is correct
- Password matches in both MongoDB and application config
- MongoDB service is running
```bash
# Test MongoDB connection
docker exec -it datacenter-docs-mongodb mongosh \
  -u admin -p admin123 --authenticationDatabase admin
```
#### 2. LLM API Errors
Check:
- API key is valid
- Base URL is correct
- Model name is supported by the provider
- Network connectivity to LLM provider
```bash
# Test LLM API
curl -H "Authorization: Bearer $LLM_API_KEY" \
  $LLM_BASE_URL/models
```
#### 3. Redis Connection Issues
Check:
- Redis URL is correct
- Redis service is running
- Password is correct (if enabled)
```bash
# Test Redis connection
docker exec -it datacenter-docs-redis redis-cli ping
```
## Converting Between Formats
### From .env to values.yaml
We provide a conversion script:
```bash
# TODO: Create conversion script
# python scripts/env_to_values.py .env > my-values.yaml
```
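Until that script lands, here is a minimal sketch of what an `.env` to `values.yaml` conversion could look like. This is a hypothetical helper, not the project's actual script, and it covers only a few keys from the mapping table above:

```python
# Hypothetical .env -> values.yaml converter sketch (not the real script).
# MAPPING covers a handful of keys from the mapping table; extend as needed.
MAPPING = {
    "MONGODB_URL": ("mongodb", "url"),
    "LLM_API_KEY": ("llm", "apiKey"),
    "LLM_MODEL": ("llm", "model"),
    "REDIS_URL": ("redis", "url"),
}

def env_to_values(env_text: str) -> dict:
    """Parse KEY=value lines and build the nested values.yaml structure."""
    values: dict = {}
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip comments and blank lines
        key, _, val = line.partition("=")
        path = MAPPING.get(key.strip())
        if path is None:
            continue  # unmapped variable
        node = values
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = val.strip()
    return values

print(env_to_values("MONGODB_URL=mongodb://admin:pass@mongodb:27017\nLLM_MODEL=llama3"))
```

Feeding the result through a YAML serializer (e.g. `yaml.safe_dump`) would produce the file itself.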
Manual conversion example:
```bash
# .env
MONGODB_URL=mongodb://admin:pass@mongodb:27017
# values.yaml
mongodb:
  url: "mongodb://admin:pass@mongodb:27017"
```
### From values.yaml to .env
```bash
# Extract specific values
echo "MONGODB_URL=$(yq eval '.mongodb.url' values.yaml)" >> .env
echo "LLM_API_KEY=$(yq eval '.llm.apiKey' values.yaml)" >> .env
```
## Examples
### Example 1: Local Development with Ollama
```yaml
# values-local.yaml
llm:
  provider: ollama
  baseUrl: "http://localhost:11434/v1"
  apiKey: "ollama"
  model: "llama3"

application:
  debug: true
  logging:
    level: "DEBUG"

autoRemediation:
  dryRun: true
```
### Example 2: Production with OpenAI
```yaml
# values-prod.yaml
llm:
  provider: openai
  baseUrl: "https://api.openai.com/v1"
  apiKey: "sk-prod-key-from-secret-manager"
  model: "gpt-4-turbo-preview"

application:
  debug: false
  logging:
    level: "INFO"

autoRemediation:
  enabled: true
  minReliabilityScore: 95.0
  dryRun: false

security:
  authentication:
    enabled: true
  rateLimit:
    enabled: true
```
### Example 3: Multi-Environment Setup
```bash
# Development
helm install dev deploy/helm/datacenter-docs \
-f values.yaml \
-f values-development.yaml
# Staging
helm install staging deploy/helm/datacenter-docs \
-f values.yaml \
-f values-staging.yaml
# Production
helm install prod deploy/helm/datacenter-docs \
-f values.yaml \
-f values-production.yaml
```
## Related Documentation
- [Main README](README.md)
- [Docker Deployment](deploy/docker/README.md)
- [Helm Chart](deploy/helm/README.md)
- [Environment Variables](.env.example)
- [Project Repository](https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine)
## Support
For configuration help:
- Open an issue: https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine/issues
- Check the documentation
- Review example configurations in `deploy/` directory

SCHEME.md (new file, 744 lines)

@@ -0,0 +1,744 @@
# 📚 Automated Infrastructure Documentation System
An automated system for generating and maintaining technical documentation of the company infrastructure via a local LLM, with human validation and GitOps publishing.
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![Redis](https://img.shields.io/badge/Redis-7.2+-red.svg)](https://redis.io/)
## 📋 Table of Contents
- [Overview](#overview)
- [Architecture](#architecture)
- [Architectural Diagram](#architectural-diagram)
- [Technical Diagram](#technical-diagram)
- [Contacts](#contacts)
## 🎯 Overview
A system designed to **automate the creation and updating of technical documentation** for complex infrastructure systems (VMware, Kubernetes, Linux, Cisco, etc.) using a local Large Language Model (Qwen).
### Key Features
- **Asynchronous data collection** from multiple infrastructure systems
- **Security isolation**: the LLM never accesses live systems
- **Change detection**: documentation is generated only when changes are detected
- **Redis cache** for data storage and performance
- **Local on-premise LLM** (Qwen) via MCP Server
- **Human-in-the-loop validation** with a GitOps workflow
- **Automated CI/CD** for publishing
## 🏗️ Architecture
The system is split into **3 main flows**:
1. **Data Collection (Background)**: connectors periodically query the infrastructure systems via API and update Redis
2. **Change Detection**: a change-detection system that triggers documentation generation only when needed
3. **Generation & Publication (Triggered)**: the local LLM (Qwen) generates markdown by reading from Redis, followed by human review and automated deployment
> **Security principle**: the LLM never has direct access to the infrastructure systems. All data is read from Redis.
> **Efficiency principle**: documentation is generated only when the system detects changes in the infrastructure configuration.
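The change-detection step can be sketched as a hash comparison over each collected configuration snapshot. This is an illustration only: the real detector stores and compares hashes in Redis rather than an in-memory dict.

```python
# Sketch of hash-based change detection: a new snapshot triggers
# documentation generation only when its content hash differs from the
# last stored hash. The real service keeps hashes in Redis.
import hashlib
import json

_last_hash: dict = {}  # resource key -> last seen hash

def config_hash(snapshot: dict) -> str:
    canonical = json.dumps(snapshot, sort_keys=True)  # stable serialization
    return hashlib.sha256(canonical.encode()).hexdigest()

def has_changed(key: str, snapshot: dict) -> bool:
    h = config_hash(snapshot)
    if _last_hash.get(key) == h:
        return False          # identical config: no regeneration
    _last_hash[key] = h
    return True               # new or changed: trigger generation

vm = {"name": "web01", "cpus": 4}
print(has_changed("vmware:vc1:vm:web01", vm))  # True (first sighting)
print(has_changed("vmware:vc1:vm:web01", vm))  # False (unchanged)
vm["cpus"] = 8
print(has_changed("vmware:vc1:vm:web01", vm))  # True (changed)
```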
---
## 📊 Architectural Diagram
### Management View
A simplified diagram for executive and management presentations.
```mermaid
graph TB
    %% Styling
    classDef infrastructure fill:#e1f5ff,stroke:#01579b,stroke-width:3px,color:#333
    classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:3px,color:#333
    classDef change fill:#fff3e0,stroke:#e65100,stroke-width:3px,color:#333
    classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:3px,color:#333
    classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:3px,color:#333
    classDef human fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#333
    %% ========================================
    %% FLOW 1: DATA COLLECTION (Background)
    %% ========================================
    INFRA[("🏢 INFRASTRUCTURE<br/>SYSTEMS<br/><br/>VMware | K8s | Linux | Cisco")]:::infrastructure
    CONN["🔌 CONNECTORS<br/>Automatic Polling"]:::infrastructure
    REDIS[("💾 REDIS CACHE<br/>Infrastructure<br/>Configuration")]:::cache
    INFRA -->|"Continuous<br/>API Polling"| CONN
    CONN -->|"Configuration<br/>Update"| REDIS
    %% ========================================
    %% CHANGE DETECTION
    %% ========================================
    CHANGE["🔍 CHANGE DETECTOR<br/>Detects Configuration<br/>Changes"]:::change
    REDIS -->|"Monitor<br/>Changes"| CHANGE
    %% ========================================
    %% FLOW 2: DOCUMENTATION GENERATION (Triggered)
    %% ========================================
    TRIGGER["⚡ TRIGGER<br/>Only on changes"]:::change
    USER["👤 USER<br/>Manual Request"]:::human
    LLM["🤖 LLM ENGINE<br/>Qwen (Local)"]:::llm
    MCP["🔧 MCP SERVER<br/>API Control Platform"]:::llm
    DOC["📄 DOCUMENT<br/>Generated Markdown"]:::llm
    CHANGE -->|"Changes<br/>Detected"| TRIGGER
    USER -.->|"Optional"| TRIGGER
    TRIGGER -->|"Start<br/>Generation"| LLM
    LLM -->|"Tool Call"| MCP
    MCP -->|"Query"| REDIS
    REDIS -->|"Config Data"| MCP
    MCP -->|"Context"| LLM
    LLM -->|"Generates"| DOC
    %% ========================================
    %% FLOW 3: VALIDATION AND PUBLICATION
    %% ========================================
    GIT["📦 GITLAB<br/>Repository"]:::git
    PR["🔀 PULL REQUEST<br/>Automated Review"]:::git
    TECH["👨‍💼 TECHNICAL TEAM<br/>Human Validation"]:::human
    PIPELINE["⚡ CI/CD PIPELINE<br/>GitLab Runner"]:::git
    MKDOCS["📚 MKDOCS<br/>Static Site Generator"]:::git
    WEB["🌐 DOCUMENTATION<br/>GitLab Pages<br/>(Published)"]:::git
    DOC -->|"Push +<br/>Branch"| GIT
    GIT -->|"Creates"| PR
    PR -->|"Notifies"| TECH
    TECH -->|"Approve +<br/>Merge"| GIT
    GIT -->|"Triggers"| PIPELINE
    PIPELINE -->|"Build"| MKDOCS
    MKDOCS -->|"Deploy"| WEB
    %% ========================================
    %% ANNOTATIONS
    %% ========================================
    SECURITY["🔒 SECURITY<br/>LLM isolated from live systems"]:::human
    EFFICIENCY["⚡ EFFICIENCY<br/>Docs generated only<br/>on changes"]:::change
    LLM -.->|"NO<br/>ACCESS"| INFRA
    SECURITY -.-> LLM
    EFFICIENCY -.-> CHANGE
```
---
## 🔧 Technical Diagram
### Implementation View
A detailed diagram for the technical team, with implementation specifics.
```mermaid
graph TB
%% Styling tecnico
classDef infra fill:#e1f5ff,stroke:#01579b,stroke-width:2px,color:#333,font-size:11px
classDef connector fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#333,font-size:11px
classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#333,font-size:11px
classDef change fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#333,font-size:11px
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#333,font-size:11px
classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:2px,color:#333,font-size:11px
classDef monitor fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px
%% =====================================
%% LAYER 1: SOURCE SYSTEMS
%% =====================================
subgraph SOURCES["🏢 INFRASTRUCTURE SOURCES"]
VCENTER["VMware vCenter<br/>API: vSphere REST 7.0+<br/>Port: 443/HTTPS<br/>Auth: API Token"]:::infra
K8S_API["Kubernetes API<br/>API: v1.28+<br/>Port: 6443/HTTPS<br/>Auth: ServiceAccount + RBAC"]:::infra
LINUX["Linux Servers<br/>Protocol: SSH/Ansible<br/>Port: 22<br/>Auth: SSH Keys"]:::infra
CISCO["Cisco Devices<br/>Protocol: NETCONF/RESTCONF<br/>Port: 830/443<br/>Auth: AAA"]:::infra
end
%% =====================================
%% LAYER 2: CONNECTORS
%% =====================================
subgraph CONNECTORS["🔌 DATA COLLECTORS (Python/Go)"]
CONN_VM["VMware Collector<br/>Lang: Python 3.11<br/>Lib: pyvmomi<br/>Schedule: */15 * * * *<br/>Output: JSON → Redis"]:::connector
CONN_K8S["K8s Collector<br/>Lang: Python 3.11<br/>Lib: kubernetes-client<br/>Schedule: */5 * * * *<br/>Resources: pods,svc,ing,deploy"]:::connector
CONN_LNX["Linux Collector<br/>Lang: Python 3.11<br/>Lib: paramiko/ansible<br/>Schedule: */30 * * * *<br/>Data: sysinfo,packages,services"]:::connector
CONN_CSC["Cisco Collector<br/>Lang: Python 3.11<br/>Lib: ncclient<br/>Schedule: */30 * * * *<br/>Data: interfaces,routing,vlans"]:::connector
end
VCENTER -->|"GET /api/vcenter/vm"| CONN_VM
K8S_API -->|"kubectl proxy<br/>API calls"| CONN_K8S
LINUX -->|"SSH batch<br/>commands"| CONN_LNX
CISCO -->|"NETCONF<br/>get-config"| CONN_CSC
%% =====================================
%% LAYER 3: REDIS STORAGE
%% =====================================
subgraph STORAGE["💾 REDIS CLUSTER"]
REDIS_CLUSTER["Redis Cluster<br/>Mode: Cluster (6 nodes)<br/>Port: 6379<br/>Persistence: RDB + AOF<br/>Memory: 64GB<br/>Eviction: allkeys-lru"]:::cache
REDIS_KEYS["Key Structure:<br/>• vmware:vcenter-id:vms:hash<br/>• k8s:cluster:namespace:resource:hash<br/>• linux:hostname:info:hash<br/>• cisco:device-id:config:hash<br/>• changelog:timestamp:diff<br/>TTL: 30d for data, 90d for changelog"]:::cache
end
CONN_VM -->|"HSET/HMSET<br/>+ Hash Storage"| REDIS_CLUSTER
CONN_K8S -->|"HSET/HMSET<br/>+ Hash Storage"| REDIS_CLUSTER
CONN_LNX -->|"HSET/HMSET<br/>+ Hash Storage"| REDIS_CLUSTER
CONN_CSC -->|"HSET/HMSET<br/>+ Hash Storage"| REDIS_CLUSTER
REDIS_CLUSTER --> REDIS_KEYS
%% =====================================
%% LAYER 4: CHANGE DETECTION
%% =====================================
subgraph CHANGE_DETECTION["🔍 CHANGE DETECTION SYSTEM"]
DETECTOR["Change Detector Service<br/>Lang: Python 3.11<br/>Lib: redis-py<br/>Algorithm: Hash comparison<br/>Check interval: */5 * * * *"]:::change
DIFF_ENGINE["Diff Engine<br/>• Deep object comparison<br/>• JSON diff generation<br/>• Change classification<br/>• Severity assessment"]:::change
CHANGE_LOG["Change Log Store<br/>Key: changelog:*<br/>Data: diff JSON + metadata<br/>Indexed by: timestamp, resource"]:::change
NOTIFIER["Change Notifier<br/>• Webhook triggers<br/>• Slack notifications<br/>• Event emission<br/>Target: LLM trigger"]:::change
end
REDIS_CLUSTER -->|"Monitor<br/>key changes"| DETECTOR
DETECTOR --> DIFF_ENGINE
DIFF_ENGINE -->|"Store diff"| CHANGE_LOG
CHANGE_LOG --> REDIS_CLUSTER
DIFF_ENGINE -->|"Notify if<br/>significant"| NOTIFIER
%% =====================================
%% LAYER 5: LLM TRIGGER & GENERATION
%% =====================================
subgraph TRIGGER_SYSTEM["⚡ TRIGGER SYSTEM"]
TRIGGER_SVC["Trigger Service<br/>Lang: Python 3.11<br/>Listen: Webhook + Redis Pub/Sub<br/>Debounce: 5 min<br/>Batch: multiple changes"]:::change
QUEUE["Generation Queue<br/>Type: Redis List<br/>Priority: High/Medium/Low<br/>Processing: FIFO"]:::change
end
NOTIFIER -->|"Trigger event"| TRIGGER_SVC
TRIGGER_SVC -->|"Enqueue<br/>generation task"| QUEUE
subgraph LLM_LAYER["🤖 AI GENERATION LAYER"]
LLM_ENGINE["LLM Engine<br/>Model: Qwen (Local)<br/>API: Ollama/vLLM/LM Studio<br/>Port: 11434<br/>Temp: 0.3<br/>Max Tokens: 4096<br/>Timeout: 120s"]:::llm
MCP_SERVER["MCP Server<br/>Lang: TypeScript/Node.js<br/>Port: 3000<br/>Protocol: JSON-RPC 2.0<br/>Auth: JWT tokens"]:::llm
MCP_TOOLS["MCP Tools:<br/>• getVMwareInventory(vcenter)<br/>• getK8sResources(cluster,ns,type)<br/>• getLinuxSystemInfo(hostname)<br/>• getCiscoConfig(device,section)<br/>• getChangelog(start,end,resource)<br/>Return: JSON + Metadata"]:::llm
end
QUEUE -->|"Dequeue<br/>task"| LLM_ENGINE
LLM_ENGINE <-->|"Tool calls<br/>JSON-RPC"| MCP_SERVER
MCP_SERVER --> MCP_TOOLS
MCP_TOOLS -->|"HGETALL/MGET<br/>Read data"| REDIS_CLUSTER
REDIS_CLUSTER -->|"Config data<br/>+ Changelog"| MCP_TOOLS
MCP_TOOLS -->|"Structured Data<br/>+ Context"| LLM_ENGINE
subgraph OUTPUT["📝 DOCUMENT GENERATION"]
TEMPLATE["Template Engine<br/>Format: Jinja2<br/>Templates: markdown/*.j2<br/>Variables: from LLM"]:::llm
MARKDOWN["Markdown Output<br/>Format: CommonMark<br/>Metadata: YAML frontmatter<br/>Change summary included<br/>Assets: diagrams in mermaid"]:::llm
VALIDATOR["Doc Validator<br/>• Markdown linting<br/>• Link checking<br/>• Schema validation<br/>• Change verification"]:::llm
end
LLM_ENGINE --> TEMPLATE
TEMPLATE --> MARKDOWN
MARKDOWN --> VALIDATOR
%% =====================================
%% LAYER 6: GITOPS
%% =====================================
subgraph GITOPS["🔄 GITOPS WORKFLOW"]
GIT_REPO["GitLab Repository<br/>URL: gitlab.com/docs/infra<br/>Branch strategy: main + feature/*<br/>Protected: main (require approval)"]:::git
GIT_API["GitLab API<br/>API: v4<br/>Auth: Project Access Token<br/>Permissions: api, write_repo"]:::git
PR_AUTO["Automated PR Creator<br/>Lang: Python 3.11<br/>Lib: python-gitlab<br/>Template: .gitlab/merge_request.md<br/>Include: change summary"]:::git
end
VALIDATOR -->|"git add/commit/push"| GIT_REPO
GIT_REPO <--> GIT_API
GIT_API --> PR_AUTO
REVIEWER["👨‍💼 Technical Reviewer<br/>Role: Maintainer/Owner<br/>Review: diff + validation<br/>Check: change correlation<br/>Approve: required (min 1)"]:::monitor
PR_AUTO -->|"Notification<br/>Email + Slack"| REVIEWER
REVIEWER -->|"Merge to main"| GIT_REPO
%% =====================================
%% LAYER 7: CI/CD & PUBLISH
%% =====================================
subgraph CICD["⚡ CI/CD PIPELINE"]
GITLAB_CI["GitLab CI/CD<br/>Runner: docker<br/>Image: python:3.11-alpine<br/>Stages: build, test, deploy"]:::git
PIPELINE_JOBS["Pipeline Jobs:<br/>1. lint (markdownlint-cli)<br/>2. build (mkdocs build)<br/>3. test (link-checker)<br/>4. deploy (rsync/s3)"]:::git
MKDOCS_CFG["MkDocs Config<br/>Theme: material<br/>Plugins: search, tags, mermaid<br/>Extensions: admonition, codehilite"]:::git
end
GIT_REPO -->|"on: push to main<br/>Webhook trigger"| GITLAB_CI
GITLAB_CI --> PIPELINE_JOBS
PIPELINE_JOBS --> MKDOCS_CFG
subgraph PUBLISH["🌐 PUBLICATION"]
STATIC_SITE["Static Site<br/>Generator: MkDocs<br/>Output: HTML/CSS/JS<br/>Assets: optimized images"]:::git
CDN["GitLab Pages / S3 + CloudFront<br/>URL: docs.company.com<br/>SSL: Let's Encrypt<br/>Cache: 1h"]:::git
SEARCH["Search Index<br/>Engine: Algolia/Meilisearch<br/>Update: on publish<br/>API: REST"]:::git
end
MKDOCS_CFG -->|"mkdocs build<br/>--strict"| STATIC_SITE
STATIC_SITE --> CDN
STATIC_SITE --> SEARCH
%% =====================================
%% LAYER 8: MONITORING & OBSERVABILITY
%% =====================================
subgraph OBSERVABILITY["📊 MONITORING & LOGGING"]
PROMETHEUS["Prometheus<br/>Metrics: collector updates, changes detected<br/>Scrape: 30s<br/>Retention: 15d"]:::monitor
GRAFANA["Grafana Dashboards<br/>• Collector status<br/>• Redis performance<br/>• Change detection rate<br/>• LLM response times<br/>• Pipeline success rate"]:::monitor
ELK["ELK Stack<br/>Logs: all components<br/>Index: daily rotation<br/>Retention: 30d"]:::monitor
ALERTS["Alerting<br/>• Collector failures<br/>• Redis issues<br/>• Change detection errors<br/>• Pipeline failures<br/>Channel: Slack + PagerDuty"]:::monitor
end
CONN_VM -.->|"metrics"| PROMETHEUS
CONN_K8S -.->|"metrics"| PROMETHEUS
REDIS_CLUSTER -.->|"metrics"| PROMETHEUS
DETECTOR -.->|"metrics"| PROMETHEUS
MCP_SERVER -.->|"metrics"| PROMETHEUS
GITLAB_CI -.->|"metrics"| PROMETHEUS
PROMETHEUS --> GRAFANA
CONN_VM -.->|"logs"| ELK
DETECTOR -.->|"logs"| ELK
MCP_SERVER -.->|"logs"| ELK
GITLAB_CI -.->|"logs"| ELK
GRAFANA --> ALERTS
%% =====================================
%% SECURITY & EFFICIENCY ANNOTATIONS
%% =====================================
SEC1["🔒 SECURITY:<br/>• All APIs use TLS 1.3<br/>• Secrets in Vault/K8s Secrets<br/>• Network: private VPC<br/>• LLM has NO direct access"]:::monitor
SEC2["🔐 AUTHENTICATION:<br/>• API Tokens rotated 90d<br/>• RBAC enforced<br/>• Audit logs enabled<br/>• MFA required for Git"]:::monitor
EFF1["⚡ EFFICIENCY:<br/>• Doc generation only on changes<br/>• Debounce prevents spam<br/>• Hash-based change detection<br/>• Batch processing"]:::change
SEC1 -.-> MCP_SERVER
SEC2 -.-> GIT_REPO
EFF1 -.-> DETECTOR
```
---
## 💬 Conversational RAG System
### Querying the Documentation with AI
A system for "talking" to the documentation using Retrieval Augmented Generation (RAG). Users ask questions in natural language and receive accurate answers grounded in the documentation, complete with source citations.
#### Key Features
- **Semantic Search**: Vector search that captures the intent of the query
- **Scalability**: Handles large documentation volumes (100k+ documents)
- **Performance**: Answers in <3 seconds with intelligent caching
- **Accuracy**: Re-ranking and source attribution for precise answers
- **Local LLM**: On-premise Qwen for privacy and control
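The semantic-search idea behind these features can be sketched in a few lines of plain Python. This is a minimal illustration only: `cosine` and `top_k` are hypothetical helper names, and the toy vectors stand in for the real embedding model and vector database (Qdrant) described below.

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, doc_vecs, k=3):
    # Rank document vectors by similarity to the query vector,
    # returning (score, doc_id) pairs, best first.
    scored = [(cosine(query_vec, v), doc_id) for doc_id, v in doc_vecs.items()]
    return sorted(scored, reverse=True)[:k]
```

In production the same ranking is delegated to the vector database's HNSW index rather than computed in application code.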
### RAG Diagram - Management View
```mermaid
graph TB
%% Styling
classDef docs fill:#e3f2fd,stroke:#1565c0,stroke-width:3px,color:#333
classDef process fill:#f3e5f5,stroke:#4a148c,stroke-width:3px,color:#333
classDef vector fill:#fff3e0,stroke:#e65100,stroke-width:3px,color:#333
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:3px,color:#333
classDef user fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#333
classDef cache fill:#fce4ec,stroke:#880e4f,stroke-width:3px,color:#333
%% ========================================
%% INGESTION PIPELINE (Offline)
%% ========================================
subgraph INGESTION["📚 INGESTION PIPELINE (Offline Process)"]
DOCS["📄 DOCUMENTATION<br/>MkDocs Output<br/>Markdown Files"]:::docs
CHUNKER["✂️ DOCUMENT CHUNKER<br/>Split & Overlap<br/>Metadata Extraction"]:::process
EMBEDDER["🧠 EMBEDDING MODEL<br/>Text → Vectors<br/>Dimension: 768/1024"]:::process
VECTORDB[("🗄️ VECTOR DATABASE<br/>Qdrant/Milvus<br/>Sharded & Replicated")]:::vector
end
DOCS -->|"Parse<br/>Markdown"| CHUNKER
CHUNKER -->|"Text Chunks<br/>+ Metadata"| EMBEDDER
EMBEDDER -->|"Store<br/>Embeddings"| VECTORDB
%% ========================================
%% QUERY PIPELINE (Real-time)
%% ========================================
subgraph QUERY["💬 QUERY PIPELINE (Real-time)"]
USER["👤 USER<br/>Question/Query"]:::user
QUERY_EMBED["🧠 QUERY EMBEDDING<br/>Query → Vector"]:::process
SEARCH["🔍 SEMANTIC SEARCH<br/>Vector Similarity<br/>Top-K Results"]:::vector
RERANK["📊 RE-RANKING<br/>Context Scoring<br/>Relevance Filter"]:::process
CONTEXT["📋 CONTEXT BUILDER<br/>Assemble Chunks<br/>Add Metadata"]:::process
end
USER -->|"Natural Language<br/>Question"| QUERY_EMBED
QUERY_EMBED -->|"Query Vector"| SEARCH
SEARCH -->|"Search"| VECTORDB
VECTORDB -->|"Top-K Chunks<br/>+ Scores"| SEARCH
SEARCH -->|"Initial Results"| RERANK
RERANK -->|"Filtered<br/>Chunks"| CONTEXT
%% ========================================
%% GENERATION (LLM)
%% ========================================
subgraph GENERATION["🤖 ANSWER GENERATION"]
LLM_RAG["🤖 LLM ENGINE<br/>Qwen (Local)<br/>+ RAG Context"]:::llm
ANSWER["💡 ANSWER<br/>Generated Answer<br/>+ Source Citations"]:::llm
end
CONTEXT -->|"Context<br/>+ Sources"| LLM_RAG
LLM_RAG -->|"Generate"| ANSWER
ANSWER -->|"Display"| USER
%% ========================================
%% CACHING & OPTIMIZATION
%% ========================================
CACHE[("💾 REDIS CACHE<br/>Query Cache<br/>Embedding Cache")]:::cache
QUERY_EMBED -.->|"Check Cache"| CACHE
CACHE -.->|"Cached<br/>Embedding"| SEARCH
SEARCH -.->|"Cache<br/>Results"| CACHE
%% ========================================
%% SCALING & UPDATE
%% ========================================
UPDATE["🔄 INCREMENTAL UPDATE<br/>On Doc Changes<br/>Auto Re-index"]:::docs
DOCS -.->|"Doc Updated"| UPDATE
UPDATE -.->|"Re-process<br/>Changed Docs"| CHUNKER
%% ========================================
%% ANNOTATIONS
%% ========================================
SCALE["📈 SCALABILITY<br/>• Vector DB sharding<br/>• Horizontal scaling<br/>• Load balancing"]:::vector
PERF["⚡ PERFORMANCE<br/>• Query cache<br/>• Embedding cache<br/>• Async processing"]:::cache
QUALITY["✅ QUALITY<br/>• Re-ranking<br/>• Relevance scoring<br/>• Source citations"]:::process
SCALE -.-> VECTORDB
PERF -.-> CACHE
QUALITY -.-> RERANK
```
### RAG Diagram - Technical View
```mermaid
graph TB
%% Styling
classDef docs fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#333,font-size:11px
classDef process fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#333,font-size:11px
classDef vector fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#333,font-size:11px
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#333,font-size:11px
classDef user fill:#fff9c4,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px
classDef cache fill:#fce4ec,stroke:#880e4f,stroke-width:2px,color:#333,font-size:11px
classDef monitor fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px
%% =====================================
%% LAYER 1: DOCUMENTATION SOURCE
%% =====================================
subgraph DOCSOURCE["📚 DOCUMENTATION SOURCE"]
MKDOCS_OUT["MkDocs Static Site<br/>Path: /site/<br/>Format: HTML + Markdown<br/>Assets: images, diagrams<br/>Update: on Git merge"]:::docs
DOC_WATCHER["Document Watcher<br/>Lang: Python 3.11<br/>Lib: watchdog<br/>Trigger: file system events<br/>Debounce: 30s"]:::docs
DOC_PARSER["Document Parser<br/>HTML → Plain Text<br/>Preserve structure<br/>Extract metadata<br/>Clean formatting"]:::docs
end
MKDOCS_OUT --> DOC_WATCHER
DOC_WATCHER -->|"New/Modified<br/>Docs"| DOC_PARSER
%% =====================================
%% LAYER 2: CHUNKING STRATEGY
%% =====================================
subgraph CHUNKING["✂️ INTELLIGENT CHUNKING"]
CHUNK_ENGINE["Chunking Engine<br/>Lang: Python 3.11<br/>Lib: langchain/llama-index<br/>Strategy: Recursive Character"]:::process
CHUNK_CONFIG["Chunking Config:<br/>• Chunk Size: 512 tokens<br/>• Overlap: 128 tokens<br/>• Separators: \\n\\n, \\n, . , ' '<br/>• Min chunk: 100 tokens<br/>• Max chunk: 1024 tokens"]:::process
METADATA_EXTRACTOR["Metadata Extractor<br/>Extract:<br/>• Document title<br/>• Section headers<br/>• Tags/keywords<br/>• Creation date<br/>• File path<br/>• Doc type"]:::process
end
DOC_PARSER -->|"Parsed Text"| CHUNK_ENGINE
CHUNK_ENGINE --> CHUNK_CONFIG
CHUNK_ENGINE --> METADATA_EXTRACTOR
%% =====================================
%% LAYER 3: EMBEDDING GENERATION
%% =====================================
subgraph EMBEDDING["🧠 EMBEDDING GENERATION"]
EMBED_MODEL["Embedding Model<br/>Model: all-MiniLM-L6-v2 / BGE-M3<br/>Dim: 384/768/1024<br/>API: sentence-transformers<br/>Batch size: 32<br/>GPU: CUDA acceleration"]:::process
EMBED_CACHE["Embedding Cache<br/>Type: Redis Hash<br/>Key: hash(text)<br/>TTL: 30d<br/>Hit rate target: >80%"]:::cache
EMBED_QUEUE["Processing Queue<br/>Type: Redis List<br/>Workers: 4-8<br/>Rate: 100 chunks/s<br/>Retry: 3 attempts"]:::process
end
METADATA_EXTRACTOR -->|"Chunks<br/>+ Metadata"| EMBED_QUEUE
EMBED_QUEUE --> EMBED_MODEL
EMBED_MODEL <-.->|"Cache<br/>Check/Store"| EMBED_CACHE
%% =====================================
%% LAYER 4: VECTOR DATABASE
%% =====================================
subgraph VECTORDB["🗄️ VECTOR DATABASE CLUSTER"]
QDRANT["Qdrant Cluster<br/>Version: 1.7+<br/>Nodes: 3-6 (replicated)<br/>Shards: auto per collection<br/>Port: 6333/6334"]:::vector
COLLECTIONS["Collections:<br/>• docs_main (dim: 768)<br/>• docs_code (dim: 768)<br/>• docs_api (dim: 768)<br/>Distance: Cosine<br/>Index: HNSW (M=16, ef=100)"]:::vector
SHARD_STRATEGY["Sharding Strategy:<br/>• Auto-sharding enabled<br/>• Shard size: 100k vectors<br/>• Replication factor: 2<br/>• Load balancing: Round-robin"]:::vector
end
EMBED_MODEL -->|"Store<br/>Vectors"| QDRANT
QDRANT --> COLLECTIONS
QDRANT --> SHARD_STRATEGY
%% =====================================
%% LAYER 5: QUERY PROCESSING
%% =====================================
subgraph QUERYPROC["💬 QUERY PROCESSING PIPELINE"]
USER_INPUT["User Input<br/>Interface: Web UI / API<br/>Auth: JWT tokens<br/>Rate limit: 20 req/min<br/>Timeout: 30s"]:::user
QUERY_PREPROCESS["Query Preprocessor<br/>• Spelling correction<br/>• Intent detection<br/>• Query expansion<br/>• Language detection"]:::process
QUERY_EMBEDDER["Query Embedder<br/>Same model as docs<br/>Cache: Redis<br/>Latency: <50ms"]:::process
HYBRID_SEARCH["Hybrid Search<br/>1. Vector search (semantic)<br/>2. Keyword search (BM25)<br/>3. Fusion: RRF algorithm<br/>Top-K: 20 initial results"]:::vector
end
USER_INPUT -->|"Natural<br/>Language"| QUERY_PREPROCESS
QUERY_PREPROCESS --> QUERY_EMBEDDER
QUERY_EMBEDDER <-.->|"Cache"| EMBED_CACHE
QUERY_EMBEDDER -->|"Query<br/>Vector"| HYBRID_SEARCH
HYBRID_SEARCH -->|"Search"| QDRANT
%% =====================================
%% LAYER 6: RE-RANKING & FILTERING
%% =====================================
subgraph RERANK["📊 RE-RANKING & FILTERING"]
RERANKER["Cross-Encoder Re-ranker<br/>Model: ms-marco-MiniLM<br/>Purpose: Fine-grained relevance<br/>Process: Top-20 → Top-5<br/>Latency: 100-200ms"]:::process
FILTER_ENGINE["Filter Engine<br/>• Relevance threshold: >0.7<br/>• Deduplication<br/>• Diversity scoring<br/>• Metadata filtering"]:::process
CONTEXT_BUILDER["Context Builder<br/>• Assemble top chunks<br/>• Add source citations<br/>• Format for LLM<br/>• Max context: 4k tokens"]:::process
end
QDRANT -->|"Top-K<br/>Results"| RERANKER
RERANKER --> FILTER_ENGINE
FILTER_ENGINE --> CONTEXT_BUILDER
%% =====================================
%% LAYER 7: LLM GENERATION
%% =====================================
subgraph LLMGEN["🤖 LLM ANSWER GENERATION"]
RAG_PROMPT["RAG Prompt Template<br/>Structure:<br/>• System: You are a helpful assistant<br/>• Context: Retrieved chunks<br/>• Question: User query<br/>• Instruction: Answer using context"]:::llm
LLM_ENGINE["LLM Engine<br/>Model: Qwen 2.5 (14B/32B)<br/>API: Ollama/vLLM<br/>Port: 11434<br/>Temp: 0.2 (factual)<br/>Max tokens: 2048<br/>Stream: enabled"]:::llm
ANSWER_POST["Answer Post-processor<br/>• Citation formatting<br/>• Source links<br/>• Confidence scoring<br/>• Fallback handling"]:::llm
end
CONTEXT_BUILDER -->|"Context<br/>+ Sources"| RAG_PROMPT
QUERY_PREPROCESS -->|"Original<br/>Question"| RAG_PROMPT
RAG_PROMPT --> LLM_ENGINE
LLM_ENGINE --> ANSWER_POST
ANSWER_POST -->|"Final<br/>Answer"| USER_INPUT
%% =====================================
%% LAYER 8: CACHING LAYER
%% =====================================
subgraph CACHING["💾 MULTI-LEVEL CACHE"]
REDIS_CACHE["Redis Cluster<br/>Mode: Cluster<br/>Nodes: 3<br/>Memory: 16GB<br/>Persistence: AOF"]:::cache
CACHE_TYPES["Cache Types:<br/>• Query embeddings (TTL: 7d)<br/>• Search results (TTL: 1h)<br/>• LLM responses (TTL: 24h)<br/>• Popular queries (no TTL)<br/>Eviction: LRU"]:::cache
CACHE_WARMING["Cache Warming<br/>Pre-compute:<br/>• Top 100 queries<br/>• Common patterns<br/>Schedule: daily<br/>Update: on doc changes"]:::cache
end
REDIS_CACHE --> CACHE_TYPES
CACHE_TYPES --> CACHE_WARMING
QUERY_EMBEDDER <-.-> REDIS_CACHE
HYBRID_SEARCH <-.-> REDIS_CACHE
LLM_ENGINE <-.-> REDIS_CACHE
%% =====================================
%% LAYER 9: SCALING & LOAD BALANCING
%% =====================================
subgraph SCALING["📈 SCALING INFRASTRUCTURE"]
LOAD_BALANCER["Load Balancer<br/>Type: Nginx / HAProxy<br/>Algorithm: Least connections<br/>Health checks: /health<br/>Timeout: 30s"]:::monitor
QUERY_API["Query API Instances<br/>Replicas: 3-10 (auto-scale)<br/>Framework: FastAPI<br/>Container: Docker<br/>Orchestration: K8s"]:::user
EMBED_WORKERS["Embedding Workers<br/>Replicas: 4-8<br/>GPU: Optional<br/>Queue: Redis<br/>Auto-scale: based on queue depth"]:::process
end
LOAD_BALANCER --> QUERY_API
QUERY_API --> USER_INPUT
%% =====================================
%% LAYER 10: MONITORING & OBSERVABILITY
%% =====================================
subgraph MONITORING["📊 MONITORING & ANALYTICS"]
METRICS["Prometheus Metrics<br/>• Query latency (p50, p95, p99)<br/>• Vector search time<br/>• LLM response time<br/>• Cache hit rate<br/>• Embedding generation rate<br/>Scrape: 15s"]:::monitor
DASHBOARDS["Grafana Dashboards<br/>• RAG Performance<br/>• Query analytics<br/>• Resource utilization<br/>• Error tracking<br/>Refresh: real-time"]:::monitor
ANALYTICS["Query Analytics<br/>Track:<br/>• Popular queries<br/>• Failed queries<br/>• Avg relevance scores<br/>• User satisfaction<br/>Storage: TimescaleDB"]:::monitor
ALERTS["Alerting Rules<br/>• Latency > 5s<br/>• Error rate > 5%<br/>• Cache hit < 70%<br/>• Vector DB down<br/>Channel: Slack + Email"]:::monitor
end
METRICS --> DASHBOARDS
DASHBOARDS --> ANALYTICS
ANALYTICS --> ALERTS
QUERY_API -.->|"metrics"| METRICS
HYBRID_SEARCH -.->|"metrics"| METRICS
LLM_ENGINE -.->|"metrics"| METRICS
QDRANT -.->|"metrics"| METRICS
%% =====================================
%% LAYER 11: FEEDBACK LOOP
%% =====================================
subgraph FEEDBACK["🔄 FEEDBACK & IMPROVEMENT"]
USER_FEEDBACK["User Feedback<br/>• Thumbs up/down<br/>• Relevance rating<br/>• Comments<br/>Storage: PostgreSQL"]:::user
FEEDBACK_ANALYSIS["Feedback Analysis<br/>• Identify bad answers<br/>• Track improvement areas<br/>• A/B testing results<br/>Schedule: weekly"]:::monitor
MODEL_TUNING["Model Fine-tuning<br/>• Re-rank model updates<br/>• Prompt optimization<br/>• Chunk size tuning<br/>Cycle: monthly"]:::process
end
USER_INPUT -->|"Rate<br/>Answer"| USER_FEEDBACK
USER_FEEDBACK --> FEEDBACK_ANALYSIS
FEEDBACK_ANALYSIS --> MODEL_TUNING
MODEL_TUNING -.->|"Improve"| RERANKER
%% =====================================
%% ANNOTATIONS
%% =====================================
SCALE_NOTE["📈 SCALABILITY:<br/>• Vector DB: Horizontal sharding<br/>• API: K8s auto-scaling (HPA)<br/>• Workers: Queue-based scaling<br/>• Cache: Redis cluster<br/>Target: 100k+ docs, 1k+ QPS"]:::monitor
PERF_NOTE["⚡ PERFORMANCE TARGETS:<br/>• Query latency: <3s (p95)<br/>• Vector search: <100ms<br/>• LLM generation: <2s<br/>• Cache hit rate: >80%<br/>• Throughput: 1000 QPS"]:::cache
QUALITY_NOTE["✅ QUALITY ASSURANCE:<br/>• Re-ranking for precision<br/>• Source attribution<br/>• Confidence scoring<br/>• Fallback responses<br/>• Human feedback loop"]:::process
SCALE_NOTE -.-> QDRANT
PERF_NOTE -.-> REDIS_CACHE
QUALITY_NOTE -.-> RERANKER
```
### RAG Pipeline
**1. Ingestion Pipeline (Offline)**
- Parse the MkDocs documentation
- Intelligent chunking (512 tokens, 128-token overlap)
- Embedding generation (all-MiniLM-L6-v2)
- Storage in the vector database (Qdrant cluster)
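The chunking step above can be sketched as a sliding window with overlap. A minimal sketch, assuming whitespace-separated words as a rough stand-in for real tokenizer tokens; `chunk_text` is a hypothetical helper, not the project's actual implementation:

```python
def chunk_text(text, chunk_size=512, overlap=128):
    # Sliding-window chunker: each chunk shares `overlap` tokens with
    # the previous one so no sentence is cut off without context.
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + chunk_size]
        if piece:
            chunks.append(" ".join(piece))
        if start + chunk_size >= len(words):
            break
    return chunks
```

Real pipelines typically also split on structural separators (paragraphs, sentences) before falling back to fixed windows, as the technical diagram's "Recursive Character" strategy indicates.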
**2. Query Pipeline (Real-time)**
- Embed the user query
- Hybrid search (semantic + keyword)
- Re-ranking with a cross-encoder
- Context assembly for the LLM
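The hybrid-search fusion mentioned above (the diagrams name Reciprocal Rank Fusion) can be shown in a few lines. A minimal sketch; `rrf_fuse` is a hypothetical helper, and `k=60` is the conventional RRF constant, not a value taken from this project:

```python
def rrf_fuse(rankings, k=60):
    # Reciprocal Rank Fusion: merge several ranked lists of doc ids.
    # Each list contributes 1 / (k + rank) per document, rank starting at 1,
    # so items that rank well in both semantic and keyword search rise to the top.
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```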
**3. Generation**
- Local LLM (Qwen) with RAG context
- Automatic source attribution
- Streamed responses
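The generation step combines retrieved chunks and the user question into a grounded prompt. A minimal sketch of that assembly, under the assumption that chunks arrive as `(source_path, text)` pairs; `build_rag_prompt` and the exact wording are illustrative, not the project's real template:

```python
def build_rag_prompt(question, chunks):
    # Number each retrieved chunk and tag it with its source path so the
    # model can cite sources as [n] in its answer.
    context = "\n\n".join(
        f"[{i}] ({src}) {text}" for i, (src, text) in enumerate(chunks, start=1)
    )
    return (
        "You are a helpful assistant. Answer using ONLY the context below "
        "and cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```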
**4. Scaling Strategy**
- Automatic vector DB sharding
- API instances with K8s auto-scaling
- Redis cluster for multi-level caching
- Load balancing with Nginx
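The multi-level cache keys described in the technical diagram (hash of the text, one namespace per cache level) can be sketched as follows. A minimal illustration with an in-process dict standing in for the Redis cluster; `cache_key` and `cached` are hypothetical names, and TTL/eviction are left to Redis in a real deployment:

```python
import hashlib

CACHE = {}  # stand-in for the Redis cluster

def cache_key(kind, text):
    # Deterministic key per cache level, e.g. "emb:<sha256>" or "llm:<sha256>".
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"{kind}:{digest}"

def cached(kind, text, compute):
    # Return the cached value for (kind, text), computing it on a miss.
    key = cache_key(kind, text)
    if key not in CACHE:
        CACHE[key] = compute(text)
    return CACHE[key]
```

The same keying scheme serves the embedding cache, the search-result cache, and the LLM-response cache; only the namespace prefix and TTL differ.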
---
## 📧 Contacts
- **Team**: Infrastructure Documentation Team
- **Email**: infra-docs@company.com
- **GitLab**: https://gitlab.com/company/infra-docs-automation
---
**Version**: 1.0.0
**Last updated**: 2025-10-28


@@ -4,7 +4,7 @@ FROM python:3.12-slim AS builder
 WORKDIR /build
 # Install Poetry and export plugin
-RUN pip install --no-cache-dir poetry==1.8.0 poetry-plugin-export
+RUN pip install --no-cache-dir poetry==2.2.1 poetry-plugin-export
 # Copy dependency files
 COPY pyproject.toml poetry.lock ./


@@ -4,7 +4,7 @@ FROM python:3.12-slim AS builder
 WORKDIR /build
 # Install Poetry and export plugin
-RUN pip install --no-cache-dir poetry==1.8.0 poetry-plugin-export
+RUN pip install --no-cache-dir poetry==2.2.1 poetry-plugin-export
 # Copy dependency files
 COPY pyproject.toml poetry.lock ./
@@ -36,7 +36,6 @@ RUN pip install --no-cache-dir -r requirements.txt
 COPY src/ /app/src/
 COPY config/ /app/config/
 COPY scripts/ /app/scripts/
-COPY output/ /app/output/
 COPY pyproject.toml README.md /app/
 # Install poetry-core (required for install with pyproject.toml)
@@ -46,7 +45,7 @@ RUN pip install --no-cache-dir poetry-core
 RUN pip install --no-cache-dir /app
 # Set PYTHONPATH to ensure module can be imported
-ENV PYTHONPATH=/app/src:$PYTHONPATH
+ENV PYTHONPATH=/app/src
 # Create necessary directories
 RUN mkdir -p /app/logs /app/data /app/output /app/scripts


@@ -4,7 +4,7 @@ FROM python:3.12-slim AS builder
 WORKDIR /build
 # Install Poetry and export plugin
-RUN pip install --no-cache-dir poetry==1.8.0 poetry-plugin-export
+RUN pip install --no-cache-dir poetry==2.2.1 poetry-plugin-export
 # Copy dependency files
 COPY pyproject.toml poetry.lock ./

deploy/helm/README.md Normal file

@@ -0,0 +1,400 @@
# Helm Deployment
This directory contains Helm charts for deploying the Datacenter Docs & Remediation Engine on Kubernetes.
## Contents
- `datacenter-docs/` - Main Helm chart for the application
- `test-chart.sh` - Automated testing script for chart validation
## Quick Start
### Prerequisites
- Kubernetes cluster (1.19+)
- Helm 3.0+
- kubectl configured to access your cluster
### Development/Testing Installation
```bash
# Install with development settings (minimal resources, local testing)
helm install dev ./datacenter-docs -f ./datacenter-docs/values-development.yaml
# Access the application
kubectl port-forward svc/dev-datacenter-docs-api 8000:8000
kubectl port-forward svc/dev-datacenter-docs-frontend 8080:80
# View API docs: http://localhost:8000/api/docs
# View frontend: http://localhost:8080
```
### Production Installation
```bash
# Copy and customize production values
cp datacenter-docs/values-production.yaml my-production-values.yaml
# Edit my-production-values.yaml:
# - Change all secrets (llmApiKey, apiSecretKey, mongodbPassword)
# - Update ingress hosts
# - Adjust resource limits
# - Configure LLM provider
# - Review auto-remediation settings
# Install
helm install prod ./datacenter-docs -f my-production-values.yaml
# Verify deployment
helm list
kubectl get pods
kubectl get ingress
```
## Chart Structure
```
datacenter-docs/
├── Chart.yaml                   # Chart metadata
├── values.yaml                  # Default configuration
├── values-development.yaml      # Development settings
├── values-production.yaml       # Production example
├── README.md                    # Detailed chart documentation
├── .helmignore                  # Files to exclude from package
└── templates/
    ├── NOTES.txt                # Post-install instructions
    ├── _helpers.tpl             # Template helpers
    ├── configmap.yaml           # Application configuration
    ├── secrets.yaml             # Sensitive data
    ├── serviceaccount.yaml      # Service account
    ├── mongodb-statefulset.yaml # MongoDB StatefulSet
    ├── mongodb-service.yaml     # MongoDB Service
    ├── redis-deployment.yaml    # Redis Deployment
    ├── redis-service.yaml       # Redis Service
    ├── api-deployment.yaml      # API Deployment
    ├── api-service.yaml         # API Service
    ├── api-hpa.yaml             # API autoscaling
    ├── chat-deployment.yaml     # Chat Deployment
    ├── chat-service.yaml        # Chat Service
    ├── worker-deployment.yaml   # Worker Deployment
    ├── worker-hpa.yaml          # Worker autoscaling
    ├── frontend-deployment.yaml # Frontend Deployment
    ├── frontend-service.yaml    # Frontend Service
    └── ingress.yaml             # Ingress configuration
```
## Testing the Chart
Run the automated test script:
```bash
cd deploy/helm
./test-chart.sh
```
This will:
1. Lint the chart
2. Render templates with different value files
3. Perform dry-run installation
4. Validate Kubernetes manifests
5. Package the chart
## Common Operations
### Upgrade Release
```bash
# Upgrade with new values
helm upgrade prod ./datacenter-docs -f my-production-values.yaml
# Upgrade with specific parameter changes
helm upgrade prod ./datacenter-docs --set api.replicaCount=10 --reuse-values
```
### Check Status
```bash
# List releases
helm list
# Get release status
helm status prod
# Get current values
helm get values prod
# Get all manifests
helm get manifest prod
```
### Rollback
```bash
# View revision history
helm history prod
# Rollback to previous version
helm rollback prod
# Rollback to specific revision
helm rollback prod 2
```
### Uninstall
```bash
# Uninstall release
helm uninstall prod
# Also delete PVCs (if using persistent storage)
kubectl delete pvc -l app.kubernetes.io/instance=prod
```
## Configuration Files
### values.yaml
Default configuration with reasonable settings for development/testing.
### values-development.yaml
Optimized for local development:
- Minimal resource requests/limits
- Single replicas
- Persistence disabled
- Dry-run mode for auto-remediation
- Debug logging
- Ingress disabled (use port-forward)
### values-production.yaml
Example production configuration:
- Higher resource limits
- Multiple replicas
- Autoscaling enabled
- Persistence enabled with larger volumes
- TLS/SSL enabled
- Production-grade security settings
- All components enabled
**Important**: Copy and customize this file for your environment. Never use default secrets!
## Available Components
| Component | Purpose | Default Enabled |
|-----------|---------|-----------------|
| MongoDB | Document database | Yes |
| Redis | Cache & task queue | Yes |
| API | REST API service | Yes |
| Chat | WebSocket server | No (not implemented) |
| Worker | Celery background tasks | No (not implemented) |
| Frontend | Web UI | Yes |
Enable/disable components in your values file:
```yaml
mongodb:
enabled: true
redis:
enabled: true
api:
enabled: true
chat:
enabled: false # Set to true when implemented
worker:
enabled: false # Set to true when implemented
frontend:
enabled: true
```
## Architecture
The chart deploys a complete microservices architecture:
```
                ┌─────────────┐
                │   Ingress   │
                └──────┬──────┘
         ┌─────────────┼─────────────┐
         │             │             │
    ┌────▼────┐   ┌────▼────┐   ┌────▼────┐
    │Frontend │   │   API   │   │  Chat   │
    └─────────┘   └────┬────┘   └────┬────┘
                       │             │
         ┌─────────────┼─────────────┘
         │             │
    ┌────▼────┐   ┌────▼────┐
    │  Redis  │   │ MongoDB │
    └─────────┘   └─────────┘
    ┌────┴────┐
    │ Worker  │
    └─────────┘
```
## LLM Provider Configuration
The chart supports multiple LLM providers. Configure in your values file:
### OpenAI
```yaml
config:
llm:
baseUrl: "https://api.openai.com/v1"
model: "gpt-4-turbo-preview"
secrets:
llmApiKey: "sk-your-openai-key"
```
### Anthropic Claude
```yaml
config:
llm:
baseUrl: "https://api.anthropic.com/v1"
model: "claude-3-opus-20240229"
secrets:
llmApiKey: "sk-ant-your-anthropic-key"
```
### Local (Ollama)
```yaml
config:
llm:
baseUrl: "http://ollama-service:11434/v1"
model: "llama2"
secrets:
llmApiKey: "not-needed"
```
### Azure OpenAI
```yaml
config:
llm:
baseUrl: "https://your-resource.openai.azure.com"
model: "gpt-4"
secrets:
llmApiKey: "your-azure-key"
```
## Security Best Practices
For production deployments:
1. **Change all default secrets**
```bash
helm install prod ./datacenter-docs \
--set secrets.llmApiKey="your-actual-key" \
--set secrets.apiSecretKey="$(openssl rand -base64 32)" \
--set secrets.mongodbPassword="$(openssl rand -base64 32)"
```
2. **Use external secret management**
- HashiCorp Vault
- AWS Secrets Manager
- Azure Key Vault
- Kubernetes External Secrets Operator
3. **Enable TLS/SSL**
```yaml
ingress:
annotations:
cert-manager.io/cluster-issuer: "letsencrypt-prod"
tls:
- secretName: datacenter-docs-tls
hosts:
- datacenter-docs.yourdomain.com
```
4. **Review auto-remediation settings**
```yaml
config:
autoRemediation:
enabled: true
minReliabilityScore: 95.0 # High threshold for production
dryRun: true # Test first, then set to false
```
5. **Implement network policies**
6. **Enable resource quotas**
7. **Regular security scanning**
## Monitoring and Observability
The chart is designed to integrate with:
- **Prometheus**: Metrics collection
- **Grafana**: Visualization
- **Jaeger**: Distributed tracing
- **ELK/Loki**: Log aggregation
Add annotations to enable monitoring:
```yaml
podAnnotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8000"
prometheus.io/path: "/metrics"
```
## Troubleshooting
### Pods not starting
```bash
# Check pod status
kubectl get pods -l app.kubernetes.io/instance=prod
# Describe pod for events
kubectl describe pod <pod-name>
# View logs
kubectl logs <pod-name> -f
```
### Storage issues
```bash
# Check PVC status
kubectl get pvc
# Check storage class
kubectl get storageclass
# Manually create PVC if needed
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mongodb-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
EOF
```
### Ingress not working
```bash
# Check ingress status
kubectl get ingress
kubectl describe ingress prod-datacenter-docs
# Check ingress controller logs
kubectl logs -n ingress-nginx -l app.kubernetes.io/component=controller -f
```
## Support
For detailed documentation, see:
- Chart README: `datacenter-docs/README.md`
- Main project: `../../README.md`
- Issues: https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine/issues
## License
See the main repository for license information.


@@ -0,0 +1,32 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
# CI/CD
.github/
.gitlab-ci.yml
.gitea/
# Documentation
README.md
NOTES.md
# Development files
*.log


@@ -0,0 +1,19 @@
apiVersion: v2
name: datacenter-docs
description: A Helm chart for LLM Automation - Docs & Remediation Engine
type: application
version: 0.1.0
appVersion: "0.1.0"
keywords:
- datacenter
- documentation
- ai
- automation
- remediation
- llm
maintainers:
- name: Datacenter Docs Team
home: https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine
sources:
- https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine
dependencies: []


@@ -0,0 +1,423 @@
# Datacenter Docs & Remediation Engine - Helm Chart
Helm chart for deploying the LLM Automation - Docs & Remediation Engine on Kubernetes.
## Overview
This chart deploys a complete stack including:
- **MongoDB**: Document database for storing tickets, documentation, and metadata
- **Redis**: Cache and task queue backend
- **API Service**: FastAPI REST API with auto-remediation capabilities
- **Chat Service**: WebSocket server for real-time documentation queries (optional, not yet implemented)
- **Worker Service**: Celery workers for background tasks (optional, not yet implemented)
- **Frontend**: React-based web interface
## Prerequisites
- Kubernetes 1.19+
- Helm 3.0+
- PersistentVolume provisioner support in the underlying infrastructure (for MongoDB persistence)
- Ingress controller (optional, for external access)
## Installation
### Quick Start
```bash
# Add the chart repository (if published)
helm repo add datacenter-docs https://your-repo-url
helm repo update
# Install with default values
helm install my-datacenter-docs datacenter-docs/datacenter-docs
# Or install from local directory
helm install my-datacenter-docs ./datacenter-docs
```
### Production Installation
For production, create a custom `values.yaml`:
```bash
# Copy and edit the values file
cp values.yaml my-values.yaml
# Edit my-values.yaml with your configuration
# At minimum, change:
# - secrets.llmApiKey
# - secrets.apiSecretKey
# - ingress.hosts
# Install with custom values
helm install my-datacenter-docs ./datacenter-docs -f my-values.yaml
```
### Install with Specific Configuration
```bash
helm install my-datacenter-docs ./datacenter-docs \
--set secrets.llmApiKey="sk-your-openai-api-key" \
--set secrets.apiSecretKey="your-strong-secret-key" \
--set ingress.hosts[0].host="datacenter-docs.yourdomain.com" \
--set mongodb.persistence.size="50Gi"
```
## Configuration
### Key Configuration Parameters
#### Global Settings
| Parameter | Description | Default |
|-----------|-------------|---------|
| `global.imagePullPolicy` | Image pull policy | `IfNotPresent` |
| `global.storageClass` | Storage class for PVCs | `""` |
#### MongoDB
| Parameter | Description | Default |
|-----------|-------------|---------|
| `mongodb.enabled` | Enable MongoDB | `true` |
| `mongodb.image.repository` | MongoDB image | `mongo` |
| `mongodb.image.tag` | MongoDB version | `7` |
| `mongodb.auth.rootUsername` | Root username | `admin` |
| `mongodb.auth.rootPassword` | Root password | `admin123` |
| `mongodb.persistence.enabled` | Enable persistence | `true` |
| `mongodb.persistence.size` | Volume size | `10Gi` |
| `mongodb.resources.requests.memory` | Memory request | `512Mi` |
| `mongodb.resources.limits.memory` | Memory limit | `2Gi` |
#### Redis
| Parameter | Description | Default |
|-----------|-------------|---------|
| `redis.enabled` | Enable Redis | `true` |
| `redis.image.repository` | Redis image | `redis` |
| `redis.image.tag` | Redis version | `7-alpine` |
| `redis.resources.requests.memory` | Memory request | `128Mi` |
| `redis.resources.limits.memory` | Memory limit | `512Mi` |
#### API Service
| Parameter | Description | Default |
|-----------|-------------|---------|
| `api.enabled` | Enable API service | `true` |
| `api.replicaCount` | Number of replicas | `2` |
| `api.image.repository` | API image repository | `datacenter-docs-api` |
| `api.image.tag` | API image tag | `latest` |
| `api.service.port` | Service port | `8000` |
| `api.autoscaling.enabled` | Enable HPA | `true` |
| `api.autoscaling.minReplicas` | Min replicas | `2` |
| `api.autoscaling.maxReplicas` | Max replicas | `10` |
| `api.resources.requests.memory` | Memory request | `512Mi` |
| `api.resources.limits.memory` | Memory limit | `2Gi` |
#### Worker Service
| Parameter | Description | Default |
|-----------|-------------|---------|
| `worker.enabled` | Enable worker service | `false` |
| `worker.replicaCount` | Number of replicas | `3` |
| `worker.autoscaling.enabled` | Enable HPA | `true` |
| `worker.autoscaling.minReplicas` | Min replicas | `1` |
| `worker.autoscaling.maxReplicas` | Max replicas | `10` |
#### Chat Service
| Parameter | Description | Default |
|-----------|-------------|---------|
| `chat.enabled` | Enable chat service | `false` |
| `chat.replicaCount` | Number of replicas | `1` |
| `chat.service.port` | Service port | `8001` |
#### Frontend
| Parameter | Description | Default |
|-----------|-------------|---------|
| `frontend.enabled` | Enable frontend | `true` |
| `frontend.replicaCount` | Number of replicas | `2` |
| `frontend.service.port` | Service port | `80` |
#### Ingress
| Parameter | Description | Default |
|-----------|-------------|---------|
| `ingress.enabled` | Enable ingress | `true` |
| `ingress.className` | Ingress class | `nginx` |
| `ingress.hosts[0].host` | Hostname | `datacenter-docs.example.com` |
| `ingress.tls[0].secretName` | TLS secret name | `datacenter-docs-tls` |
#### Application Configuration
| Parameter | Description | Default |
|-----------|-------------|---------|
| `config.llm.baseUrl` | LLM provider URL | `https://api.openai.com/v1` |
| `config.llm.model` | LLM model | `gpt-4-turbo-preview` |
| `config.autoRemediation.enabled` | Enable auto-remediation | `true` |
| `config.autoRemediation.minReliabilityScore` | Min reliability score | `85.0` |
| `config.autoRemediation.dryRun` | Dry run mode | `false` |
| `config.logLevel` | Log level | `INFO` |
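For reference, the same application settings expressed as a values-file fragment (a sketch using the defaults listed in the table above):

```yaml
# values.yaml — application configuration defaults
config:
  llm:
    baseUrl: "https://api.openai.com/v1"
    model: "gpt-4-turbo-preview"
  autoRemediation:
    enabled: true
    minReliabilityScore: 85.0
    dryRun: false
  logLevel: "INFO"
```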
#### Secrets
| Parameter | Description | Default |
|-----------|-------------|---------|
| `secrets.llmApiKey` | LLM API key | `sk-your-openai-api-key-here` |
| `secrets.apiSecretKey` | API secret key | `your-secret-key-here-change-in-production` |
**IMPORTANT**: Change these secrets in production!
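Both defaults are placeholders, and the chart's post-install notes warn if they are left unchanged. A quick way to generate a strong `apiSecretKey` (a sketch; assumes `openssl` is available):

```bash
# 32 random bytes, hex-encoded -> a 64-character secret
API_SECRET_KEY="$(openssl rand -hex 32)"
echo "${#API_SECRET_KEY}"   # prints 64
```

Pass it with `--set secrets.apiSecretKey="$API_SECRET_KEY"` or, better, via a values file kept out of version control.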
## Usage Examples
### Enable All Services (including chat and worker)
```bash
helm install my-datacenter-docs ./datacenter-docs \
--set chat.enabled=true \
--set worker.enabled=true
```
### Disable Auto-Remediation
```bash
helm install my-datacenter-docs ./datacenter-docs \
--set config.autoRemediation.enabled=false
```
### Use Different LLM Provider (e.g., Anthropic Claude)
```bash
helm install my-datacenter-docs ./datacenter-docs \
--set config.llm.baseUrl="https://api.anthropic.com/v1" \
--set config.llm.model="claude-3-opus-20240229" \
--set secrets.llmApiKey="sk-ant-your-anthropic-key"
```
### Use Local LLM (e.g., Ollama)
```bash
helm install my-datacenter-docs ./datacenter-docs \
--set config.llm.baseUrl="http://ollama-service:11434/v1" \
--set config.llm.model="llama2" \
--set secrets.llmApiKey="not-needed"
```
### Scale MongoDB Storage
```bash
helm install my-datacenter-docs ./datacenter-docs \
--set mongodb.persistence.size="100Gi"
```
### Disable Ingress (use port-forward instead)
```bash
helm install my-datacenter-docs ./datacenter-docs \
--set ingress.enabled=false
```
### Production Configuration with External MongoDB
```yaml
# production-values.yaml
mongodb:
  enabled: false

config:
  mongodbUrl: "mongodb://user:pass@external-mongodb:27017/datacenter_docs?authSource=admin"

api:
  replicaCount: 5
  autoscaling:
    maxReplicas: 20

secrets:
  llmApiKey: "sk-your-production-api-key"
  apiSecretKey: "your-production-secret-key"

ingress:
  hosts:
    - host: "datacenter-docs.prod.yourdomain.com"
      paths:
        - path: /
          pathType: Prefix
          service: frontend
        - path: /api
          pathType: Prefix
          service: api
```
```bash
helm install prod-datacenter-docs ./datacenter-docs -f production-values.yaml
```
## Upgrading
```bash
# Upgrade with new values
helm upgrade my-datacenter-docs ./datacenter-docs -f my-values.yaml
# Upgrade specific parameters
helm upgrade my-datacenter-docs ./datacenter-docs \
--set api.image.tag="v1.2.0" \
--reuse-values
```
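If an upgrade misbehaves, Helm keeps a per-release revision history you can roll back to (a sketch; requires a live release):

```bash
# Inspect the revision history for the release
helm history my-datacenter-docs

# Roll back to the previous revision (or pass an explicit revision number)
helm rollback my-datacenter-docs
```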
## Uninstallation
```bash
helm uninstall my-datacenter-docs
```
**Note**: This will delete all resources except PersistentVolumeClaims (PVCs) for MongoDB. To also delete PVCs:
```bash
kubectl delete pvc -l app.kubernetes.io/instance=my-datacenter-docs
```
## Monitoring and Troubleshooting
### Check Pod Status
```bash
kubectl get pods -l app.kubernetes.io/instance=my-datacenter-docs
```
### View Logs
```bash
# API logs
kubectl logs -l app.kubernetes.io/component=api -f
# Worker logs
kubectl logs -l app.kubernetes.io/component=worker -f
# MongoDB logs
kubectl logs -l app.kubernetes.io/component=database -f
```
### Access Services Locally
```bash
# API
kubectl port-forward svc/my-datacenter-docs-api 8000:8000
# Frontend
kubectl port-forward svc/my-datacenter-docs-frontend 8080:80
# MongoDB (for debugging)
kubectl port-forward svc/my-datacenter-docs-mongodb 27017:27017
```
### Common Issues
#### Pods Stuck in Pending
Check if PVCs are bound:
```bash
kubectl get pvc
```
If storage class is missing, set it:
```bash
helm upgrade my-datacenter-docs ./datacenter-docs \
--set mongodb.persistence.storageClass="standard" \
--reuse-values
```
#### API Pods Crash Loop
Check logs:
```bash
kubectl logs -l app.kubernetes.io/component=api --tail=100
```
Common causes:
- MongoDB not ready (wait for init containers)
- Invalid LLM API key
- Missing environment variables
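To rule out a bad key, inspect what the pods actually receive. A sketch, assuming release name `my-datacenter-docs` (the secret name follows this chart's `<fullname>-secrets` convention):

```bash
# Decode the LLM API key stored in the rendered Kubernetes Secret
kubectl get secret my-datacenter-docs-secrets \
  -o jsonpath='{.data.llm-api-key}' | base64 -d; echo
```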
#### Cannot Access via Ingress
Check ingress status:
```bash
kubectl get ingress
kubectl describe ingress my-datacenter-docs
```
Ensure:
- Ingress controller is installed
- DNS points to ingress IP
- TLS certificate is valid (if using HTTPS)
## Security Considerations
### Production Checklist
- [ ] Change `secrets.llmApiKey` to a valid API key
- [ ] Change `secrets.apiSecretKey` to a strong random key
- [ ] Change MongoDB credentials (`mongodb.auth.rootPassword`)
- [ ] Enable TLS/SSL on ingress
- [ ] Review RBAC policies
- [ ] Use external secret management (e.g., HashiCorp Vault, AWS Secrets Manager)
- [ ] Enable network policies
- [ ] Set resource limits on all pods
- [ ] Enforce Pod Security Standards (PodSecurityPolicy was removed in Kubernetes 1.25)
- [ ] Review auto-remediation settings
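For the network-policy item, a minimal sketch that restricts MongoDB to traffic from pods of the same release (selector labels follow this chart's conventions; the release name `my-datacenter-docs` is an assumption, and your CNI must support NetworkPolicy):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mongodb-allow-release-only
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/instance: my-datacenter-docs
      app.kubernetes.io/component: database
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/instance: my-datacenter-docs
      ports:
        - protocol: TCP
          port: 27017
```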
### Using External Secrets
Instead of storing secrets in values.yaml, use Kubernetes secrets:
```bash
# Create secret
kubectl create secret generic datacenter-docs-secrets \
--from-literal=llm-api-key="sk-your-key" \
--from-literal=api-secret-key="your-secret"
# Modify templates to use existing secret
# (requires chart customization)
```
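A common customization pattern, sketched here (not something the chart supports today: it assumes you introduce a `secrets.existingSecret` value and adapt the env blocks in the deployment templates):

```yaml
# Hypothetical template fragment: prefer an existing secret when one is named
- name: LLM_API_KEY
  valueFrom:
    secretKeyRef:
      name: {{ .Values.secrets.existingSecret | default (printf "%s-secrets" (include "datacenter-docs.fullname" .)) }}
      key: llm-api-key
```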
## Development
### Validating the Chart
```bash
# Lint the chart
helm lint ./datacenter-docs
# Dry run
helm install my-test ./datacenter-docs --dry-run --debug
# Template rendering
helm template my-test ./datacenter-docs > rendered.yaml
```
### Testing Locally
```bash
# Create kind cluster
kind create cluster
# Install chart
helm install test ./datacenter-docs \
--set ingress.enabled=false \
--set api.autoscaling.enabled=false \
--set mongodb.persistence.enabled=false
# Test
kubectl port-forward svc/test-datacenter-docs-api 8000:8000
curl http://localhost:8000/health
```
## Support
For issues and questions:
- Issues: https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine/issues
- Documentation: https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine
## License
See the main repository for license information.


@@ -0,0 +1,162 @@
█████████████████████████████████████████████████████████████████████████████
█ █
█ Datacenter Docs & Remediation Engine - Successfully Deployed! █
█ █
█████████████████████████████████████████████████████████████████████████████
Thank you for installing {{ .Chart.Name }}.
Your release is named {{ .Release.Name }}.
Release namespace: {{ .Release.Namespace }}
==============================================================================
📦 INSTALLED COMPONENTS:
==============================================================================
{{- if .Values.mongodb.enabled }}
✓ MongoDB (Database)
{{- end }}
{{- if .Values.redis.enabled }}
✓ Redis (Cache & Task Queue)
{{- end }}
{{- if .Values.api.enabled }}
✓ API Service
{{- end }}
{{- if .Values.chat.enabled }}
✓ Chat Service (WebSocket)
{{- end }}
{{- if .Values.worker.enabled }}
✓ Celery Worker (Background Tasks)
{{- end }}
{{- if .Values.frontend.enabled }}
✓ Frontend (Web UI)
{{- end }}
==============================================================================
🔍 CHECK DEPLOYMENT STATUS:
==============================================================================
kubectl get pods -n {{ .Release.Namespace }} -l app.kubernetes.io/instance={{ .Release.Name }}
kubectl get services -n {{ .Release.Namespace }} -l app.kubernetes.io/instance={{ .Release.Name }}
==============================================================================
🌐 ACCESS YOUR APPLICATION:
==============================================================================
{{- if .Values.ingress.enabled }}
{{- range $host := .Values.ingress.hosts }}
{{ if $.Values.ingress.tls }}https{{ else }}http{{ end }}://{{ $host.host }}
{{- end }}
{{- else if .Values.frontend.enabled }}
To access the frontend, run:
kubectl port-forward -n {{ .Release.Namespace }} svc/{{ include "datacenter-docs.frontend.fullname" . }} 8080:{{ .Values.frontend.service.port }}
Then visit: http://localhost:8080
{{- end }}
{{- if .Values.api.enabled }}
To access the API directly, run:
kubectl port-forward -n {{ .Release.Namespace }} svc/{{ include "datacenter-docs.api.fullname" . }} 8000:{{ .Values.api.service.port }}
Then visit: http://localhost:8000/api/docs (OpenAPI documentation)
{{- end }}
==============================================================================
📊 VIEW LOGS:
==============================================================================
API logs:
kubectl logs -n {{ .Release.Namespace }} -l app.kubernetes.io/component=api -f
{{- if .Values.worker.enabled }}
Worker logs:
kubectl logs -n {{ .Release.Namespace }} -l app.kubernetes.io/component=worker -f
{{- end }}
{{- if .Values.chat.enabled }}
Chat logs:
kubectl logs -n {{ .Release.Namespace }} -l app.kubernetes.io/component=chat -f
{{- end }}
==============================================================================
🔐 SECURITY NOTICE:
==============================================================================
{{ if eq .Values.secrets.llmApiKey "sk-your-openai-api-key-here" }}
⚠️ WARNING: You are using the default LLM API key!
Update this immediately in production:
helm upgrade {{ .Release.Name }} datacenter-docs \
--set secrets.llmApiKey="your-actual-api-key" \
--reuse-values
{{ end }}
{{ if eq .Values.secrets.apiSecretKey "your-secret-key-here-change-in-production" }}
⚠️ WARNING: You are using the default API secret key!
Update this immediately in production:
helm upgrade {{ .Release.Name }} datacenter-docs \
--set secrets.apiSecretKey="your-actual-secret-key" \
--reuse-values
{{ end }}
For production deployments:
- Use strong, unique secrets
- Enable TLS/SSL for all services
- Review security context and RBAC policies
- Consider using external secret management (e.g., HashiCorp Vault)
==============================================================================
📖 USEFUL COMMANDS:
==============================================================================
Upgrade release:
helm upgrade {{ .Release.Name }} datacenter-docs --values custom-values.yaml
Get values:
helm get values {{ .Release.Name }}
View all resources:
helm get manifest {{ .Release.Name }}
Uninstall:
helm uninstall {{ .Release.Name }}
==============================================================================
🛠️ CONFIGURATION:
==============================================================================
{{- if .Values.config.autoRemediation.enabled }}
✓ Auto-remediation: ENABLED
- Minimum reliability score: {{ .Values.config.autoRemediation.minReliabilityScore }}%
- Approval threshold: {{ .Values.config.autoRemediation.requireApprovalThreshold }}%
{{- if .Values.config.autoRemediation.dryRun }}
- Mode: DRY RUN (no actual changes will be made)
{{- else }}
- Mode: ACTIVE (changes will be applied)
{{- end }}
{{- else }}
⚠️ Auto-remediation: DISABLED
{{- end }}
LLM Provider: {{ .Values.config.llm.baseUrl }}
Model: {{ .Values.config.llm.model }}
==============================================================================
📚 DOCUMENTATION & SUPPORT:
==============================================================================
For more information, visit:
https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine
Report issues:
https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine/issues
==============================================================================
Happy automating! 🚀


@@ -0,0 +1,235 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "datacenter-docs.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
*/}}
{{- define "datacenter-docs.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "datacenter-docs.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "datacenter-docs.labels" -}}
helm.sh/chart: {{ include "datacenter-docs.chart" . }}
{{ include "datacenter-docs.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "datacenter-docs.selectorLabels" -}}
app.kubernetes.io/name: {{ include "datacenter-docs.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
Create the name of the service account to use
*/}}
{{- define "datacenter-docs.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "datacenter-docs.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
{{/*
MongoDB fullname
*/}}
{{- define "datacenter-docs.mongodb.fullname" -}}
{{- printf "%s-mongodb" (include "datacenter-docs.fullname" .) | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Redis fullname
*/}}
{{- define "datacenter-docs.redis.fullname" -}}
{{- printf "%s-redis" (include "datacenter-docs.fullname" .) | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
API fullname
*/}}
{{- define "datacenter-docs.api.fullname" -}}
{{- printf "%s-api" (include "datacenter-docs.fullname" .) | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Chat fullname
*/}}
{{- define "datacenter-docs.chat.fullname" -}}
{{- printf "%s-chat" (include "datacenter-docs.fullname" .) | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Worker fullname
*/}}
{{- define "datacenter-docs.worker.fullname" -}}
{{- printf "%s-worker" (include "datacenter-docs.fullname" .) | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Frontend fullname
*/}}
{{- define "datacenter-docs.frontend.fullname" -}}
{{- printf "%s-frontend" (include "datacenter-docs.fullname" .) | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Component labels for MongoDB
*/}}
{{- define "datacenter-docs.mongodb.labels" -}}
{{ include "datacenter-docs.labels" . }}
app.kubernetes.io/component: database
{{- end }}
{{/*
Component labels for Redis
*/}}
{{- define "datacenter-docs.redis.labels" -}}
{{ include "datacenter-docs.labels" . }}
app.kubernetes.io/component: cache
{{- end }}
{{/*
Component labels for API
*/}}
{{- define "datacenter-docs.api.labels" -}}
{{ include "datacenter-docs.labels" . }}
app.kubernetes.io/component: api
{{- end }}
{{/*
Component labels for Chat
*/}}
{{- define "datacenter-docs.chat.labels" -}}
{{ include "datacenter-docs.labels" . }}
app.kubernetes.io/component: chat
{{- end }}
{{/*
Component labels for Worker
*/}}
{{- define "datacenter-docs.worker.labels" -}}
{{ include "datacenter-docs.labels" . }}
app.kubernetes.io/component: worker
{{- end }}
{{/*
Component labels for Frontend
*/}}
{{- define "datacenter-docs.frontend.labels" -}}
{{ include "datacenter-docs.labels" . }}
app.kubernetes.io/component: frontend
{{- end }}
{{/*
Selector labels for MongoDB
*/}}
{{- define "datacenter-docs.mongodb.selectorLabels" -}}
{{ include "datacenter-docs.selectorLabels" . }}
app.kubernetes.io/component: database
{{- end }}
{{/*
Selector labels for Redis
*/}}
{{- define "datacenter-docs.redis.selectorLabels" -}}
{{ include "datacenter-docs.selectorLabels" . }}
app.kubernetes.io/component: cache
{{- end }}
{{/*
Selector labels for API
*/}}
{{- define "datacenter-docs.api.selectorLabels" -}}
{{ include "datacenter-docs.selectorLabels" . }}
app.kubernetes.io/component: api
{{- end }}
{{/*
Selector labels for Chat
*/}}
{{- define "datacenter-docs.chat.selectorLabels" -}}
{{ include "datacenter-docs.selectorLabels" . }}
app.kubernetes.io/component: chat
{{- end }}
{{/*
Selector labels for Worker
*/}}
{{- define "datacenter-docs.worker.selectorLabels" -}}
{{ include "datacenter-docs.selectorLabels" . }}
app.kubernetes.io/component: worker
{{- end }}
{{/*
Selector labels for Frontend
*/}}
{{- define "datacenter-docs.frontend.selectorLabels" -}}
{{ include "datacenter-docs.selectorLabels" . }}
app.kubernetes.io/component: frontend
{{- end }}
{{/*
Return the proper image name
*/}}
{{- define "datacenter-docs.image" -}}
{{- $registryName := .registry -}}
{{- $repositoryName := .repository -}}
{{- $tag := .tag | toString -}}
{{- if $registryName }}
{{- printf "%s/%s:%s" $registryName $repositoryName $tag -}}
{{- else }}
{{- printf "%s:%s" $repositoryName $tag -}}
{{- end }}
{{- end }}
{{/*
Return the proper Docker Image Registry Secret Names
*/}}
{{- define "datacenter-docs.imagePullSecrets" -}}
{{- if .Values.global.imagePullSecrets }}
imagePullSecrets:
{{- range .Values.global.imagePullSecrets }}
- name: {{ . }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Return the appropriate apiVersion for HPA
*/}}
{{- define "datacenter-docs.hpa.apiVersion" -}}
{{- if semverCompare ">=1.23-0" .Capabilities.KubeVersion.GitVersion -}}
{{- print "autoscaling/v2" -}}
{{- else -}}
{{- print "autoscaling/v2beta2" -}}
{{- end -}}
{{- end -}}


@@ -0,0 +1,120 @@
{{- if .Values.api.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "datacenter-docs.api.fullname" . }}
  labels:
    {{- include "datacenter-docs.api.labels" . | nindent 4 }}
spec:
  {{- if not .Values.api.autoscaling.enabled }}
  replicas: {{ .Values.api.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "datacenter-docs.api.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "datacenter-docs.api.selectorLabels" . | nindent 8 }}
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
        checksum/secret: {{ include (print $.Template.BasePath "/secrets.yaml") . | sha256sum }}
        {{- with .Values.podAnnotations }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
    spec:
      {{- with .Values.global.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "datacenter-docs.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      initContainers:
        - name: wait-for-mongodb
          image: busybox:1.36
          command:
            - sh
            - -c
            - |
              until nc -z {{ include "datacenter-docs.mongodb.fullname" . }} {{ .Values.mongodb.service.port }}; do
                echo "Waiting for MongoDB..."
                sleep 2
              done
        - name: wait-for-redis
          image: busybox:1.36
          command:
            - sh
            - -c
            - |
              until nc -z {{ include "datacenter-docs.redis.fullname" . }} {{ .Values.redis.service.port }}; do
                echo "Waiting for Redis..."
                sleep 2
              done
      containers:
        - name: api
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.api.image.repository }}:{{ .Values.api.image.tag }}"
          imagePullPolicy: {{ .Values.api.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.api.service.targetPort }}
              protocol: TCP
          env:
            - name: MONGODB_URL
              valueFrom:
                configMapKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-config
                  key: mongodb-url
            - name: REDIS_URL
              valueFrom:
                configMapKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-config
                  key: redis-url
            - name: LLM_BASE_URL
              valueFrom:
                configMapKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-config
                  key: llm-base-url
            - name: LLM_MODEL
              valueFrom:
                configMapKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-config
                  key: llm-model
            - name: LLM_API_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-secrets
                  key: llm-api-key
            - name: API_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-secrets
                  key: api-secret-key
            - name: LOG_LEVEL
              valueFrom:
                configMapKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-config
                  key: log-level
            - name: PYTHONPATH
              value: "/app/src"
          livenessProbe:
            {{- toYaml .Values.api.livenessProbe | nindent 12 }}
          readinessProbe:
            {{- toYaml .Values.api.readinessProbe | nindent 12 }}
          resources:
            {{- toYaml .Values.api.resources | nindent 12 }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
{{- end }}


@@ -0,0 +1,32 @@
{{- if and .Values.api.enabled .Values.api.autoscaling.enabled }}
apiVersion: {{ include "datacenter-docs.hpa.apiVersion" . }}
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "datacenter-docs.api.fullname" . }}
  labels:
    {{- include "datacenter-docs.api.labels" . | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "datacenter-docs.api.fullname" . }}
  minReplicas: {{ .Values.api.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.api.autoscaling.maxReplicas }}
  metrics:
    {{- if .Values.api.autoscaling.targetCPUUtilizationPercentage }}
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: {{ .Values.api.autoscaling.targetCPUUtilizationPercentage }}
    {{- end }}
    {{- if .Values.api.autoscaling.targetMemoryUtilizationPercentage }}
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: {{ .Values.api.autoscaling.targetMemoryUtilizationPercentage }}
    {{- end }}
{{- end }}


@@ -0,0 +1,17 @@
{{- if .Values.api.enabled }}
apiVersion: v1
kind: Service
metadata:
  name: {{ include "datacenter-docs.api.fullname" . }}
  labels:
    {{- include "datacenter-docs.api.labels" . | nindent 4 }}
spec:
  type: {{ .Values.api.service.type }}
  ports:
    - port: {{ .Values.api.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "datacenter-docs.api.selectorLabels" . | nindent 4 }}
{{- end }}


@@ -0,0 +1,94 @@
{{- if .Values.chat.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "datacenter-docs.chat.fullname" . }}
  labels:
    {{- include "datacenter-docs.chat.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.chat.replicaCount }}
  selector:
    matchLabels:
      {{- include "datacenter-docs.chat.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "datacenter-docs.chat.selectorLabels" . | nindent 8 }}
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
        checksum/secret: {{ include (print $.Template.BasePath "/secrets.yaml") . | sha256sum }}
        {{- with .Values.podAnnotations }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
    spec:
      {{- with .Values.global.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "datacenter-docs.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      initContainers:
        - name: wait-for-mongodb
          image: busybox:1.36
          command:
            - sh
            - -c
            - |
              until nc -z {{ include "datacenter-docs.mongodb.fullname" . }} {{ .Values.mongodb.service.port }}; do
                echo "Waiting for MongoDB..."
                sleep 2
              done
      containers:
        - name: chat
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.chat.image.repository }}:{{ .Values.chat.image.tag }}"
          imagePullPolicy: {{ .Values.chat.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.chat.service.targetPort }}
              protocol: TCP
          env:
            - name: MONGODB_URL
              valueFrom:
                configMapKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-config
                  key: mongodb-url
            - name: LLM_BASE_URL
              valueFrom:
                configMapKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-config
                  key: llm-base-url
            - name: LLM_MODEL
              valueFrom:
                configMapKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-config
                  key: llm-model
            - name: LLM_API_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-secrets
                  key: llm-api-key
            - name: LOG_LEVEL
              valueFrom:
                configMapKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-config
                  key: log-level
            - name: PYTHONPATH
              value: "/app/src"
          resources:
            {{- toYaml .Values.chat.resources | nindent 12 }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
{{- end }}


@@ -0,0 +1,17 @@
{{- if .Values.chat.enabled }}
apiVersion: v1
kind: Service
metadata:
  name: {{ include "datacenter-docs.chat.fullname" . }}
  labels:
    {{- include "datacenter-docs.chat.labels" . | nindent 4 }}
spec:
  type: {{ .Values.chat.service.type }}
  ports:
    - port: {{ .Values.chat.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "datacenter-docs.chat.selectorLabels" . | nindent 4 }}
{{- end }}


@@ -0,0 +1,37 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "datacenter-docs.fullname" . }}-config
  labels:
    {{- include "datacenter-docs.labels" . | nindent 4 }}
data:
  # MongoDB connection
  mongodb-url: {{ tpl .Values.config.mongodbUrl . | quote }}

  # Redis connection
  redis-url: {{ tpl .Values.config.redisUrl . | quote }}

  # LLM configuration
  llm-base-url: {{ .Values.config.llm.baseUrl | quote }}
  llm-model: {{ .Values.config.llm.model | quote }}
  llm-max-tokens: {{ .Values.config.llm.maxTokens | quote }}
  llm-temperature: {{ .Values.config.llm.temperature | quote }}

  # MCP configuration
  mcp-base-url: {{ .Values.config.mcp.baseUrl | quote }}
  mcp-timeout: {{ .Values.config.mcp.timeout | quote }}

  # Auto-remediation configuration
  auto-remediation-enabled: {{ .Values.config.autoRemediation.enabled | quote }}
  auto-remediation-min-reliability: {{ .Values.config.autoRemediation.minReliabilityScore | quote }}
  auto-remediation-approval-threshold: {{ .Values.config.autoRemediation.requireApprovalThreshold | quote }}
  auto-remediation-max-actions-per-hour: {{ .Values.config.autoRemediation.maxActionsPerHour | quote }}
  auto-remediation-dry-run: {{ .Values.config.autoRemediation.dryRun | quote }}

  # Security configuration
  api-key-enabled: {{ .Values.config.apiKeyEnabled | quote }}
  cors-origins: {{ join "," .Values.config.corsOrigins | quote }}

  # Logging configuration
  log-level: {{ .Values.config.logLevel | quote }}
  log-format: {{ .Values.config.logFormat | quote }}


@@ -0,0 +1,69 @@
{{- if .Values.frontend.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "datacenter-docs.frontend.fullname" . }}
  labels:
    {{- include "datacenter-docs.frontend.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.frontend.replicaCount }}
  selector:
    matchLabels:
      {{- include "datacenter-docs.frontend.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "datacenter-docs.frontend.selectorLabels" . | nindent 8 }}
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
    spec:
      {{- with .Values.global.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "datacenter-docs.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
        - name: frontend
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.frontend.image.repository }}:{{ .Values.frontend.image.tag }}"
          imagePullPolicy: {{ .Values.frontend.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.frontend.service.targetPort }}
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /
              port: http
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /
              port: http
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          resources:
            {{- toYaml .Values.frontend.resources | nindent 12 }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
{{- end }}


@@ -0,0 +1,17 @@
{{- if .Values.frontend.enabled }}
apiVersion: v1
kind: Service
metadata:
  name: {{ include "datacenter-docs.frontend.fullname" . }}
  labels:
    {{- include "datacenter-docs.frontend.labels" . | nindent 4 }}
spec:
  type: {{ .Values.frontend.service.type }}
  ports:
    - port: {{ .Values.frontend.service.port }}
      targetPort: http
      protocol: TCP
      name: http
  selector:
    {{- include "datacenter-docs.frontend.selectorLabels" . | nindent 4 }}
{{- end }}


@@ -0,0 +1,57 @@
{{- if .Values.ingress.enabled -}}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "datacenter-docs.fullname" . }}
  labels:
    {{- include "datacenter-docs.labels" . | nindent 4 }}
  {{- with .Values.ingress.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
  {{- if .Values.ingress.className }}
  ingressClassName: {{ .Values.ingress.className }}
  {{- end }}
  {{- if .Values.ingress.tls }}
  tls:
    {{- range .Values.ingress.tls }}
    - hosts:
        {{- range .hosts }}
        - {{ . | quote }}
        {{- end }}
      secretName: {{ .secretName }}
    {{- end }}
  {{- end }}
  rules:
    {{- range .Values.ingress.hosts }}
    - host: {{ .host | quote }}
      http:
        paths:
          {{- range .paths }}
          - path: {{ .path }}
            pathType: {{ .pathType }}
            backend:
              service:
                {{- if eq .service "frontend" }}
                name: {{ include "datacenter-docs.frontend.fullname" $ }}
                {{- else if eq .service "api" }}
                name: {{ include "datacenter-docs.api.fullname" $ }}
                {{- else if eq .service "chat" }}
                name: {{ include "datacenter-docs.chat.fullname" $ }}
                {{- else }}
                name: {{ .service }}
                {{- end }}
                port:
                  {{- if eq .service "frontend" }}
                  number: {{ $.Values.frontend.service.port }}
                  {{- else if eq .service "api" }}
                  number: {{ $.Values.api.service.port }}
                  {{- else if eq .service "chat" }}
                  number: {{ $.Values.chat.service.port }}
                  {{- else }}
                  number: 80
                  {{- end }}
          {{- end }}
    {{- end }}
{{- end }}


@@ -0,0 +1,17 @@
{{- if .Values.mongodb.enabled }}
apiVersion: v1
kind: Service
metadata:
  name: {{ include "datacenter-docs.mongodb.fullname" . }}
  labels:
    {{- include "datacenter-docs.mongodb.labels" . | nindent 4 }}
spec:
  type: {{ .Values.mongodb.service.type }}
  ports:
    - port: {{ .Values.mongodb.service.port }}
      targetPort: mongodb
      protocol: TCP
      name: mongodb
  selector:
    {{- include "datacenter-docs.mongodb.selectorLabels" . | nindent 4 }}
{{- end }}


@@ -0,0 +1,113 @@
{{- if .Values.mongodb.enabled }}
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ include "datacenter-docs.mongodb.fullname" . }}
  labels:
    {{- include "datacenter-docs.mongodb.labels" . | nindent 4 }}
spec:
  serviceName: {{ include "datacenter-docs.mongodb.fullname" . }}
  replicas: 1
  selector:
    matchLabels:
      {{- include "datacenter-docs.mongodb.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "datacenter-docs.mongodb.selectorLabels" . | nindent 8 }}
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
    spec:
      {{- with .Values.global.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "datacenter-docs.serviceAccountName" . }}
      securityContext:
        fsGroup: 999
        runAsUser: 999
      containers:
        - name: mongodb
          image: "{{ .Values.mongodb.image.repository }}:{{ .Values.mongodb.image.tag }}"
          imagePullPolicy: {{ .Values.mongodb.image.pullPolicy }}
          ports:
            - name: mongodb
              containerPort: 27017
              protocol: TCP
          env:
            - name: MONGO_INITDB_ROOT_USERNAME
              valueFrom:
                secretKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-secrets
                  key: mongodb-username
            - name: MONGO_INITDB_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: {{ include "datacenter-docs.fullname" . }}-secrets
                  key: mongodb-password
            - name: MONGO_INITDB_DATABASE
              value: {{ .Values.mongodb.auth.database | quote }}
          livenessProbe:
            exec:
              command:
                - mongosh
                - --eval
                - "db.adminCommand('ping')"
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            exec:
              command:
                - mongosh
                - --eval
                - "db.adminCommand('ping')"
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          resources:
            {{- toYaml .Values.mongodb.resources | nindent 12 }}
          volumeMounts:
            - name: data
              mountPath: /data/db
      {{- if not .Values.mongodb.persistence.enabled }}
      # Without persistence, back the data volume with an emptyDir in the pod
      # spec (`volumes` is not a valid field at the StatefulSet spec level)
      volumes:
        - name: data
          emptyDir: {}
      {{- end }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
  {{- if .Values.mongodb.persistence.enabled }}
  volumeClaimTemplates:
    - metadata:
        name: data
        labels:
          {{- include "datacenter-docs.mongodb.labels" . | nindent 10 }}
      spec:
        accessModes:
          - ReadWriteOnce
        {{- if .Values.mongodb.persistence.storageClass }}
        {{- if (eq "-" .Values.mongodb.persistence.storageClass) }}
        storageClassName: ""
        {{- else }}
        storageClassName: {{ .Values.mongodb.persistence.storageClass | quote }}
        {{- end }}
        {{- end }}
        resources:
          requests:
            storage: {{ .Values.mongodb.persistence.size | quote }}
  {{- end }}
{{- end }}


@@ -0,0 +1,70 @@
{{- if .Values.redis.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "datacenter-docs.redis.fullname" . }}
labels:
{{- include "datacenter-docs.redis.labels" . | nindent 4 }}
spec:
replicas: 1
selector:
matchLabels:
{{- include "datacenter-docs.redis.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "datacenter-docs.redis.selectorLabels" . | nindent 8 }}
{{- with .Values.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
spec:
{{- with .Values.global.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "datacenter-docs.serviceAccountName" . }}
securityContext:
fsGroup: 999
runAsUser: 999
containers:
- name: redis
image: "{{ .Values.redis.image.repository }}:{{ .Values.redis.image.tag }}"
imagePullPolicy: {{ .Values.redis.image.pullPolicy }}
ports:
- name: redis
containerPort: 6379
protocol: TCP
livenessProbe:
exec:
command:
- redis-cli
- ping
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
exec:
command:
- redis-cli
- ping
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
resources:
{{- toYaml .Values.redis.resources | nindent 12 }}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}


@@ -0,0 +1,17 @@
{{- if .Values.redis.enabled }}
apiVersion: v1
kind: Service
metadata:
name: {{ include "datacenter-docs.redis.fullname" . }}
labels:
{{- include "datacenter-docs.redis.labels" . | nindent 4 }}
spec:
type: {{ .Values.redis.service.type }}
ports:
- port: {{ .Values.redis.service.port }}
targetPort: redis
protocol: TCP
name: redis
selector:
{{- include "datacenter-docs.redis.selectorLabels" . | nindent 4 }}
{{- end }}


@@ -0,0 +1,17 @@
apiVersion: v1
kind: Secret
metadata:
name: {{ include "datacenter-docs.fullname" . }}-secrets
labels:
{{- include "datacenter-docs.labels" . | nindent 4 }}
type: Opaque
stringData:
# LLM API Key
llm-api-key: {{ .Values.secrets.llmApiKey | quote }}
# API Secret Key
api-secret-key: {{ .Values.secrets.apiSecretKey | quote }}
# MongoDB credentials
mongodb-username: {{ .Values.secrets.mongodbUsername | quote }}
mongodb-password: {{ .Values.secrets.mongodbPassword | quote }}


@@ -0,0 +1,13 @@
{{- if .Values.serviceAccount.create -}}
apiVersion: v1
kind: ServiceAccount
metadata:
name: {{ include "datacenter-docs.serviceAccountName" . }}
labels:
{{- include "datacenter-docs.labels" . | nindent 4 }}
{{- with .Values.serviceAccount.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
automountServiceAccountToken: true
{{- end }}


@@ -0,0 +1,107 @@
{{- if .Values.worker.enabled }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "datacenter-docs.worker.fullname" . }}
labels:
{{- include "datacenter-docs.worker.labels" . | nindent 4 }}
spec:
{{- if not .Values.worker.autoscaling.enabled }}
replicas: {{ .Values.worker.replicaCount }}
{{- end }}
selector:
matchLabels:
{{- include "datacenter-docs.worker.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "datacenter-docs.worker.selectorLabels" . | nindent 8 }}
annotations:
checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
checksum/secret: {{ include (print $.Template.BasePath "/secrets.yaml") . | sha256sum }}
{{- with .Values.podAnnotations }}
{{- toYaml . | nindent 8 }}
{{- end }}
spec:
{{- with .Values.global.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
serviceAccountName: {{ include "datacenter-docs.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
initContainers:
- name: wait-for-mongodb
image: busybox:1.36
command:
- sh
- -c
- |
until nc -z {{ include "datacenter-docs.mongodb.fullname" . }} {{ .Values.mongodb.service.port }}; do
echo "Waiting for MongoDB..."
sleep 2
done
- name: wait-for-redis
image: busybox:1.36
command:
- sh
- -c
- |
until nc -z {{ include "datacenter-docs.redis.fullname" . }} {{ .Values.redis.service.port }}; do
echo "Waiting for Redis..."
sleep 2
done
containers:
- name: worker
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: "{{ .Values.worker.image.repository }}:{{ .Values.worker.image.tag }}"
imagePullPolicy: {{ .Values.worker.image.pullPolicy }}
env:
- name: MONGODB_URL
valueFrom:
configMapKeyRef:
name: {{ include "datacenter-docs.fullname" . }}-config
key: mongodb-url
- name: REDIS_URL
valueFrom:
configMapKeyRef:
name: {{ include "datacenter-docs.fullname" . }}-config
key: redis-url
- name: LLM_BASE_URL
valueFrom:
configMapKeyRef:
name: {{ include "datacenter-docs.fullname" . }}-config
key: llm-base-url
- name: LLM_MODEL
valueFrom:
configMapKeyRef:
name: {{ include "datacenter-docs.fullname" . }}-config
key: llm-model
- name: LLM_API_KEY
valueFrom:
secretKeyRef:
name: {{ include "datacenter-docs.fullname" . }}-secrets
key: llm-api-key
- name: LOG_LEVEL
valueFrom:
configMapKeyRef:
name: {{ include "datacenter-docs.fullname" . }}-config
key: log-level
- name: PYTHONPATH
value: "/app/src"
resources:
{{- toYaml .Values.worker.resources | nindent 12 }}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}


@@ -0,0 +1,24 @@
{{- if and .Values.worker.enabled .Values.worker.autoscaling.enabled }}
apiVersion: {{ include "datacenter-docs.hpa.apiVersion" . }}
kind: HorizontalPodAutoscaler
metadata:
name: {{ include "datacenter-docs.worker.fullname" . }}
labels:
{{- include "datacenter-docs.worker.labels" . | nindent 4 }}
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: {{ include "datacenter-docs.worker.fullname" . }}
minReplicas: {{ .Values.worker.autoscaling.minReplicas }}
maxReplicas: {{ .Values.worker.autoscaling.maxReplicas }}
metrics:
{{- if .Values.worker.autoscaling.targetCPUUtilizationPercentage }}
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: {{ .Values.worker.autoscaling.targetCPUUtilizationPercentage }}
{{- end }}
{{- end }}


@@ -0,0 +1,181 @@
# Development values for datacenter-docs
# This configuration is optimized for local development and testing
# Use with: helm install dev ./datacenter-docs -f values-development.yaml
global:
imagePullPolicy: IfNotPresent
storageClass: ""
# MongoDB - minimal resources for development
mongodb:
enabled: true
image:
repository: mongo
tag: "7"
pullPolicy: IfNotPresent
auth:
rootUsername: admin
rootPassword: admin123
database: datacenter_docs
persistence:
enabled: false # Use emptyDir for faster testing
size: 1Gi
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
# Redis - minimal resources
redis:
enabled: true
resources:
requests:
memory: "64Mi"
cpu: "50m"
limits:
memory: "256Mi"
cpu: "200m"
# API service - single replica for development
api:
enabled: true
replicaCount: 1
image:
repository: datacenter-docs-api
tag: "latest"
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 8000
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "500m"
autoscaling:
enabled: false # Disable for development
# Chat service - disabled by default (not implemented)
chat:
enabled: false
# Worker service - disabled by default (not implemented)
worker:
enabled: false
# Frontend - single replica
frontend:
enabled: true
replicaCount: 1
image:
repository: datacenter-docs-frontend
tag: "latest"
pullPolicy: IfNotPresent
resources:
requests:
memory: "64Mi"
cpu: "50m"
limits:
memory: "128Mi"
cpu: "100m"
# Ingress - disabled for development (use port-forward)
ingress:
enabled: false
# Application configuration for development
config:
mongodbUrl: "mongodb://admin:admin123@{{ include \"datacenter-docs.mongodb.fullname\" . }}:27017/datacenter_docs?authSource=admin"
redisUrl: "redis://{{ include \"datacenter-docs.redis.fullname\" . }}:6379/0"
llm:
# Use local LLM for development (no API costs)
baseUrl: "http://localhost:11434/v1" # Ollama
model: "llama2"
# Or use OpenAI with a test key
# baseUrl: "https://api.openai.com/v1"
# model: "gpt-3.5-turbo"
maxTokens: 2048
temperature: 0.7
mcp:
baseUrl: "http://mcp-server:8080"
timeout: 30
# Auto-remediation in dry-run mode for safety
autoRemediation:
enabled: true
minReliabilityScore: 85.0
requireApprovalThreshold: 90.0
maxActionsPerHour: 100
dryRun: true # ALWAYS dry-run in development
apiKeyEnabled: false # Disable for easier testing
corsOrigins:
- "http://localhost:3000"
- "http://localhost:8080"
- "http://localhost:8000"
logLevel: "DEBUG" # Verbose logging for development
logFormat: "text" # Human-readable logs
# Secrets - safe defaults for development only
secrets:
llmApiKey: "not-needed-for-local-llm"
apiSecretKey: "dev-secret-key-not-for-production"
mongodbUsername: "admin"
mongodbPassword: "admin123"
# ServiceAccount
serviceAccount:
create: true
annotations: {}
name: ""
# Relaxed security for development
podSecurityContext:
fsGroup: 1000
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
# No node selectors or tolerations
nodeSelector: {}
tolerations: []
affinity: {}
# No priority class
priorityClassName: ""
# Development tips:
#
# 1. Port-forward to access services:
# kubectl port-forward svc/dev-datacenter-docs-api 8000:8000
# kubectl port-forward svc/dev-datacenter-docs-frontend 8080:80
#
# 2. View logs:
# kubectl logs -l app.kubernetes.io/component=api -f
#
# 3. Access MongoDB directly:
# kubectl port-forward svc/dev-datacenter-docs-mongodb 27017:27017
# mongosh mongodb://admin:admin123@localhost:27017
#
# 4. Quick iteration:
# # Make code changes
# docker build -t datacenter-docs-api:latest -f deploy/docker/Dockerfile.api .
# kubectl rollout restart deployment/dev-datacenter-docs-api
#
# 5. Clean slate:
# helm uninstall dev
# kubectl delete pvc --all
# helm install dev ./datacenter-docs -f values-development.yaml


@@ -0,0 +1,304 @@
# Production values for datacenter-docs
# This is an example configuration for production deployment
# Copy this file and customize it for your environment
global:
imagePullPolicy: Always
storageClass: "standard" # Use your storage class
# MongoDB configuration for production
mongodb:
enabled: true
auth:
rootUsername: admin
rootPassword: "CHANGE-THIS-IN-PRODUCTION" # Use strong password
database: datacenter_docs
persistence:
enabled: true
size: 50Gi # Adjust based on expected data volume
storageClass: "fast-ssd" # Use SSD storage class for better performance
resources:
requests:
memory: "2Gi"
cpu: "1000m"
limits:
memory: "4Gi"
cpu: "2000m"
# Redis configuration for production
redis:
enabled: true
resources:
requests:
memory: "256Mi"
cpu: "200m"
limits:
memory: "1Gi"
cpu: "1000m"
# API service - production scale
api:
enabled: true
replicaCount: 5
image:
repository: your-registry.io/datacenter-docs-api
tag: "v1.0.0" # Use specific version, not latest
pullPolicy: Always
service:
type: ClusterIP
port: 8000
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "4Gi"
cpu: "2000m"
autoscaling:
enabled: true
minReplicas: 5
maxReplicas: 20
targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 80
# Chat service - enable in production
chat:
enabled: true
replicaCount: 3
image:
repository: your-registry.io/datacenter-docs-chat
tag: "v1.0.0"
pullPolicy: Always
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "2Gi"
cpu: "1000m"
# Worker service - enable in production
worker:
enabled: true
replicaCount: 5
image:
repository: your-registry.io/datacenter-docs-worker
tag: "v1.0.0"
pullPolicy: Always
resources:
requests:
memory: "1Gi"
cpu: "500m"
limits:
memory: "4Gi"
cpu: "2000m"
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 20
targetCPUUtilizationPercentage: 75
# Frontend - production scale
frontend:
enabled: true
replicaCount: 3
image:
repository: your-registry.io/datacenter-docs-frontend
tag: "v1.0.0"
pullPolicy: Always
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
# Ingress - production configuration
ingress:
enabled: true
className: "nginx"
annotations:
cert-manager.io/cluster-issuer: "letsencrypt-prod"
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "50m"
nginx.ingress.kubernetes.io/rate-limit: "100"
nginx.ingress.kubernetes.io/limit-rps: "50"
hosts:
- host: datacenter-docs.yourdomain.com
paths:
- path: /
pathType: Prefix
service: frontend
- path: /api
pathType: Prefix
service: api
- path: /ws
pathType: Prefix
service: chat
tls:
- secretName: datacenter-docs-tls
hosts:
- datacenter-docs.yourdomain.com
# Application configuration for production
config:
# MongoDB connection (if using external MongoDB, change this)
mongodbUrl: "mongodb://admin:CHANGE-THIS-IN-PRODUCTION@{{ include \"datacenter-docs.mongodb.fullname\" . }}:27017/datacenter_docs?authSource=admin"
# Redis connection
redisUrl: "redis://{{ include \"datacenter-docs.redis.fullname\" . }}:6379/0"
# LLM Provider configuration
llm:
# For OpenAI
baseUrl: "https://api.openai.com/v1"
model: "gpt-4-turbo-preview"
# For Anthropic Claude (alternative)
# baseUrl: "https://api.anthropic.com/v1"
# model: "claude-3-opus-20240229"
# For Azure OpenAI (alternative)
# baseUrl: "https://your-resource.openai.azure.com"
# model: "gpt-4"
maxTokens: 4096
temperature: 0.7
# MCP configuration
mcp:
baseUrl: "http://mcp-server:8080"
timeout: 30
# Auto-remediation configuration
autoRemediation:
enabled: true
minReliabilityScore: 90.0 # Higher threshold for production
requireApprovalThreshold: 95.0
maxActionsPerHour: 50 # Conservative limit
dryRun: false # Set to true for initial deployment
# Security
apiKeyEnabled: true
corsOrigins:
- "https://datacenter-docs.yourdomain.com"
- "https://admin.yourdomain.com"
# Logging
logLevel: "INFO" # Use "DEBUG" for troubleshooting
logFormat: "json"
# Secrets - MUST BE CHANGED IN PRODUCTION
secrets:
# LLM API Key
llmApiKey: "CHANGE-THIS-TO-YOUR-ACTUAL-API-KEY"
# API authentication secret key
apiSecretKey: "CHANGE-THIS-TO-A-STRONG-RANDOM-KEY"
# MongoDB credentials
mongodbUsername: "admin"
mongodbPassword: "CHANGE-THIS-IN-PRODUCTION"
# ServiceAccount
serviceAccount:
create: true
annotations:
# Add cloud provider annotations if needed
# eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT-ID:role/IAM-ROLE-NAME
name: ""
# Pod security context
podSecurityContext:
fsGroup: 1000
runAsNonRoot: true
runAsUser: 1000
seccompProfile:
type: RuntimeDefault
# Container security context
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false
runAsNonRoot: true
runAsUser: 1000
# Node selector - place workloads on specific nodes
nodeSelector:
workload-type: "application"
# kubernetes.io/arch: amd64
# Tolerations - allow scheduling on tainted nodes
tolerations:
- key: "workload-type"
operator: "Equal"
value: "application"
effect: "NoSchedule"
# Affinity rules - spread pods across zones and nodes
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app.kubernetes.io/name
operator: In
values:
- datacenter-docs
topologyKey: kubernetes.io/hostname
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app.kubernetes.io/component
operator: In
values:
- api
topologyKey: topology.kubernetes.io/zone
# Priority class - ensure critical pods are scheduled first
priorityClassName: "high-priority"
# Additional production recommendations:
#
# 1. Use external secret management:
# - HashiCorp Vault
# - AWS Secrets Manager
# - Azure Key Vault
# - Google Secret Manager
#
# 2. Enable monitoring:
# - Prometheus metrics
# - Grafana dashboards
# - AlertManager alerts
#
# 3. Enable logging:
# - ELK Stack
# - Loki
# - CloudWatch
#
# 4. Enable tracing:
# - Jaeger
# - OpenTelemetry
#
# 5. Backup strategy:
# - MongoDB backups (Velero, native tools)
# - Disaster recovery plan
#
# 6. Network policies:
# - Restrict pod-to-pod communication
# - Isolate database access
#
# 7. Pod disruption budgets:
# - Ensure high availability during updates
#
# 8. Regular security scans:
# - Container image scanning
# - Dependency vulnerability scanning


@@ -0,0 +1,265 @@
# Default values for datacenter-docs
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
global:
imagePullPolicy: IfNotPresent
storageClass: ""
# MongoDB configuration
mongodb:
enabled: true
image:
repository: mongo
tag: "7"
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 27017
auth:
enabled: true
rootUsername: admin
rootPassword: admin123
database: datacenter_docs
persistence:
enabled: true
size: 10Gi
storageClass: ""
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "2Gi"
cpu: "1000m"
# Redis configuration
redis:
enabled: true
image:
repository: redis
tag: "7-alpine"
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 6379
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
# API service configuration
api:
enabled: true
replicaCount: 2
image:
repository: datacenter-docs-api
tag: "latest"
pullPolicy: Always
service:
type: ClusterIP
port: 8000
targetPort: 8000
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "2Gi"
cpu: "1000m"
autoscaling:
enabled: true
minReplicas: 2
maxReplicas: 10
targetCPUUtilizationPercentage: 80
targetMemoryUtilizationPercentage: 80
livenessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 10
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
# Chat service configuration
chat:
enabled: false # Not yet implemented
replicaCount: 1
image:
repository: datacenter-docs-chat
tag: "latest"
pullPolicy: Always
service:
type: ClusterIP
port: 8001
targetPort: 8001
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "500m"
# Worker service configuration
worker:
enabled: false # Not yet implemented
replicaCount: 3
image:
repository: datacenter-docs-worker
tag: "latest"
pullPolicy: Always
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "2Gi"
cpu: "1000m"
autoscaling:
enabled: true
minReplicas: 1
maxReplicas: 10
targetCPUUtilizationPercentage: 80
# Frontend service configuration
frontend:
enabled: true
replicaCount: 2
image:
repository: datacenter-docs-frontend
tag: "latest"
pullPolicy: Always
service:
type: ClusterIP
port: 80
targetPort: 80
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "200m"
# Ingress configuration
ingress:
enabled: true
className: "nginx"
annotations:
cert-manager.io/cluster-issuer: "letsencrypt-prod"
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "50m"
hosts:
- host: datacenter-docs.example.com
paths:
- path: /
pathType: Prefix
service: frontend
- path: /api
pathType: Prefix
service: api
- path: /ws
pathType: Prefix
service: chat
tls:
- secretName: datacenter-docs-tls
hosts:
- datacenter-docs.example.com
# Application configuration
config:
# MongoDB connection
mongodbUrl: "mongodb://admin:admin123@{{ include \"datacenter-docs.mongodb.fullname\" . }}:27017/datacenter_docs?authSource=admin"
# Redis connection
redisUrl: "redis://{{ include \"datacenter-docs.redis.fullname\" . }}:6379/0"
# LLM Provider configuration
llm:
baseUrl: "https://api.openai.com/v1"
model: "gpt-4-turbo-preview"
maxTokens: 4096
temperature: 0.7
# MCP configuration
mcp:
baseUrl: "http://mcp-server:8080"
timeout: 30
# Auto-remediation configuration
autoRemediation:
enabled: true
minReliabilityScore: 85.0
requireApprovalThreshold: 90.0
maxActionsPerHour: 100
dryRun: false
# Security
apiKeyEnabled: true
corsOrigins:
- "http://localhost:3000"
- "https://datacenter-docs.example.com"
# Logging
logLevel: "INFO"
logFormat: "json"
# Secrets (should be overridden in production)
secrets:
# LLM API Key
llmApiKey: "sk-your-openai-api-key-here"
# API authentication
apiSecretKey: "your-secret-key-here-change-in-production"
# MongoDB credentials (override mongodb.auth if using external DB)
mongodbUsername: "admin"
mongodbPassword: "admin123"
# ServiceAccount configuration
serviceAccount:
create: true
annotations: {}
name: ""
# Pod annotations
podAnnotations: {}
# Pod security context
podSecurityContext:
fsGroup: 1000
runAsNonRoot: true
runAsUser: 1000
# Container security context
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: false
# Node selector
nodeSelector: {}
# Tolerations
tolerations: []
# Affinity rules
affinity: {}
# Priority class
priorityClassName: ""

deploy/helm/test-chart.sh Executable file

@@ -0,0 +1,143 @@
#!/bin/bash
# Test script for Helm chart validation
# Usage: ./test-chart.sh
set -e
CHART_DIR="datacenter-docs"
RELEASE_NAME="test-datacenter-docs"
echo "=========================================="
echo "Helm Chart Testing Script"
echo "=========================================="
echo ""
# Check if helm is installed
if ! command -v helm &> /dev/null; then
echo "ERROR: helm is not installed. Please install Helm first."
exit 1
fi
echo "✓ Helm version: $(helm version --short)"
echo ""
# Lint the chart
echo "=========================================="
echo "Step 1: Linting Chart"
echo "=========================================="
helm lint ${CHART_DIR}
echo "✓ Lint passed"
echo ""
# Template rendering with default values
echo "=========================================="
echo "Step 2: Template Rendering (default values)"
echo "=========================================="
helm template ${RELEASE_NAME} ${CHART_DIR} > /tmp/rendered-default.yaml
echo "✓ Template rendering successful"
echo " Output: /tmp/rendered-default.yaml"
echo ""
# Template rendering with development values
echo "=========================================="
echo "Step 3: Template Rendering (development values)"
echo "=========================================="
helm template ${RELEASE_NAME} ${CHART_DIR} -f ${CHART_DIR}/values-development.yaml > /tmp/rendered-dev.yaml
echo "✓ Template rendering successful"
echo " Output: /tmp/rendered-dev.yaml"
echo ""
# Template rendering with production values
echo "=========================================="
echo "Step 4: Template Rendering (production values)"
echo "=========================================="
helm template ${RELEASE_NAME} ${CHART_DIR} -f ${CHART_DIR}/values-production.yaml > /tmp/rendered-prod.yaml
echo "✓ Template rendering successful"
echo " Output: /tmp/rendered-prod.yaml"
echo ""
# Dry run installation
echo "=========================================="
echo "Step 5: Dry Run Installation"
echo "=========================================="
helm install ${RELEASE_NAME} ${CHART_DIR} --dry-run --debug > /tmp/dry-run.log 2>&1
echo "✓ Dry run successful"
echo " Output: /tmp/dry-run.log"
echo ""
# Test with disabled components
echo "=========================================="
echo "Step 6: Template with Disabled Components"
echo "=========================================="
helm template ${RELEASE_NAME} ${CHART_DIR} \
--set mongodb.enabled=false \
--set redis.enabled=false \
--set api.enabled=false \
--set frontend.enabled=false \
> /tmp/rendered-minimal.yaml
echo "✓ Minimal template rendering successful"
echo " Output: /tmp/rendered-minimal.yaml"
echo ""
# Test with all components enabled
echo "=========================================="
echo "Step 7: Template with All Components"
echo "=========================================="
helm template ${RELEASE_NAME} ${CHART_DIR} \
--set chat.enabled=true \
--set worker.enabled=true \
> /tmp/rendered-full.yaml
echo "✓ Full template rendering successful"
echo " Output: /tmp/rendered-full.yaml"
echo ""
# Validate Kubernetes manifests (if kubectl is available)
if command -v kubectl &> /dev/null; then
echo "=========================================="
echo "Step 8: Kubernetes Manifest Validation"
echo "=========================================="
if kubectl cluster-info &> /dev/null; then
kubectl apply --dry-run=client -f /tmp/rendered-default.yaml > /dev/null 2>&1
echo "✓ Kubernetes manifest validation passed"
else
echo "⚠ kubectl not connected to cluster, skipping validation"
fi
echo ""
else
echo "⚠ kubectl not found, skipping Kubernetes validation"
echo ""
fi
# Package the chart
echo "=========================================="
echo "Step 9: Packaging Chart"
echo "=========================================="
helm package ${CHART_DIR} -d /tmp/
echo "✓ Chart packaged successfully"
echo " Output: /tmp/datacenter-docs-*.tgz"
echo ""
# Summary
echo "=========================================="
echo "All Tests Passed! ✓"
echo "=========================================="
echo ""
echo "Generated files:"
echo " - /tmp/rendered-default.yaml (default values)"
echo " - /tmp/rendered-dev.yaml (development values)"
echo " - /tmp/rendered-prod.yaml (production values)"
echo " - /tmp/rendered-minimal.yaml (minimal components)"
echo " - /tmp/rendered-full.yaml (all components)"
echo " - /tmp/dry-run.log (dry run output)"
echo " - /tmp/datacenter-docs-*.tgz (packaged chart)"
echo ""
echo "To install the chart locally:"
echo " helm install my-release ${CHART_DIR}"
echo ""
echo "To install with development values:"
echo " helm install dev ${CHART_DIR} -f ${CHART_DIR}/values-development.yaml"
echo ""
echo "To install with production values (customize first!):"
echo " helm install prod ${CHART_DIR} -f ${CHART_DIR}/values-production.yaml"
echo ""


@@ -148,6 +148,118 @@ After running the validation script, you'll find:
---
## 🔄 convert_config.py
**Configuration Format Converter**
### Description
Converts between `.env` and `values.yaml` configuration formats, making it easy to switch between Docker Compose and Helm deployments.
### Usage
#### Prerequisites
```bash
pip install pyyaml
```
#### Convert .env to values.yaml
```bash
./scripts/convert_config.py env-to-yaml .env values.yaml
```
#### Convert values.yaml to .env
```bash
./scripts/convert_config.py yaml-to-env values.yaml .env
```
### Examples
**Example 1: Create values.yaml from existing .env**
```bash
# You have an existing .env file from Docker development
./scripts/convert_config.py env-to-yaml .env my-values.yaml
# Use the generated values.yaml with Helm
helm install my-release deploy/helm/datacenter-docs -f my-values.yaml
```
**Example 2: Generate .env from values.yaml**
```bash
# You have a values.yaml from Kubernetes deployment
./scripts/convert_config.py yaml-to-env values.yaml .env
# Use the generated .env with Docker Compose
cd deploy/docker
docker-compose -f docker-compose.dev.yml up -d
```
**Example 3: Environment migration**
```bash
# Convert development .env to staging values.yaml
./scripts/convert_config.py env-to-yaml .env.development values-staging.yaml
# Manually adjust staging-specific settings
nano values-staging.yaml
# Deploy to staging Kubernetes cluster
helm install staging deploy/helm/datacenter-docs -f values-staging.yaml
```
### Supported Configuration
The script converts:
- **MongoDB**: Connection settings and authentication
- **Redis**: Connection and authentication
- **MCP Server**: URL and API key
- **Proxmox**: Host, authentication, SSL settings
- **LLM**: Provider settings (OpenAI, Anthropic, Ollama, etc.)
- **API**: Server configuration and workers
- **CORS**: Allowed origins
- **Application**: Logging and debug settings
- **Celery**: Broker and result backend
- **Vector Store**: ChromaDB and embedding model
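For illustration, the MongoDB and LLM entries map as follows (nesting taken from the script's `env_to_values` function; the other sections follow the same pattern):

```yaml
# .env input:
#   MONGO_ROOT_USER=admin
#   MONGO_ROOT_PASSWORD=changeme
#   LLM_BASE_URL=https://api.openai.com/v1
#   LLM_MODEL=gpt-4-turbo-preview
# resulting values.yaml fragment:
mongodb:
  auth:
    rootUsername: admin       # from MONGO_ROOT_USER
    rootPassword: changeme    # from MONGO_ROOT_PASSWORD
llm:
  baseUrl: "https://api.openai.com/v1"   # from LLM_BASE_URL
  model: "gpt-4-turbo-preview"           # from LLM_MODEL
```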
### Output
```
Reading .env file: .env
Converting to values.yaml format...
Writing values.yaml: my-values.yaml
✓ Conversion completed successfully!
Output written to: my-values.yaml
```
### Limitations
- Converts common configuration options only
- Complex nested structures may require manual adjustment
- Helm-specific values (resource limits, replicas) not included in .env conversion
- Always review and test converted configuration
### Tips
1. **Review output**: Always check converted files for accuracy
2. **Test first**: Validate in development before production
3. **Keep secrets secure**: Use proper secret management tools
4. **Version control**: Track configuration changes
### See Also
- [CONFIGURATION.md](../CONFIGURATION.md) - Complete configuration guide
- [.env.example](../.env.example) - Environment variable template
- [values.yaml](../values.yaml) - YAML configuration template
---
## 🚀 Quick Start
```bash

scripts/convert_config.py Executable file

@@ -0,0 +1,298 @@
#!/usr/bin/env python3
"""
Configuration Converter
Converts between .env and values.yaml formats
"""
import os
import sys
import argparse
from pathlib import Path
from typing import Dict, Any
import yaml
def parse_env_file(env_file: Path) -> Dict[str, str]:
"""Parse .env file and return dictionary of variables."""
env_vars = {}
with open(env_file, 'r') as f:
for line in f:
line = line.strip()
# Skip comments and empty lines
if not line or line.startswith('#'):
continue
# Parse KEY=VALUE
if '=' in line:
key, value = line.split('=', 1)
env_vars[key.strip()] = value.strip()
return env_vars
def env_to_values(env_vars: Dict[str, str]) -> Dict[str, Any]:
"""Convert environment variables to values.yaml structure."""
values = {
'mongodb': {
'auth': {
'rootUsername': env_vars.get('MONGO_ROOT_USER', 'admin'),
'rootPassword': env_vars.get('MONGO_ROOT_PASSWORD', 'changeme'),
'database': env_vars.get('MONGODB_DATABASE', 'datacenter_docs'),
},
'url': env_vars.get('MONGODB_URL', 'mongodb://admin:changeme@mongodb:27017'),
},
'redis': {
'auth': {
'password': env_vars.get('REDIS_PASSWORD', 'changeme'),
},
'url': env_vars.get('REDIS_URL', 'redis://redis:6379/0'),
},
'mcp': {
'server': {
'url': env_vars.get('MCP_SERVER_URL', 'https://mcp.company.local'),
'apiKey': env_vars.get('MCP_API_KEY', 'your_mcp_api_key_here'),
},
},
'proxmox': {
'host': env_vars.get('PROXMOX_HOST', 'proxmox.example.com'),
'port': int(env_vars.get('PROXMOX_PORT', '8006')),
'auth': {
'user': env_vars.get('PROXMOX_USER', 'root@pam'),
'password': env_vars.get('PROXMOX_PASSWORD', 'your-password-here'),
},
'ssl': {
'verify': env_vars.get('PROXMOX_VERIFY_SSL', 'false').lower() == 'true',
},
'timeout': int(env_vars.get('PROXMOX_TIMEOUT', '30')),
},
'llm': {
'baseUrl': env_vars.get('LLM_BASE_URL', 'https://api.openai.com/v1'),
'apiKey': env_vars.get('LLM_API_KEY', 'sk-your-openai-api-key-here'),
'model': env_vars.get('LLM_MODEL', 'gpt-4-turbo-preview'),
'generation': {
'temperature': float(env_vars.get('LLM_TEMPERATURE', '0.3')),
'maxTokens': int(env_vars.get('LLM_MAX_TOKENS', '4096')),
},
},
'api': {
'host': env_vars.get('API_HOST', '0.0.0.0'),
'port': int(env_vars.get('API_PORT', '8000')),
'workers': int(env_vars.get('WORKERS', '4')),
},
'cors': {
'origins': env_vars.get('CORS_ORIGINS', 'http://localhost:3000').split(','),
},
'application': {
'logging': {
'level': env_vars.get('LOG_LEVEL', 'INFO'),
},
'debug': env_vars.get('DEBUG', 'false').lower() == 'true',
},
'celery': {
'broker': {
'url': env_vars.get('CELERY_BROKER_URL', 'redis://redis:6379/0'),
},
'result': {
'backend': env_vars.get('CELERY_RESULT_BACKEND', 'redis://redis:6379/0'),
},
},
'vectorStore': {
'chroma': {
'path': env_vars.get('VECTOR_STORE_PATH', './data/chroma_db'),
},
'embedding': {
'model': env_vars.get('EMBEDDING_MODEL', 'sentence-transformers/all-MiniLM-L6-v2'),
},
},
}
return values

def values_to_env(values: Dict[str, Any]) -> Dict[str, str]:
"""Convert values.yaml structure to environment variables."""
env_vars = {}
# MongoDB
if 'mongodb' in values:
mongo = values['mongodb']
if 'auth' in mongo:
env_vars['MONGO_ROOT_USER'] = mongo['auth'].get('rootUsername', 'admin')
env_vars['MONGO_ROOT_PASSWORD'] = mongo['auth'].get('rootPassword', 'changeme')
env_vars['MONGODB_DATABASE'] = mongo['auth'].get('database', 'datacenter_docs')
env_vars['MONGODB_URL'] = mongo.get('url', 'mongodb://admin:changeme@mongodb:27017')
# Redis
if 'redis' in values:
redis = values['redis']
if 'auth' in redis:
env_vars['REDIS_PASSWORD'] = redis['auth'].get('password', 'changeme')
env_vars['REDIS_URL'] = redis.get('url', 'redis://redis:6379/0')
# MCP
if 'mcp' in values and 'server' in values['mcp']:
mcp = values['mcp']['server']
env_vars['MCP_SERVER_URL'] = mcp.get('url', 'https://mcp.company.local')
env_vars['MCP_API_KEY'] = mcp.get('apiKey', 'your_mcp_api_key_here')
# Proxmox
if 'proxmox' in values:
px = values['proxmox']
env_vars['PROXMOX_HOST'] = px.get('host', 'proxmox.example.com')
env_vars['PROXMOX_PORT'] = str(px.get('port', 8006))
if 'auth' in px:
env_vars['PROXMOX_USER'] = px['auth'].get('user', 'root@pam')
env_vars['PROXMOX_PASSWORD'] = px['auth'].get('password', 'your-password-here')
if 'ssl' in px:
env_vars['PROXMOX_VERIFY_SSL'] = str(px['ssl'].get('verify', False)).lower()
env_vars['PROXMOX_TIMEOUT'] = str(px.get('timeout', 30))
# LLM
if 'llm' in values:
llm = values['llm']
env_vars['LLM_BASE_URL'] = llm.get('baseUrl', 'https://api.openai.com/v1')
env_vars['LLM_API_KEY'] = llm.get('apiKey', 'sk-your-openai-api-key-here')
env_vars['LLM_MODEL'] = llm.get('model', 'gpt-4-turbo-preview')
if 'generation' in llm:
env_vars['LLM_TEMPERATURE'] = str(llm['generation'].get('temperature', 0.3))
env_vars['LLM_MAX_TOKENS'] = str(llm['generation'].get('maxTokens', 4096))
# API
if 'api' in values:
api = values['api']
env_vars['API_HOST'] = api.get('host', '0.0.0.0')
env_vars['API_PORT'] = str(api.get('port', 8000))
env_vars['WORKERS'] = str(api.get('workers', 4))
# CORS
if 'cors' in values:
origins = values['cors'].get('origins', ['http://localhost:3000'])
env_vars['CORS_ORIGINS'] = ','.join(origins)
# Application
if 'application' in values:
app = values['application']
if 'logging' in app:
env_vars['LOG_LEVEL'] = app['logging'].get('level', 'INFO')
env_vars['DEBUG'] = str(app.get('debug', False)).lower()
# Celery
if 'celery' in values:
celery = values['celery']
if 'broker' in celery:
env_vars['CELERY_BROKER_URL'] = celery['broker'].get('url', 'redis://redis:6379/0')
if 'result' in celery:
env_vars['CELERY_RESULT_BACKEND'] = celery['result'].get('backend', 'redis://redis:6379/0')
# Vector Store
if 'vectorStore' in values:
vs = values['vectorStore']
if 'chroma' in vs:
env_vars['VECTOR_STORE_PATH'] = vs['chroma'].get('path', './data/chroma_db')
if 'embedding' in vs:
env_vars['EMBEDDING_MODEL'] = vs['embedding'].get('model', 'sentence-transformers/all-MiniLM-L6-v2')
return env_vars

def write_env_file(env_vars: Dict[str, str], output_file: Path):
"""Write environment variables to .env file."""
with open(output_file, 'w') as f:
f.write("# =============================================================================\n")
f.write("# Datacenter Documentation System - Configuration\n")
f.write("# Generated from values.yaml\n")
f.write("# =============================================================================\n\n")
# Group by section
sections = {
'MongoDB': ['MONGO_ROOT_USER', 'MONGO_ROOT_PASSWORD', 'MONGODB_URL', 'MONGODB_DATABASE'],
'Redis': ['REDIS_PASSWORD', 'REDIS_URL'],
'MCP': ['MCP_SERVER_URL', 'MCP_API_KEY'],
'Proxmox': ['PROXMOX_HOST', 'PROXMOX_PORT', 'PROXMOX_USER', 'PROXMOX_PASSWORD',
'PROXMOX_VERIFY_SSL', 'PROXMOX_TIMEOUT'],
'LLM': ['LLM_BASE_URL', 'LLM_API_KEY', 'LLM_MODEL', 'LLM_TEMPERATURE', 'LLM_MAX_TOKENS'],
'API': ['API_HOST', 'API_PORT', 'WORKERS'],
'CORS': ['CORS_ORIGINS'],
'Application': ['LOG_LEVEL', 'DEBUG'],
'Celery': ['CELERY_BROKER_URL', 'CELERY_RESULT_BACKEND'],
'Vector Store': ['VECTOR_STORE_PATH', 'EMBEDDING_MODEL'],
}
for section, keys in sections.items():
f.write(f"# {section}\n")
for key in keys:
if key in env_vars:
f.write(f"{key}={env_vars[key]}\n")
f.write("\n")

def main():
parser = argparse.ArgumentParser(
description='Convert between .env and values.yaml configuration formats'
)
parser.add_argument(
'mode',
choices=['env-to-yaml', 'yaml-to-env'],
help='Conversion mode'
)
parser.add_argument(
'input',
type=Path,
help='Input file path'
)
parser.add_argument(
'output',
type=Path,
help='Output file path'
)
args = parser.parse_args()
# Check input file exists
if not args.input.exists():
print(f"Error: Input file not found: {args.input}", file=sys.stderr)
sys.exit(1)
try:
if args.mode == 'env-to-yaml':
# Convert .env to values.yaml
print(f"Reading .env file: {args.input}")
env_vars = parse_env_file(args.input)
print("Converting to values.yaml format...")
values = env_to_values(env_vars)
print(f"Writing values.yaml: {args.output}")
with open(args.output, 'w') as f:
yaml.dump(values, f, default_flow_style=False, sort_keys=False, indent=2)
print("✓ Conversion completed successfully!")
else: # yaml-to-env
# Convert values.yaml to .env
print(f"Reading values.yaml file: {args.input}")
with open(args.input, 'r') as f:
values = yaml.safe_load(f)
print("Converting to .env format...")
env_vars = values_to_env(values)
print(f"Writing .env file: {args.output}")
write_env_file(env_vars, args.output)
print("✓ Conversion completed successfully!")
print(f"\nOutput written to: {args.output}")
except Exception as e:
print(f"Error during conversion: {e}", file=sys.stderr)
sys.exit(1)

if __name__ == '__main__':
main()
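`main()` calls a `parse_env_file` helper that is defined earlier in the script and not shown in this diff hunk. For reference, a minimal stdlib-only sketch of the usual `.env` parsing rules (skip blanks and comments, split on the first `=`, strip surrounding quotes) — the actual helper in the repository may differ:

```python
def parse_env_line(line: str):
    """Parse one .env line; return (key, value) or None for blanks/comments."""
    line = line.strip()
    if not line or line.startswith('#') or '=' not in line:
        return None
    key, _, value = line.partition('=')
    return key.strip(), value.strip().strip('"').strip("'")

def parse_env_text(text: str) -> dict:
    """File-level parsing is just a loop over lines, keeping valid pairs."""
    pairs = (parse_env_line(line) for line in text.splitlines())
    return {k: v for k, v in (p for p in pairs if p)}

print(parse_env_text('# demo\nAPI_PORT=8000\nNAME="quoted"\n'))
# → {'API_PORT': '8000', 'NAME': 'quoted'}
```

Note that everything stays a string at this layer; type coercion (`int(...)`, `float(...)`, the `'true'` comparison) happens later in `env_to_values`.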

values.yaml (new file, 513 lines)
# =============================================================================
# Datacenter Documentation System - Configuration Values
# This file provides a structured YAML configuration based on .env variables
# Can be used with Helm or directly for configuration management
# =============================================================================
# =============================================================================
# MongoDB Configuration
# =============================================================================
mongodb:
# Authentication
auth:
enabled: true
rootUsername: admin
rootPassword: admin123
database: datacenter_docs
# Connection URL (auto-generated in Helm, can be overridden)
url: "mongodb://admin:admin123@mongodb:27017"
# Service configuration
service:
host: mongodb
port: 27017
# Persistence (for Kubernetes deployments)
persistence:
enabled: true
size: 10Gi
storageClass: "longhorn"
# =============================================================================
# Redis Configuration
# =============================================================================
redis:
# Authentication
auth:
enabled: false
password: admin
# Connection URL
url: "redis://redis:6379/0"
# Service configuration
service:
host: redis
port: 6379
# Database number
database: 0
# =============================================================================
# MCP Server Configuration
# =============================================================================
mcp:
# MCP server connection
server:
url: "https://mcp.company.local"
apiKey: "7DKfHC8i79iPp43tFKNyiHEXQRSec4dH"
timeout: 30
# Enable MCP integration
enabled: true
# =============================================================================
# Proxmox VE Configuration
# =============================================================================
proxmox:
# Proxmox server
host: "proxmox.apps.home.arpa.viti"
port: 443
# Authentication Method 1: Username + Password (less secure)
auth:
user: "monitoring@pve"
name: "docs-llm-token"
password: "4d97d058-cc96-4189-936d-fe6a6583fcbd"
# Authentication Method 2: API Token (RECOMMENDED)
# To create: Datacenter → Permissions → API Tokens
# Format: user@realm!tokenname
# token:
# user: "automation@pam"
# name: "docs-collector"
# value: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
# SSL Configuration
ssl:
verify: false # Set to true in production with valid certificates
# Connection settings
timeout: 30
# Enable Proxmox collector
enabled: true
# =============================================================================
# LLM Configuration (OpenAI-compatible API)
# =============================================================================
llm:
# Provider selection - uncomment the one you want to use
# --- OpenAI (Default) ---
provider: openai
baseUrl: "https://llm-studio.apps.home.arpa.viti/v1"
apiKey: ""
model: "llama-3.2-3b-instruct"
# Alternative models: gpt-4, gpt-3.5-turbo, gpt-4o
# --- Anthropic Claude ---
# provider: anthropic
# baseUrl: "https://api.anthropic.com/v1"
# apiKey: "sk-ant-your-anthropic-key-here"
# model: "claude-sonnet-4-20250514"
# Alternative models: claude-3-opus-20240229, claude-3-sonnet-20240229
# --- LLMStudio (Local) ---
# provider: llmstudio
# baseUrl: "http://localhost:1234/v1"
# apiKey: "not-needed"
# model: "your-local-model-name"
# --- Open-WebUI (Local) ---
# provider: openwebui
# baseUrl: "http://localhost:8080/v1"
# apiKey: "your-open-webui-key"
# model: "llama3"
# Alternative models: mistral, mixtral, codellama
# --- Ollama (Local) ---
# provider: ollama
# baseUrl: "http://localhost:11434/v1"
# apiKey: "ollama"
# model: "llama3"
# Alternative models: mistral, mixtral, codellama, phi3
# Generation Settings
generation:
temperature: 0.3
maxTokens: 4096
topP: 1.0
frequencyPenalty: 0.0
presencePenalty: 0.0
# =============================================================================
# API Configuration
# =============================================================================
api:
# Server settings
host: "0.0.0.0"
port: 8000
workers: 4
# Service configuration (for Kubernetes)
service:
type: ClusterIP
port: 8000
targetPort: 8000
# Application settings
debug: false
reloadOnChange: false
# Security
secretKey: "your-secret-key-change-in-production"
apiKeyEnabled: true
# =============================================================================
# CORS Configuration
# =============================================================================
cors:
enabled: true
origins:
- "http://localhost:3000"
- "https://docs.company.local"
allowCredentials: true
allowMethods:
- "GET"
- "POST"
- "PUT"
- "DELETE"
- "PATCH"
- "OPTIONS"
allowHeaders:
- "*"
# =============================================================================
# Application Settings
# =============================================================================
application:
# Logging
logging:
level: "INFO" # DEBUG, INFO, WARNING, ERROR, CRITICAL
format: "json" # json or text
# Debug mode
debug: false
# Environment
environment: "production" # development, staging, production
# =============================================================================
# Auto-Remediation Configuration
# =============================================================================
autoRemediation:
# Enable/disable auto-remediation
enabled: true
# Reliability thresholds
minReliabilityScore: 85.0
requireApprovalThreshold: 90.0
# Rate limiting
maxActionsPerHour: 100
maxActionsPerDay: 500
# Safety settings
dryRun: false # Set to true for testing
requireHumanApproval: false
# Notification settings
notifications:
enabled: true
channels:
- email
- slack
# =============================================================================
# Celery Configuration (Background Tasks)
# =============================================================================
celery:
# Broker configuration
broker:
url: "redis://redis:6379/0"
transport: "redis"
# Result backend
result:
backend: "redis://redis:6379/0"
expires: 3600
# Worker configuration
worker:
concurrency: 4
maxTasksPerChild: 1000
prefetchMultiplier: 4
# Task configuration
task:
acks_late: true
reject_on_worker_lost: true
time_limit: 3600
soft_time_limit: 3000
# Queue configuration
queues:
default:
name: "default"
priority: 5
high_priority:
name: "high_priority"
priority: 10
low_priority:
name: "low_priority"
priority: 1
# =============================================================================
# Vector Store Configuration
# =============================================================================
vectorStore:
# Storage type
type: "chroma" # chroma, pinecone, weaviate
# ChromaDB configuration
chroma:
path: "./data/chroma_db"
persistDirectory: "/data/vector_store"
# Embedding configuration
embedding:
model: "sentence-transformers/all-MiniLM-L6-v2"
dimensions: 384
# Alternative models:
# - "sentence-transformers/all-mpnet-base-v2" (768 dims, better quality)
# - "BAAI/bge-small-en-v1.5" (384 dims, good performance)
# - "thenlper/gte-small" (384 dims, multilingual)
# Search configuration
search:
topK: 5
scoreThreshold: 0.7
# =============================================================================
# Documentation Generation Settings
# =============================================================================
documentation:
# Generation settings
generation:
enabled: true
autoUpdate: true
updateInterval: 3600 # seconds
# Output configuration
output:
format: "markdown" # markdown, html, pdf
directory: "./docs/generated"
templateDirectory: "./templates/docs"
# Content settings
content:
includeTimestamps: true
includeMetadata: true
includeDiagrams: true
includeExamples: true
# =============================================================================
# Ticket Management Settings
# =============================================================================
tickets:
# Auto-categorization
autoCategorization:
enabled: true
confidenceThreshold: 0.8
# Priority assignment
autoPriority:
enabled: true
# SLA settings
sla:
critical: 1 # hours
high: 4
medium: 24
low: 72
# Notification settings
notifications:
enabled: true
onCreation: true
onStatusChange: true
onResolution: true
# =============================================================================
# Collectors Configuration
# =============================================================================
collectors:
# VMware vCenter
vmware:
enabled: false
host: "vcenter.example.com"
username: "administrator@vsphere.local"
password: "your-password"
verifySsl: false
collectInterval: 3600
# Kubernetes
kubernetes:
enabled: false
configPath: "~/.kube/config"
context: "default"
collectInterval: 1800
# Network devices
network:
enabled: false
devices: []
# - host: "switch1.example.com"
# type: "cisco"
# username: "admin"
# password: "password"
collectInterval: 7200
# Storage
storage:
enabled: false
systems: []
collectInterval: 3600
# =============================================================================
# Monitoring & Observability
# =============================================================================
monitoring:
# Metrics
metrics:
enabled: true
port: 9090
path: "/metrics"
# Health checks
health:
enabled: true
path: "/health"
interval: 30
# Tracing
tracing:
enabled: false
provider: "jaeger" # jaeger, zipkin, otlp
endpoint: "http://jaeger:14268/api/traces"
# Logging exporters
logging:
exporters:
- type: "stdout"
# - type: "elasticsearch"
# endpoint: "http://elasticsearch:9200"
# - type: "loki"
# endpoint: "http://loki:3100"
# =============================================================================
# Security Settings
# =============================================================================
security:
# Authentication
authentication:
enabled: true
method: "jwt" # jwt, oauth2, ldap
tokenExpiration: 3600
# Authorization
authorization:
enabled: true
rbacEnabled: true
# Encryption
encryption:
enabled: true
algorithm: "AES-256-GCM"
# Rate limiting
rateLimit:
enabled: true
requestsPerMinute: 100
requestsPerHour: 1000
# =============================================================================
# Backup & Recovery
# =============================================================================
backup:
# Enable backup
enabled: true
# Backup schedule (cron format)
schedule: "0 2 * * *" # Daily at 2 AM
# Retention policy
retention:
daily: 7
weekly: 4
monthly: 12
# Backup destination
destination:
type: "s3" # s3, gcs, azure, local
# s3:
# bucket: "datacenter-docs-backups"
# region: "us-east-1"
# accessKeyId: "your-access-key"
# secretAccessKey: "your-secret-key"
# =============================================================================
# Feature Flags
# =============================================================================
features:
# Enable/disable specific features
autoRemediation: true
aiDocGeneration: true
vectorSearch: true
chatInterface: true
ticketManagement: true
multiTenancy: false
auditLogging: true
realTimeUpdates: true
# =============================================================================
# Resource Limits (for Kubernetes deployments)
# =============================================================================
resources:
# API service
api:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "2Gi"
cpu: "1000m"
# Worker service
worker:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "2Gi"
cpu: "1000m"
# Chat service
chat:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "500m"
# =============================================================================
# Notes
# =============================================================================
# - Copy this file to customize your deployment
# - For Helm deployments, use: helm install -f values.yaml
# - For environment variables, use the .env file
# - Sensitive values should be stored in Kubernetes Secrets or external secret managers
# - See documentation at: https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine
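
The notes above say `values.yaml` and `.env` carry the same settings, with the converter script keeping them in sync. The two mappers are meant to be inverse on the keys they share, which can be checked with a trimmed-down, self-contained pair (only the `api` section; the names mirror the real functions, but this is an illustration, not the full converter):

```python
from typing import Any, Dict

def env_to_values(env: Dict[str, str]) -> Dict[str, Any]:
    # Trimmed to the `api` section for illustration.
    return {'api': {'host': env.get('API_HOST', '0.0.0.0'),
                    'port': int(env.get('API_PORT', '8000'))}}

def values_to_env(values: Dict[str, Any]) -> Dict[str, str]:
    api = values.get('api', {})
    return {'API_HOST': api.get('host', '0.0.0.0'),
            'API_PORT': str(api.get('port', 8000))}

env = {'API_HOST': '127.0.0.1', 'API_PORT': '9000'}
assert values_to_env(env_to_values(env)) == env  # round-trip is lossless here
```

The round trip is only lossless for keys both mappers know about: YAML-only keys (e.g. `resources`, `features`) are dropped on `yaml-to-env`, which is why the generated `.env` is a subset of this file.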