llm-automation-docs-and-rem…/deploy/helm/datacenter-docs/README.md
dnviti 2719cfff59
Add Helm chart, Docs, and Config conversion script
2025-10-22 14:35:21 +02:00


Datacenter Docs & Remediation Engine - Helm Chart

Helm chart for deploying the LLM Automation - Docs & Remediation Engine on Kubernetes.

Overview

This chart deploys a complete stack including:

  • MongoDB: Document database for storing tickets, documentation, and metadata
  • Redis: Cache and task queue backend
  • API Service: FastAPI REST API with auto-remediation capabilities
  • Chat Service: WebSocket server for real-time documentation queries (optional, not yet implemented)
  • Worker Service: Celery workers for background tasks (optional, not yet implemented)
  • Frontend: React-based web interface

Prerequisites

  • Kubernetes 1.19+
  • Helm 3.0+
  • PersistentVolume provisioner support in the underlying infrastructure (for MongoDB persistence)
  • Ingress controller (optional, for external access)

Installation

Quick Start

# Add the chart repository (if published)
helm repo add datacenter-docs https://your-repo-url
helm repo update

# Install with default values
helm install my-datacenter-docs datacenter-docs/datacenter-docs

# Or install from local directory
helm install my-datacenter-docs ./datacenter-docs

Production Installation

For production, create a custom values.yaml:

# Copy and edit the values file
cp values.yaml my-values.yaml

# Edit my-values.yaml with your configuration
# At minimum, change:
# - secrets.llmApiKey
# - secrets.apiSecretKey
# - ingress.hosts

# Install with custom values
helm install my-datacenter-docs ./datacenter-docs -f my-values.yaml

Install with Specific Configuration

helm install my-datacenter-docs ./datacenter-docs \
  --set secrets.llmApiKey="sk-your-openai-api-key" \
  --set secrets.apiSecretKey="your-strong-secret-key" \
  --set "ingress.hosts[0].host=datacenter-docs.yourdomain.com" \
  --set mongodb.persistence.size="50Gi"
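
Passing secrets with --set leaves them in shell history. As a minimal sketch of an alternative, the snippet below generates a random key and writes it to a separate override file. The file name secrets-values.yaml is an assumption made for illustration; only the secrets.apiSecretKey key comes from this chart.

```shell
# Generate a 64-character hex key from the kernel CSPRNG (no extra tools needed).
API_SECRET_KEY="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"

# Write it to a private override file (readable by the current user only).
# secrets-values.yaml is a hypothetical file name, not something the chart requires.
(
  umask 077
  cat > secrets-values.yaml <<EOF
secrets:
  apiSecretKey: "${API_SECRET_KEY}"
EOF
)

# Then layer it on top of your other values (later -f files win), for example:
#   helm install my-datacenter-docs ./datacenter-docs -f my-values.yaml -f secrets-values.yaml
```

Keep the generated file out of version control.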

Configuration

Key Configuration Parameters

Global Settings

| Parameter | Description | Default |
|-----------|-------------|---------|
| global.imagePullPolicy | Image pull policy | IfNotPresent |
| global.storageClass | Storage class for PVCs | "" |

MongoDB

| Parameter | Description | Default |
|-----------|-------------|---------|
| mongodb.enabled | Enable MongoDB | true |
| mongodb.image.repository | MongoDB image | mongo |
| mongodb.image.tag | MongoDB version | 7 |
| mongodb.auth.rootUsername | Root username | admin |
| mongodb.auth.rootPassword | Root password | admin123 |
| mongodb.persistence.enabled | Enable persistence | true |
| mongodb.persistence.size | Volume size | 10Gi |
| mongodb.resources.requests.memory | Memory request | 512Mi |
| mongodb.resources.limits.memory | Memory limit | 2Gi |

Redis

| Parameter | Description | Default |
|-----------|-------------|---------|
| redis.enabled | Enable Redis | true |
| redis.image.repository | Redis image | redis |
| redis.image.tag | Redis version | 7-alpine |
| redis.resources.requests.memory | Memory request | 128Mi |
| redis.resources.limits.memory | Memory limit | 512Mi |

API Service

| Parameter | Description | Default |
|-----------|-------------|---------|
| api.enabled | Enable API service | true |
| api.replicaCount | Number of replicas | 2 |
| api.image.repository | API image repository | datacenter-docs-api |
| api.image.tag | API image tag | latest |
| api.service.port | Service port | 8000 |
| api.autoscaling.enabled | Enable HPA | true |
| api.autoscaling.minReplicas | Min replicas | 2 |
| api.autoscaling.maxReplicas | Max replicas | 10 |
| api.resources.requests.memory | Memory request | 512Mi |
| api.resources.limits.memory | Memory limit | 2Gi |

Worker Service

| Parameter | Description | Default |
|-----------|-------------|---------|
| worker.enabled | Enable worker service | false |
| worker.replicaCount | Number of replicas | 3 |
| worker.autoscaling.enabled | Enable HPA | true |
| worker.autoscaling.minReplicas | Min replicas | 1 |
| worker.autoscaling.maxReplicas | Max replicas | 10 |

Chat Service

| Parameter | Description | Default |
|-----------|-------------|---------|
| chat.enabled | Enable chat service | false |
| chat.replicaCount | Number of replicas | 1 |
| chat.service.port | Service port | 8001 |

Frontend

| Parameter | Description | Default |
|-----------|-------------|---------|
| frontend.enabled | Enable frontend | true |
| frontend.replicaCount | Number of replicas | 2 |
| frontend.service.port | Service port | 80 |

Ingress

| Parameter | Description | Default |
|-----------|-------------|---------|
| ingress.enabled | Enable ingress | true |
| ingress.className | Ingress class | nginx |
| ingress.hosts[0].host | Hostname | datacenter-docs.example.com |
| ingress.tls[0].secretName | TLS secret name | datacenter-docs-tls |

Application Configuration

| Parameter | Description | Default |
|-----------|-------------|---------|
| config.llm.baseUrl | LLM provider URL | https://api.openai.com/v1 |
| config.llm.model | LLM model | gpt-4-turbo-preview |
| config.autoRemediation.enabled | Enable auto-remediation | true |
| config.autoRemediation.minReliabilityScore | Min reliability score | 85.0 |
| config.autoRemediation.dryRun | Dry run mode | false |
| config.logLevel | Log level | INFO |

Secrets

| Parameter | Description | Default |
|-----------|-------------|---------|
| secrets.llmApiKey | LLM API key | sk-your-openai-api-key-here |
| secrets.apiSecretKey | API secret key | your-secret-key-here-change-in-production |

IMPORTANT: Change these secrets in production!
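
A small pre-install check can catch placeholders that were copied into a values file unchanged. The sketch below is a hypothetical helper, not part of the chart; it searches for the three default strings documented in the tables above.

```shell
# Hypothetical helper: returns non-zero if the given values file still
# contains one of the placeholder defaults documented in this README.
check_placeholders() {
  values_file="$1"
  status=0
  for placeholder in \
    "sk-your-openai-api-key-here" \
    "your-secret-key-here-change-in-production" \
    "admin123"
  do
    # -F matches the placeholder as a fixed string, not as a regex.
    if grep -qF -- "$placeholder" "$values_file"; then
      echo "placeholder still present in $values_file: $placeholder" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   check_placeholders my-values.yaml && helm install my-datacenter-docs ./datacenter-docs -f my-values.yaml
```

The check only covers what the file contains. A key that is missing from the file still falls back to the chart default, so the three values also need to be set explicitly.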

Usage Examples

Enable All Services (including chat and worker)

helm install my-datacenter-docs ./datacenter-docs \
  --set chat.enabled=true \
  --set worker.enabled=true

Disable Auto-Remediation

helm install my-datacenter-docs ./datacenter-docs \
  --set config.autoRemediation.enabled=false

Use Different LLM Provider (e.g., Anthropic Claude)

helm install my-datacenter-docs ./datacenter-docs \
  --set config.llm.baseUrl="https://api.anthropic.com/v1" \
  --set config.llm.model="claude-3-opus-20240229" \
  --set secrets.llmApiKey="sk-ant-your-anthropic-key"

Use Local LLM (e.g., Ollama)

helm install my-datacenter-docs ./datacenter-docs \
  --set config.llm.baseUrl="http://ollama-service:11434/v1" \
  --set config.llm.model="llama2" \
  --set secrets.llmApiKey="not-needed"

Scale MongoDB Storage

helm install my-datacenter-docs ./datacenter-docs \
  --set mongodb.persistence.size="100Gi"

Disable Ingress (use port-forward instead)

helm install my-datacenter-docs ./datacenter-docs \
  --set ingress.enabled=false

Production Configuration with External MongoDB

# production-values.yaml
mongodb:
  enabled: false

config:
  mongodbUrl: "mongodb://user:pass@external-mongodb:27017/datacenter_docs?authSource=admin"

api:
  replicaCount: 5
  autoscaling:
    maxReplicas: 20

secrets:
  llmApiKey: "sk-your-production-api-key"
  apiSecretKey: "your-production-secret-key"

ingress:
  hosts:
    - host: "datacenter-docs.prod.yourdomain.com"
      paths:
        - path: /
          pathType: Prefix
          service: frontend
        - path: /api
          pathType: Prefix
          service: api

# Install with the production values
helm install prod-datacenter-docs ./datacenter-docs -f production-values.yaml

Upgrading

# Upgrade with new values
helm upgrade my-datacenter-docs ./datacenter-docs -f my-values.yaml

# Upgrade specific parameters
helm upgrade my-datacenter-docs ./datacenter-docs \
  --set api.image.tag="v1.2.0" \
  --reuse-values
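
For unattended upgrades it can help to have Helm roll back by itself when the new revision does not become ready. The wrapper below is a sketch assuming Helm 3; safe_upgrade is a hypothetical name, and the 10 minute timeout is an arbitrary choice.

```shell
# Hypothetical helper: upgrade (or install if absent) and roll back on failure.
safe_upgrade() {
  release="$1"
  chart="$2"
  values="$3"
  # --install creates the release if it does not exist yet.
  # --atomic waits for resources to become ready and rolls back if they do not.
  helm upgrade "$release" "$chart" \
    --install \
    --atomic \
    --timeout 10m \
    -f "$values"
}

# Usage:
#   safe_upgrade my-datacenter-docs ./datacenter-docs my-values.yaml
# Inspect revisions afterwards with:
#   helm history my-datacenter-docs
```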

Uninstallation

helm uninstall my-datacenter-docs

Note: This deletes every resource in the release except the MongoDB PersistentVolumeClaims (PVCs), so the stored data survives. To delete the PVCs as well, which permanently removes that data:

kubectl delete pvc -l app.kubernetes.io/instance=my-datacenter-docs
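
Because deleting PVCs is irreversible, one option is to list them first and require an explicit flag before deleting. The function below is a sketch; purge_release_pvcs and its --yes argument are hypothetical names, while the label selector is the one used above.

```shell
# Hypothetical helper: show the release's PVCs, delete them only with --yes.
purge_release_pvcs() {
  release="$1"
  confirm="${2:-}"
  selector="app.kubernetes.io/instance=${release}"
  # Always show what would be affected.
  kubectl get pvc -l "$selector"
  if [ "$confirm" = "--yes" ]; then
    kubectl delete pvc -l "$selector"
  else
    echo "Re-run with --yes to delete the PVCs listed above." >&2
  fi
}

# Usage:
#   purge_release_pvcs my-datacenter-docs          # list only
#   purge_release_pvcs my-datacenter-docs --yes    # list, then delete
```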

Monitoring and Troubleshooting

Check Pod Status

kubectl get pods -l app.kubernetes.io/instance=my-datacenter-docs

View Logs

# API logs
kubectl logs -l app.kubernetes.io/component=api -f

# Worker logs
kubectl logs -l app.kubernetes.io/component=worker -f

# MongoDB logs
kubectl logs -l app.kubernetes.io/component=database -f

Access Services Locally

# API
kubectl port-forward svc/my-datacenter-docs-api 8000:8000

# Frontend
kubectl port-forward svc/my-datacenter-docs-frontend 8080:80

# MongoDB (for debugging)
kubectl port-forward svc/my-datacenter-docs-mongodb 27017:27017

Common Issues

Pods Stuck in Pending

Check if PVCs are bound:

kubectl get pvc

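If a PVC stays Pending because no storage class is set, set one explicitly. The key below is the one this section uses; the Global Settings table also lists global.storageClass, so check values.yaml for which one the chart honors: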

helm upgrade my-datacenter-docs ./datacenter-docs \
  --set mongodb.persistence.storageClass="standard" \
  --reuse-values

API Pods Crash Loop

Check logs:

kubectl logs -l app.kubernetes.io/component=api --tail=100

Common causes:

  • MongoDB not ready (wait for init containers)
  • Invalid LLM API key
  • Missing environment variables
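
To tell these causes apart, the logs of the crashed container are usually more useful than those of its restarted replacement. The function below is a sketch that gathers them; diagnose_api is a hypothetical name, and the label selector is the one used throughout this README.

```shell
# Hypothetical helper: collect the usual evidence for a crash-looping API pod.
diagnose_api() {
  selector="app.kubernetes.io/component=api"
  # Restart counts and current state of each API pod.
  kubectl get pods -l "$selector"
  # Logs from the previous (crashed) container instance, not the restarted one.
  kubectl logs -l "$selector" --previous --tail=100
  # Events in the current namespace, oldest first (probes, image pulls, scheduling).
  kubectl get events --sort-by=.lastTimestamp
}

# Usage:
#   diagnose_api
```

If a pod has not restarted yet, the --previous call reports an error for it; the other two commands still run.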

Cannot Access via Ingress

Check ingress status:

kubectl get ingress
kubectl describe ingress my-datacenter-docs

Ensure:

  • Ingress controller is installed
  • DNS points to ingress IP
  • TLS certificate is valid (if using HTTPS)
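
To separate DNS problems from ingress problems, the request can be pinned to the ingress IP directly. The helper below is a sketch; check_ingress is a hypothetical name, and the IP in the usage line is a documentation placeholder.

```shell
# Hypothetical helper: request the site through a specific ingress IP, bypassing DNS.
check_ingress() {
  host="$1"
  ingress_ip="$2"
  # --resolve pins host:443 to the given IP; -w prints only the HTTP status code.
  curl -sS -o /dev/null -w '%{http_code}\n' \
    --resolve "${host}:443:${ingress_ip}" \
    "https://${host}/"
}

# Usage:
#   check_ingress datacenter-docs.example.com 203.0.113.10
```

A status code here combined with a failure through the normal hostname points at DNS. A TLS error from curl points at the certificate.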

Security Considerations

Production Checklist

  • Change secrets.llmApiKey to a valid API key
  • Change secrets.apiSecretKey to a strong random key
  • Change MongoDB credentials (mongodb.auth.rootPassword)
  • Enable TLS/SSL on ingress
  • Review RBAC policies
  • Use external secret management (e.g., HashiCorp Vault, AWS Secrets Manager)
  • Enable network policies
  • Set resource limits on all pods
  • Enforce Pod Security Standards (PodSecurityPolicy was removed in Kubernetes 1.25; use Pod Security Admission on newer clusters)
  • Review auto-remediation settings

Using External Secrets

Instead of storing secrets in values.yaml, use Kubernetes secrets:

# Create secret
kubectl create secret generic datacenter-docs-secrets \
  --from-literal=llm-api-key="sk-your-key" \
  --from-literal=api-secret-key="your-secret"

# Modify templates to use existing secret
# (requires chart customization)
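
The create command above fails if the secret already exists. A common way to make it repeatable is to render the manifest client-side and pipe it to kubectl apply. The sketch below assumes the two values are supplied through the environment variables LLM_API_KEY and API_SECRET_KEY, which are names chosen for illustration; wiring the secret into the chart templates is still required.

```shell
# Hypothetical helper: create or update the secret idempotently.
apply_app_secret() {
  # Abort with a message if either variable is unset or empty.
  : "${LLM_API_KEY:?set LLM_API_KEY first}"
  : "${API_SECRET_KEY:?set API_SECRET_KEY first}"
  # --dry-run=client renders the Secret locally; apply then creates or updates it.
  kubectl create secret generic datacenter-docs-secrets \
    --from-literal=llm-api-key="$LLM_API_KEY" \
    --from-literal=api-secret-key="$API_SECRET_KEY" \
    --dry-run=client -o yaml |
    kubectl apply -f -
}

# Usage:
#   LLM_API_KEY="sk-your-key" API_SECRET_KEY="your-secret" apply_app_secret
```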

Development

Validating the Chart

# Lint the chart
helm lint ./datacenter-docs

# Dry run
helm install my-test ./datacenter-docs --dry-run --debug

# Template rendering
helm template my-test ./datacenter-docs > rendered.yaml

Testing Locally

# Create kind cluster
kind create cluster

# Install chart
helm install test ./datacenter-docs \
  --set ingress.enabled=false \
  --set api.autoscaling.enabled=false \
  --set mongodb.persistence.enabled=false

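# Test (port-forward blocks, so run it in the background or in a second terminal)
kubectl port-forward svc/test-datacenter-docs-api 8000:8000 &
sleep 2  # give the tunnel a moment to open
curl http://localhost:8000/health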
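
On a fresh cluster the API may need some time before /health answers, so a single request can fail even though the deployment is fine. The retry loop below is a sketch; wait_for_health is a hypothetical name, and the defaults of 30 attempts and a 2 second delay are arbitrary.

```shell
# Hypothetical helper: poll a URL until it returns a successful HTTP status.
wait_for_health() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP 4xx/5xx; -s silences progress output.
    if curl -fs -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up after $attempts attempt(s): $url" >&2
  return 1
}

# Usage:
#   wait_for_health http://localhost:8000/health
```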

Support

For issues and questions:

License

See the main repository for license information.