# Helm Deployment

This directory contains Helm charts for deploying the Datacenter Docs & Remediation Engine on Kubernetes.

## Contents

- `datacenter-docs/` - Main Helm chart for the application
- `test-chart.sh` - Automated testing script for chart validation

## Quick Start

### Prerequisites

- Kubernetes cluster (1.19+)
- Helm 3.0+
- kubectl configured to access your cluster

### Development/Testing Installation

```bash
# Install with development settings (minimal resources, local testing)
helm install dev ./datacenter-docs -f ./datacenter-docs/values-development.yaml

# Access the application (port-forward blocks, so run each command in its own terminal)
kubectl port-forward svc/dev-datacenter-docs-api 8000:8000
kubectl port-forward svc/dev-datacenter-docs-frontend 8080:80

# View API docs: http://localhost:8000/api/docs
# View frontend: http://localhost:8080
```

### Production Installation

```bash
# Copy and customize production values
cp datacenter-docs/values-production.yaml my-production-values.yaml

# Edit my-production-values.yaml:
# - Change all secrets (llmApiKey, apiSecretKey, mongodbPassword)
# - Update ingress hosts
# - Adjust resource limits
# - Configure LLM provider
# - Review auto-remediation settings

# Install
helm install prod ./datacenter-docs -f my-production-values.yaml

# Verify deployment
helm list
kubectl get pods
kubectl get ingress
```

## Chart Structure

```
datacenter-docs/
├── Chart.yaml                    # Chart metadata
├── values.yaml                   # Default configuration
├── values-development.yaml       # Development settings
├── values-production.yaml        # Production example
├── README.md                     # Detailed chart documentation
├── .helmignore                   # Files to exclude from package
└── templates/
    ├── NOTES.txt                 # Post-install instructions
    ├── _helpers.tpl              # Template helpers
    ├── configmap.yaml            # Application configuration
    ├── secrets.yaml              # Sensitive data
    ├── serviceaccount.yaml       # Service account
    ├── mongodb-statefulset.yaml  # MongoDB StatefulSet
    ├── mongodb-service.yaml      # MongoDB Service
    ├── redis-deployment.yaml     # Redis Deployment
    ├── redis-service.yaml        # Redis Service
    ├── api-deployment.yaml       # API Deployment
    ├── api-service.yaml          # API Service
    ├── api-hpa.yaml              # API autoscaling
    ├── chat-deployment.yaml      # Chat Deployment
    ├── chat-service.yaml         # Chat Service
    ├── worker-deployment.yaml    # Worker Deployment
    ├── worker-hpa.yaml           # Worker autoscaling
    ├── frontend-deployment.yaml  # Frontend Deployment
    ├── frontend-service.yaml     # Frontend Service
    └── ingress.yaml              # Ingress configuration
```

## Testing the Chart

Run the automated test script:

```bash
cd deploy/helm
./test-chart.sh
```

This will:

1. Lint the chart
2. Render templates with different value files
3. Perform dry-run installation
4. Validate Kubernetes manifests
5. Package the chart

## Common Operations

### Upgrade Release

```bash
# Upgrade with new values
helm upgrade prod ./datacenter-docs -f my-production-values.yaml

# Upgrade with specific parameter changes
helm upgrade prod ./datacenter-docs --set api.replicaCount=10 --reuse-values
```

### Check Status

```bash
# List releases
helm list

# Get release status
helm status prod

# Get current values
helm get values prod

# Get all manifests
helm get manifest prod
```

### Rollback

```bash
# View revision history
helm history prod

# Rollback to previous version
helm rollback prod

# Rollback to specific revision
helm rollback prod 2
```

### Uninstall

```bash
# Uninstall release
helm uninstall prod

# Also delete PVCs (if using persistent storage)
kubectl delete pvc -l app.kubernetes.io/instance=prod
```

## Configuration Files

### values.yaml

Default configuration with reasonable settings for development/testing.
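
Because `values.yaml` supplies the defaults, any additional values file only needs to list the keys that differ. The fragment below is a minimal sketch of such an override file. The filename `my-overrides.yaml` is hypothetical, and the keys are the ones used in this README's own commands and examples (`api.replicaCount`, the component `enabled` flags, `config.autoRemediation`), so check them against the chart's `values.yaml` before relying on them.

```yaml
# my-overrides.yaml (hypothetical filename)
# Only the keys listed here change; everything else keeps its values.yaml default.
api:
  replicaCount: 3        # illustrative value

chat:
  enabled: false         # component not implemented yet

config:
  autoRemediation:
    dryRun: true         # keep auto-remediation in test mode
```

Helm merges values files from left to right, so a file passed later with `-f` takes precedence over earlier files and over the chart defaults, for example `helm upgrade prod ./datacenter-docs -f my-production-values.yaml -f my-overrides.yaml`.
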
### values-development.yaml

Optimized for local development:

- Minimal resource requests/limits
- Single replicas
- Persistence disabled
- Dry-run mode for auto-remediation
- Debug logging
- Ingress disabled (use port-forward)

### values-production.yaml

Example production configuration:

- Higher resource limits
- Multiple replicas
- Autoscaling enabled
- Persistence enabled with larger volumes
- TLS/SSL enabled
- Production-grade security settings
- All components enabled

**Important**: Copy and customize this file for your environment. Never use default secrets!

## Available Components

| Component | Purpose | Default Enabled |
|-----------|---------|-----------------|
| MongoDB | Document database | Yes |
| Redis | Cache & task queue | Yes |
| API | REST API service | Yes |
| Chat | WebSocket server | No (not implemented) |
| Worker | Celery background tasks | No (not implemented) |
| Frontend | Web UI | Yes |

Enable/disable components in your values file:

```yaml
mongodb:
  enabled: true

redis:
  enabled: true

api:
  enabled: true

chat:
  enabled: false  # Set to true when implemented

worker:
  enabled: false  # Set to true when implemented

frontend:
  enabled: true
```

## Architecture

The chart deploys a complete microservices architecture:

```
            ┌─────────────┐
            │   Ingress   │
            └──────┬──────┘
                   │
     ┌─────────────┼─────────────┐
     │             │             │
┌────▼────┐   ┌────▼────┐   ┌────▼────┐
│Frontend │   │   API   │   │  Chat   │
└─────────┘   └────┬────┘   └────┬────┘
                   │             │
     ┌─────────────┼─────────────┘
     │             │
┌────▼────┐   ┌────▼────┐
│  Redis  │   │ MongoDB │
└─────────┘   └─────────┘
     ▲
     │
┌────┴────┐
│ Worker  │
└─────────┘
```

## LLM Provider Configuration

The chart supports multiple LLM providers.
Configure in your values file:

### OpenAI

```yaml
config:
  llm:
    baseUrl: "https://api.openai.com/v1"
    model: "gpt-4-turbo-preview"

secrets:
  llmApiKey: "sk-your-openai-key"
```

### Anthropic Claude

```yaml
config:
  llm:
    baseUrl: "https://api.anthropic.com/v1"
    model: "claude-3-opus-20240229"

secrets:
  llmApiKey: "sk-ant-your-anthropic-key"
```

### Local (Ollama)

```yaml
config:
  llm:
    baseUrl: "http://ollama-service:11434/v1"
    model: "llama2"

secrets:
  llmApiKey: "not-needed"
```

### Azure OpenAI

```yaml
config:
  llm:
    baseUrl: "https://your-resource.openai.azure.com"
    model: "gpt-4"

secrets:
  llmApiKey: "your-azure-key"
```

## Security Best Practices

For production deployments:

1. **Change all default secrets**
   ```bash
   helm install prod ./datacenter-docs \
     --set secrets.llmApiKey="your-actual-key" \
     --set secrets.apiSecretKey="$(openssl rand -base64 32)" \
     --set secrets.mongodbPassword="$(openssl rand -base64 32)"
   ```

2. **Use external secret management**
   - HashiCorp Vault
   - AWS Secrets Manager
   - Azure Key Vault
   - Kubernetes External Secrets Operator

3. **Enable TLS/SSL**
   ```yaml
   ingress:
     annotations:
       cert-manager.io/cluster-issuer: "letsencrypt-prod"
     tls:
       - secretName: datacenter-docs-tls
         hosts:
           - datacenter-docs.yourdomain.com
   ```

4. **Review auto-remediation settings**
   ```yaml
   config:
     autoRemediation:
       enabled: true
       minReliabilityScore: 95.0  # High threshold for production
       dryRun: true               # Test first, then set to false
   ```

5. **Implement network policies**

6. **Enable resource quotas**

7. **Run regular security scans**

## Monitoring and Observability

The chart is designed to integrate with:

- **Prometheus**: Metrics collection
- **Grafana**: Visualization
- **Jaeger**: Distributed tracing
- **ELK/Loki**: Log aggregation

Add annotations to enable monitoring:

```yaml
podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8000"
  prometheus.io/path: "/metrics"
```

## Troubleshooting

### Pods not starting

```bash
# Check pod status
kubectl get pods -l app.kubernetes.io/instance=prod

# Describe pod for events
kubectl describe pod <pod-name>

# View logs
kubectl logs -f <pod-name>
```

### Storage issues

```bash
# Check PVC status
kubectl get pvc

# Check storage class
kubectl get storageclass

# Manually create PVC if needed
kubectl apply -f - <