Add Helm chart, Docs, and Config conversion script
Some checks failed
Build / Code Quality Checks (push) Successful in 15m11s
Build / Build & Push Docker Images (worker) (push) Successful in 13m44s
Build / Build & Push Docker Images (frontend) (push) Successful in 5m8s
Build / Build & Push Docker Images (chat) (push) Failing after 30m7s
Build / Build & Push Docker Images (api) (push) Failing after 21m39s
423
deploy/helm/datacenter-docs/README.md
Normal file
@@ -0,0 +1,423 @@
# Datacenter Docs & Remediation Engine - Helm Chart

Helm chart for deploying the LLM Automation - Docs & Remediation Engine on Kubernetes.

## Overview

This chart deploys a complete stack including:
- **MongoDB**: Document database for storing tickets, documentation, and metadata
- **Redis**: Cache and task queue backend
- **API Service**: FastAPI REST API with auto-remediation capabilities
- **Chat Service**: WebSocket server for real-time documentation queries (optional, not yet implemented)
- **Worker Service**: Celery workers for background tasks (optional, not yet implemented)
- **Frontend**: React-based web interface

## Prerequisites

- Kubernetes 1.19+
- Helm 3.0+
- PersistentVolume provisioner support in the underlying infrastructure (for MongoDB persistence)
- Ingress controller (optional, for external access)

## Installation

### Quick Start

```bash
# Add the chart repository (if published)
helm repo add datacenter-docs https://your-repo-url
helm repo update

# Install with default values
helm install my-datacenter-docs datacenter-docs/datacenter-docs

# Or install from local directory
helm install my-datacenter-docs ./datacenter-docs
```

### Production Installation

For production, create a custom copy of the chart's `values.yaml`:

```bash
# Copy the chart's default values file (it lives in the chart directory)
cp ./datacenter-docs/values.yaml my-values.yaml

# Edit my-values.yaml with your configuration
# At minimum, change:
# - secrets.llmApiKey
# - secrets.apiSecretKey
# - ingress.hosts

# Install with custom values
helm install my-datacenter-docs ./datacenter-docs -f my-values.yaml
```
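
The `secrets.apiSecretKey` value mentioned above should be a long random string. The following is a minimal sketch of one way to generate it, assuming `/dev/urandom`, `od`, and `tr` are available. The function names `gen_secret` and `write_secret_overrides` and the file name `secret-overrides.yaml` are hypothetical and not part of the chart.

```bash
# Hypothetical helper: print N random bytes as lowercase hex (default 32 bytes).
gen_secret() {
  head -c "${1:-32}" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

# Hypothetical helper: write an overrides file that holds only the generated key.
# The subshell keeps the restrictive umask from leaking into the calling shell.
write_secret_overrides() {
  (
    umask 077
    printf 'secrets:\n  apiSecretKey: "%s"\n' "$(gen_secret 32)" > "$1"
  )
}
```

After `write_secret_overrides secret-overrides.yaml`, the file can be passed as a second `-f` argument to `helm install`; when several `-f` files are given, values in later files take precedence over earlier ones.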

### Install with Specific Configuration

```bash
# The ingress.hosts[0] argument is quoted so the shell does not treat [0] as a glob
helm install my-datacenter-docs ./datacenter-docs \
  --set secrets.llmApiKey="sk-your-openai-api-key" \
  --set secrets.apiSecretKey="your-strong-secret-key" \
  --set "ingress.hosts[0].host=datacenter-docs.yourdomain.com" \
  --set mongodb.persistence.size="50Gi"
```

## Configuration

### Key Configuration Parameters

#### Global Settings

| Parameter | Description | Default |
|-----------|-------------|---------|
| `global.imagePullPolicy` | Image pull policy | `IfNotPresent` |
| `global.storageClass` | Storage class for PVCs | `""` |

#### MongoDB

| Parameter | Description | Default |
|-----------|-------------|---------|
| `mongodb.enabled` | Enable MongoDB | `true` |
| `mongodb.image.repository` | MongoDB image | `mongo` |
| `mongodb.image.tag` | MongoDB version | `7` |
| `mongodb.auth.rootUsername` | Root username | `admin` |
| `mongodb.auth.rootPassword` | Root password | `admin123` |
| `mongodb.persistence.enabled` | Enable persistence | `true` |
| `mongodb.persistence.size` | Volume size | `10Gi` |
| `mongodb.resources.requests.memory` | Memory request | `512Mi` |
| `mongodb.resources.limits.memory` | Memory limit | `2Gi` |

#### Redis

| Parameter | Description | Default |
|-----------|-------------|---------|
| `redis.enabled` | Enable Redis | `true` |
| `redis.image.repository` | Redis image | `redis` |
| `redis.image.tag` | Redis version | `7-alpine` |
| `redis.resources.requests.memory` | Memory request | `128Mi` |
| `redis.resources.limits.memory` | Memory limit | `512Mi` |

#### API Service

| Parameter | Description | Default |
|-----------|-------------|---------|
| `api.enabled` | Enable API service | `true` |
| `api.replicaCount` | Number of replicas | `2` |
| `api.image.repository` | API image repository | `datacenter-docs-api` |
| `api.image.tag` | API image tag | `latest` |
| `api.service.port` | Service port | `8000` |
| `api.autoscaling.enabled` | Enable HPA | `true` |
| `api.autoscaling.minReplicas` | Min replicas | `2` |
| `api.autoscaling.maxReplicas` | Max replicas | `10` |
| `api.resources.requests.memory` | Memory request | `512Mi` |
| `api.resources.limits.memory` | Memory limit | `2Gi` |

#### Worker Service

| Parameter | Description | Default |
|-----------|-------------|---------|
| `worker.enabled` | Enable worker service | `false` |
| `worker.replicaCount` | Number of replicas | `3` |
| `worker.autoscaling.enabled` | Enable HPA | `true` |
| `worker.autoscaling.minReplicas` | Min replicas | `1` |
| `worker.autoscaling.maxReplicas` | Max replicas | `10` |

#### Chat Service

| Parameter | Description | Default |
|-----------|-------------|---------|
| `chat.enabled` | Enable chat service | `false` |
| `chat.replicaCount` | Number of replicas | `1` |
| `chat.service.port` | Service port | `8001` |

#### Frontend

| Parameter | Description | Default |
|-----------|-------------|---------|
| `frontend.enabled` | Enable frontend | `true` |
| `frontend.replicaCount` | Number of replicas | `2` |
| `frontend.service.port` | Service port | `80` |

#### Ingress

| Parameter | Description | Default |
|-----------|-------------|---------|
| `ingress.enabled` | Enable ingress | `true` |
| `ingress.className` | Ingress class | `nginx` |
| `ingress.hosts[0].host` | Hostname | `datacenter-docs.example.com` |
| `ingress.tls[0].secretName` | TLS secret name | `datacenter-docs-tls` |

#### Application Configuration

| Parameter | Description | Default |
|-----------|-------------|---------|
| `config.llm.baseUrl` | LLM provider URL | `https://api.openai.com/v1` |
| `config.llm.model` | LLM model | `gpt-4-turbo-preview` |
| `config.autoRemediation.enabled` | Enable auto-remediation | `true` |
| `config.autoRemediation.minReliabilityScore` | Min reliability score | `85.0` |
| `config.autoRemediation.dryRun` | Dry run mode | `false` |
| `config.logLevel` | Log level | `INFO` |

#### Secrets

| Parameter | Description | Default |
|-----------|-------------|---------|
| `secrets.llmApiKey` | LLM API key | `sk-your-openai-api-key-here` |
| `secrets.apiSecretKey` | API secret key | `your-secret-key-here-change-in-production` |

**IMPORTANT**: Change these secrets in production!

## Usage Examples

### Enable All Services (including chat and worker)

```bash
helm install my-datacenter-docs ./datacenter-docs \
  --set chat.enabled=true \
  --set worker.enabled=true
```

### Disable Auto-Remediation

```bash
helm install my-datacenter-docs ./datacenter-docs \
  --set config.autoRemediation.enabled=false
```

### Use Different LLM Provider (e.g., Anthropic Claude)

```bash
helm install my-datacenter-docs ./datacenter-docs \
  --set config.llm.baseUrl="https://api.anthropic.com/v1" \
  --set config.llm.model="claude-3-opus-20240229" \
  --set secrets.llmApiKey="sk-ant-your-anthropic-key"
```

### Use Local LLM (e.g., Ollama)

```bash
helm install my-datacenter-docs ./datacenter-docs \
  --set config.llm.baseUrl="http://ollama-service:11434/v1" \
  --set config.llm.model="llama2" \
  --set secrets.llmApiKey="not-needed"
```

### Scale MongoDB Storage

```bash
helm install my-datacenter-docs ./datacenter-docs \
  --set mongodb.persistence.size="100Gi"
```

### Disable Ingress (use port-forward instead)

```bash
helm install my-datacenter-docs ./datacenter-docs \
  --set ingress.enabled=false
```

### Production Configuration with External MongoDB

```yaml
# production-values.yaml
mongodb:
  enabled: false

config:
  mongodbUrl: "mongodb://user:pass@external-mongodb:27017/datacenter_docs?authSource=admin"

api:
  replicaCount: 5
  autoscaling:
    maxReplicas: 20

secrets:
  llmApiKey: "sk-your-production-api-key"
  apiSecretKey: "your-production-secret-key"

ingress:
  hosts:
    - host: "datacenter-docs.prod.yourdomain.com"
      paths:
        - path: /
          pathType: Prefix
          service: frontend
        - path: /api
          pathType: Prefix
          service: api
```

```bash
helm install prod-datacenter-docs ./datacenter-docs -f production-values.yaml
```

## Upgrading

```bash
# Upgrade with new values
helm upgrade my-datacenter-docs ./datacenter-docs -f my-values.yaml

# Upgrade specific parameters
helm upgrade my-datacenter-docs ./datacenter-docs \
  --set api.image.tag="v1.2.0" \
  --reuse-values
```
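
A failed upgrade can leave the release half-rolled-out. The sketch below wraps the upgrade shown above with Helm 3's `--atomic` flag, which waits for the release and rolls back automatically on failure. The function names and the `HELM` override variable are hypothetical conveniences, and the 10 minute timeout is an assumed value to adjust.

```bash
HELM="${HELM:-helm}"   # overridable so the functions can be tried without a cluster

# Upgrade a release; with --atomic, Helm rolls back if it does not become ready.
safe_upgrade() {
  "$HELM" upgrade "$1" "$2" -f "$3" --atomic --timeout 10m
}

# Show the revision history, the usual first step before a manual rollback.
release_history() {
  "$HELM" history "$1"
}
```

For example, `safe_upgrade my-datacenter-docs ./datacenter-docs my-values.yaml`. If a manual rollback is needed later, `release_history my-datacenter-docs` lists the revisions and `helm rollback my-datacenter-docs <revision>` returns to one of them.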

## Uninstallation

```bash
helm uninstall my-datacenter-docs
```

**Note**: This will delete all resources except PersistentVolumeClaims (PVCs) for MongoDB. To also delete PVCs:

```bash
kubectl delete pvc -l app.kubernetes.io/instance=my-datacenter-docs
```

## Monitoring and Troubleshooting

### Check Pod Status

```bash
kubectl get pods -l app.kubernetes.io/instance=my-datacenter-docs
```

### View Logs

```bash
# API logs
kubectl logs -l app.kubernetes.io/component=api -f

# Worker logs
kubectl logs -l app.kubernetes.io/component=worker -f

# MongoDB logs
kubectl logs -l app.kubernetes.io/component=database -f
```

### Access Services Locally

```bash
# API
kubectl port-forward svc/my-datacenter-docs-api 8000:8000

# Frontend
kubectl port-forward svc/my-datacenter-docs-frontend 8080:80

# MongoDB (for debugging)
kubectl port-forward svc/my-datacenter-docs-mongodb 27017:27017
```

### Common Issues

#### Pods Stuck in Pending

Check if PVCs are bound:
```bash
kubectl get pvc
```

If storage class is missing, set it:
```bash
helm upgrade my-datacenter-docs ./datacenter-docs \
  --set mongodb.persistence.storageClass="standard" \
  --reuse-values
```
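
When the PVC check above is not conclusive, listing only the Pending pods and reading recent events usually shows the scheduling or volume-binding reason. This is a minimal sketch; the function names and the `KUBECTL` override variable are hypothetical, while the label is the one used elsewhere in this README.

```bash
KUBECTL="${KUBECTL:-kubectl}"   # overridable so the functions can be tried without a cluster

# List pods of a release that are still in the Pending phase.
pending_pods() {
  "$KUBECTL" get pods -l "app.kubernetes.io/instance=$1" \
    --field-selector=status.phase=Pending -o name
}

# Show events in the current namespace, oldest first.
recent_events() {
  "$KUBECTL" get events --sort-by=.metadata.creationTimestamp
}
```

Run `pending_pods my-datacenter-docs`, then `kubectl describe` on any pod it prints to see the events attached to that pod.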

#### API Pods Crash Loop

Check logs:
```bash
kubectl logs -l app.kubernetes.io/component=api --tail=100
```

Common causes:
- MongoDB not ready (wait for init containers)
- Invalid LLM API key
- Missing environment variables
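
In a crash loop the current container has often just restarted, so the useful output is in the previous container instance. The sketch below, with hypothetical function names and a hypothetical `KUBECTL` override variable, reads those earlier logs and prints restart counts for the API pods.

```bash
KUBECTL="${KUBECTL:-kubectl}"   # overridable so the functions can be tried without a cluster

# Logs from the previous (crashed) container instance of the API pods.
previous_api_logs() {
  "$KUBECTL" logs -l app.kubernetes.io/component=api --previous --tail="${1:-100}"
}

# Restart count per API pod.
api_restarts() {
  "$KUBECTL" get pods -l app.kubernetes.io/component=api \
    -o 'custom-columns=NAME:.metadata.name,RESTARTS:.status.containerStatuses[*].restartCount'
}
```

`previous_api_logs 200` widens the tail when the startup error is further back; a steadily rising count from `api_restarts` confirms the loop is still ongoing.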

#### Cannot Access via Ingress

Check ingress status:
```bash
kubectl get ingress
kubectl describe ingress my-datacenter-docs
```

Ensure:
- Ingress controller is installed
- DNS points to ingress IP
- TLS certificate is valid (if using HTTPS)
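
The "DNS points to ingress IP" item can be checked by comparing the address the ingress reports with what the hostname resolves to. This is a rough sketch under the assumption that the ingress is named after the release, as in the `describe` command above; both function names and the `KUBECTL` override variable are hypothetical.

```bash
KUBECTL="${KUBECTL:-kubectl}"   # overridable so the functions can be tried without a cluster

# Print the external IP (or hostname) assigned to the ingress, if any.
ingress_address() {
  "$KUBECTL" get ingress "$1" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}'
}

# Succeed when the first argument equals one of the remaining arguments.
address_in_list() {
  addr=$1
  shift
  for candidate in "$@"; do
    [ "$candidate" = "$addr" ] && return 0
  done
  return 1
}
```

One way to use it is `address_in_list "$(ingress_address my-datacenter-docs)" $(dig +short datacenter-docs.example.com)`, which succeeds only when DNS resolves to the address the ingress reports.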

## Security Considerations

### Production Checklist

- [ ] Change `secrets.llmApiKey` to a valid API key
- [ ] Change `secrets.apiSecretKey` to a strong random key
- [ ] Change MongoDB credentials (`mongodb.auth.rootPassword`)
- [ ] Enable TLS/SSL on ingress
- [ ] Review RBAC policies
- [ ] Use external secret management (e.g., HashiCorp Vault, AWS Secrets Manager)
- [ ] Enable network policies
- [ ] Set resource limits on all pods
- [ ] Enforce Pod Security Standards (PodSecurityPolicy was removed in Kubernetes 1.25; use Pod Security Admission on newer clusters)
- [ ] Review auto-remediation settings
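
The first three checklist items can be partly automated. The sketch below scans a values file for the placeholder defaults listed in this README and fails if any remain; the function name `check_placeholders` is hypothetical and the check is deliberately simple.

```bash
# Hypothetical pre-flight check: return non-zero if the given values file
# still contains one of the placeholder defaults documented above.
check_placeholders() {
  values_file=$1
  found=0
  for placeholder in \
    'sk-your-openai-api-key-here' \
    'your-secret-key-here-change-in-production' \
    'admin123'
  do
    if grep -qF -- "$placeholder" "$values_file"; then
      echo "placeholder still present: $placeholder" >&2
      found=1
    fi
  done
  return "$found"
}
```

A typical use is `check_placeholders my-values.yaml && helm install my-datacenter-docs ./datacenter-docs -f my-values.yaml`. Note that a key omitted from the file still falls back to the chart default, so the check is only meaningful when the file sets every secret explicitly.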

### Using External Secrets

Instead of storing secrets in values.yaml, use Kubernetes secrets:

```bash
# Create secret
kubectl create secret generic datacenter-docs-secrets \
  --from-literal=llm-api-key="sk-your-key" \
  --from-literal=api-secret-key="your-secret"

# Modify templates to use existing secret
# (requires chart customization)
```
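
Typing the keys on the command line leaves them in shell history. A possible variant, sketched below, reads them from environment variables and prints the Secret manifest with `--dry-run=client -o yaml` so it can be piped to `kubectl apply`. The function name and the variable names `LLM_API_KEY`, `API_SECRET_KEY`, and `KUBECTL` are assumptions; the secret name and keys are the ones used above.

```bash
KUBECTL="${KUBECTL:-kubectl}"   # overridable so the function can be tried without a cluster

# Render the Secret manifest from environment variables without creating it.
create_docs_secret() {
  [ -n "${LLM_API_KEY:-}" ] || { echo "LLM_API_KEY is not set" >&2; return 1; }
  [ -n "${API_SECRET_KEY:-}" ] || { echo "API_SECRET_KEY is not set" >&2; return 1; }
  "$KUBECTL" create secret generic datacenter-docs-secrets \
    --from-literal="llm-api-key=$LLM_API_KEY" \
    --from-literal="api-secret-key=$API_SECRET_KEY" \
    --dry-run=client -o yaml
}
```

`create_docs_secret | kubectl apply -f -` creates the secret or updates it if it already exists. The chart templates still need to be customized to reference it, as noted above.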

## Development

### Validating the Chart

```bash
# Lint the chart
helm lint ./datacenter-docs

# Dry run
helm install my-test ./datacenter-docs --dry-run --debug

# Template rendering
helm template my-test ./datacenter-docs > rendered.yaml
```
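
A quick way to sanity-check `rendered.yaml` is to count the resources it contains per kind, for example to confirm that toggling `chat.enabled` or `worker.enabled` adds the expected objects. This is a minimal sketch; `count_kinds` is a hypothetical helper and it only looks at top-level `kind:` lines.

```bash
# Hypothetical helper: count rendered resources per kind in a manifest file.
count_kinds() {
  awk '/^kind:/ { n[$2]++ } END { for (k in n) print k, n[k] }' "$1" | sort
}
```

Running `count_kinds rendered.yaml` before and after changing a `--set` flag on `helm template` shows which kinds were added or removed.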

### Testing Locally

```bash
# Create kind cluster
kind create cluster

# Install chart
helm install test ./datacenter-docs \
  --set ingress.enabled=false \
  --set api.autoscaling.enabled=false \
  --set mongodb.persistence.enabled=false

# Test (port-forward stays in the foreground, so run curl from a second terminal)
kubectl port-forward svc/test-datacenter-docs-api 8000:8000
curl http://localhost:8000/health
```
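
Right after installation the API may not answer yet, so a single `curl` can fail even when the deployment is fine. The sketch below retries the `/health` endpoint used above until it responds or the attempts run out. The function name and the `CURL` and `HEALTH_INTERVAL` variables are hypothetical.

```bash
CURL="${CURL:-curl}"   # overridable so the function can be tried without a server

# Poll a URL until it returns a successful response or the attempts run out.
# $1 = URL, $2 = maximum attempts (default 30); waits HEALTH_INTERVAL seconds between tries.
wait_for_health() {
  url=$1
  tries=${2:-30}
  attempt=0
  while [ "$attempt" -lt "$tries" ]; do
    if "$CURL" -fsS "$url" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "${HEALTH_INTERVAL:-1}"
  done
  return 1
}
```

With the port-forward running in another terminal, `wait_for_health http://localhost:8000/health 60` returns as soon as the API is reachable and exits non-zero after 60 failed attempts.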

## Support

For issues and questions:
- Issues: https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine/issues
- Documentation: https://git.commandware.com/it-ops/llm-automation-docs-and-remediation-engine

## License

See the main repository for license information.