Initial commit: LLM Automation Docs & Remediation Engine v2.0

Features:
- Automated datacenter documentation generation
- MCP integration for device connectivity
- Auto-remediation engine with safety checks
- Multi-factor reliability scoring (0-100%)
- Human feedback learning loop
- Pattern recognition and continuous improvement
- Agentic chat support with AI
- API for ticket resolution
- Frontend React with Material-UI
- CI/CD pipelines (GitLab + Gitea)
- Docker & Kubernetes deployment
- Complete documentation and guides

v2.0 Highlights:
- Auto-remediation with write operations (disabled by default)
- Reliability calculator with 4-factor scoring
- Human feedback system for continuous learning
- Pattern-based progressive automation
- Approval workflow for critical actions
- Full audit trail and rollback capability
LLM Automation System
2025-10-17 23:47:28 +00:00
commit 1ba5ce851d
89 changed files with 20468 additions and 0 deletions


@@ -0,0 +1,687 @@
# API Endpoints and Commands for Data Collection
## 1. VMware vSphere API
### 1.1 REST API Endpoints
```bash
# Base URL
BASE_URL="https://vcenter.domain.local/rest"
# Authentication: the session ID is returned in the "value" field (requires jq)
SESSION_ID=$(curl -s -X POST "$BASE_URL/com/vmware/cis/session" \
-u 'automation@vsphere.local:password' | jq -r '.value')
# Get all VMs
curl -X GET $BASE_URL/vcenter/vm \
-H "vmware-api-session-id: ${SESSION_ID}"
# Get VM details
curl -X GET $BASE_URL/vcenter/vm/${VM_ID} \
-H "vmware-api-session-id: ${SESSION_ID}"
# Get hosts
curl -X GET $BASE_URL/vcenter/host \
-H "vmware-api-session-id: ${SESSION_ID}"
# Get datastores
curl -X GET $BASE_URL/vcenter/datastore \
-H "vmware-api-session-id: ${SESSION_ID}"
# Get clusters
curl -X GET $BASE_URL/vcenter/cluster \
-H "vmware-api-session-id: ${SESSION_ID}"
```
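As a rough sketch of how the `GET /vcenter/vm` response might be post-processed: the legacy `/rest` endpoints wrap results in a `value` key, and each VM summary carries fields such as `vm`, `name`, `power_state`, `cpu_count` and `memory_size_MiB`. The function name `summarize_vms` and the sample payload are illustrative assumptions, not part of the project.

```python
import json
from collections import Counter


def summarize_vms(response_text):
    """Count VMs per power state and total configured memory in GiB."""
    vms = json.loads(response_text).get("value", [])
    by_state = Counter(vm.get("power_state", "UNKNOWN") for vm in vms)
    total_mem_gib = sum(vm.get("memory_size_MiB", 0) for vm in vms) / 1024
    return {
        "count": len(vms),
        "by_state": dict(by_state),
        "memory_gib": total_mem_gib,
    }


# Hypothetical sample payload shaped like a /rest/vcenter/vm response
sample = json.dumps({"value": [
    {"vm": "vm-101", "name": "web01", "power_state": "POWERED_ON",
     "cpu_count": 2, "memory_size_MiB": 4096},
    {"vm": "vm-102", "name": "db01", "power_state": "POWERED_OFF",
     "cpu_count": 4, "memory_size_MiB": 8192},
]})
summary = summarize_vms(sample)
print(summary)
```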
### 1.2 PowerCLI Commands
```powershell
# Connect
Connect-VIServer -Server vcenter.domain.local -User automation@vsphere.local
# Get all VMs with details
Get-VM | Select-Object Name, PowerState, NumCpu, MemoryGB, @{N='UsedSpaceGB';E={[math]::Round($_.UsedSpaceGB,2)}}, VMHost, ResourcePool | Export-Csv -Path vms.csv
# Get hosts
Get-VMHost | Select-Object Name, ConnectionState, PowerState, Version, NumCpu, MemoryTotalGB, @{N='MemoryUsageGB';E={[math]::Round($_.MemoryUsageGB,2)}} | Export-Csv -Path hosts.csv
# Get datastores
Get-Datastore | Select-Object Name, Type, CapacityGB, FreeSpaceGB, @{N='PercentFree';E={[math]::Round(($_.FreeSpaceGB/$_.CapacityGB*100),2)}} | Export-Csv -Path datastores.csv
# Get performance stats
Get-Stat -Entity (Get-VM) -Stat cpu.usage.average,mem.usage.average -Start (Get-Date).AddDays(-7) -IntervalMins 5 | Export-Csv -Path performance.csv
```
---
## 2. Proxmox VE API
### 2.1 REST API
```bash
# Base URL
PROXMOX_URL="https://proxmox.domain.local:8006/api2/json"
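# Get ticket (authentication); the ticket is returned in .data.ticket (requires jq)
TICKET=$(curl -sk --data-urlencode "username=automation@pam" --data-urlencode "password=password" \
"$PROXMOX_URL/access/ticket" | jq -r '.data.ticket')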
# Get nodes
curl -k -H "Cookie: PVEAuthCookie=${TICKET}" \
$PROXMOX_URL/nodes
# Get VMs on node
curl -k -H "Cookie: PVEAuthCookie=${TICKET}" \
$PROXMOX_URL/nodes/${NODE}/qemu
# Get containers
curl -k -H "Cookie: PVEAuthCookie=${TICKET}" \
$PROXMOX_URL/nodes/${NODE}/lxc
# Get storage
curl -k -H "Cookie: PVEAuthCookie=${TICKET}" \
$PROXMOX_URL/nodes/${NODE}/storage
# Get cluster status
curl -k -H "Cookie: PVEAuthCookie=${TICKET}" \
$PROXMOX_URL/cluster/status
```
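The ticket call above returns a JSON document whose `data` object holds both `ticket` and `CSRFPreventionToken`. A minimal sketch of turning that response into request headers follows; the helper name `proxmox_auth_headers` and the sample values are assumptions. Read-only calls need only the cookie, while write calls (POST, PUT, DELETE) also need the CSRF header.

```python
import json


def proxmox_auth_headers(ticket_response_text, for_write=False):
    """Build HTTP headers from a /access/ticket JSON response."""
    data = json.loads(ticket_response_text)["data"]
    headers = {"Cookie": f"PVEAuthCookie={data['ticket']}"}
    if for_write:
        # Write operations must also present the CSRF prevention token
        headers["CSRFPreventionToken"] = data["CSRFPreventionToken"]
    return headers


# Hypothetical response body; real tickets are much longer
sample_response = json.dumps({"data": {
    "username": "automation@pam",
    "ticket": "PVE:automation@pam:EXAMPLE",
    "CSRFPreventionToken": "EXAMPLE:token",
}})
read_headers = proxmox_auth_headers(sample_response)
write_headers = proxmox_auth_headers(sample_response, for_write=True)
print(read_headers)
print(write_headers)
```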
### 2.2 CLI Commands
```bash
# List VMs
pvesh get /cluster/resources --type vm
# VM status
qm status ${VMID}
# Container list
pct list
# Storage info
pvesm status
# Node info
pvesh get /nodes/${NODE}/status
```
---
## 3. Network Devices
### 3.1 Cisco IOS Commands
```bash
# Via SSH
ssh admin@switch.domain.local
# System information
show version
show inventory
show running-config
# Interfaces
show interfaces status
show interfaces description
show interfaces counters errors
show ip interface brief
# VLANs
show vlan brief
show vlan id ${VLAN_ID}
# Spanning Tree
show spanning-tree summary
show spanning-tree root
# Routing
show ip route
show ip protocols
# CDP/LLDP
show cdp neighbors detail
show lldp neighbors
# Performance
show processes cpu history
show memory statistics
show environment all
```
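Command output collected over SSH is plain text, so it usually has to be parsed before it can be stored as structured data. The sketch below parses `show ip interface brief`; the function name and the sample output are illustrative, and real output may vary between IOS versions.

```python
def parse_ip_interface_brief(output):
    """Parse `show ip interface brief` text into a list of dicts."""
    interfaces = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines and the column header
        if len(fields) < 6 or fields[0] == "Interface":
            continue
        interfaces.append({
            "name": fields[0],
            "ip_address": None if fields[1] == "unassigned" else fields[1],
            # Status can span two words ("administratively down")
            "status": " ".join(fields[4:-1]),
            "protocol": fields[-1],
        })
    return interfaces


# Hypothetical sample output
SAMPLE_OUTPUT = """\
Interface              IP-Address      OK? Method Status                Protocol
GigabitEthernet0/0     192.168.1.1     YES NVRAM  up                    up
GigabitEthernet0/1     unassigned      YES unset  administratively down down
Vlan10                 10.0.10.1       YES manual up                    up
"""
parsed = parse_ip_interface_brief(SAMPLE_OUTPUT)
for iface in parsed:
    print(iface)
```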
### 3.2 HP/Aruba Switch Commands
```bash
# System info
show system
show version
show running-config
# Interfaces
show interfaces brief
show interfaces status
# VLANs
show vlans
# Spanning tree
show spanning-tree
# Logging
show log
```
---
## 4. Firewall APIs
### 4.1 pfSense/OPNsense API
```bash
# Base URL
FW_URL="https://firewall.domain.local/api"
# OPNsense authenticates with an API key/secret pair sent as HTTP Basic auth
# Get system info
curl -X GET "${FW_URL}/core/system/status" \
-u "${API_KEY}:${API_SECRET}"
# Get interfaces
curl -X GET "${FW_URL}/interfaces/overview/export" \
-u "${API_KEY}:${API_SECRET}"
# Get firewall rules
curl -X GET "${FW_URL}/firewall/filter/searchRule" \
-u "${API_KEY}:${API_SECRET}"
# Get VPN status
curl -X GET "${FW_URL}/ipsec/sessions" \
-u "${API_KEY}:${API_SECRET}"
```
### 4.2 Fortinet FortiGate API
```bash
# Base URL
FORTI_URL="https://fortigate.domain.local/api/v2"
# System status
curl -X GET "${FORTI_URL}/monitor/system/status" \
-H "Authorization: Bearer ${API_TOKEN}"
# Interface stats
curl -X GET "${FORTI_URL}/monitor/system/interface/select" \
-H "Authorization: Bearer ${API_TOKEN}"
# Firewall policies
curl -X GET "${FORTI_URL}/cmdb/firewall/policy" \
-H "Authorization: Bearer ${API_TOKEN}"
# VPN status
curl -X GET "${FORTI_URL}/monitor/vpn/ipsec" \
-H "Authorization: Bearer ${API_TOKEN}"
```
---
## 5. Storage Arrays
### 5.1 Pure Storage API
```bash
# Base URL
PURE_URL="https://array.domain.local/api"
# Get array info
curl -X GET "${PURE_URL}/1.19/array" \
-H "api-token: ${API_TOKEN}"
# Get volumes
curl -X GET "${PURE_URL}/1.19/volume" \
-H "api-token: ${API_TOKEN}"
# Get hosts
curl -X GET "${PURE_URL}/1.19/host" \
-H "api-token: ${API_TOKEN}"
# Get performance metrics
curl -X GET "${PURE_URL}/1.19/array/monitor?action=monitor" \
-H "api-token: ${API_TOKEN}"
```
### 5.2 NetApp ONTAP API
```bash
# Base URL
NETAPP_URL="https://netapp.domain.local/api"
# Get cluster info
curl -X GET "${NETAPP_URL}/cluster" \
-u "admin:password"
# Get volumes
curl -X GET "${NETAPP_URL}/storage/volumes" \
-u "admin:password"
# Get aggregates
curl -X GET "${NETAPP_URL}/storage/aggregates" \
-u "admin:password"
# Get performance
curl -X GET "${NETAPP_URL}/cluster/counter/tables/volume" \
-u "admin:password"
```
### 5.3 Generic SAN Commands
```bash
# Via SSH to array management interface
# Show system info
show system
show controller
show disk
# Show volumes/LUNs
show volumes
show luns
show mappings
# Show performance
show statistics
show disk-statistics
```
---
## 6. Monitoring Systems
### 6.1 Zabbix API
```bash
# Base URL
ZABBIX_URL="https://zabbix.domain.local/api_jsonrpc.php"
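# Authenticate and keep the token returned in .result (requires jq)
# Zabbix 5.4+ expects "username"; older releases use "user"
AUTH_TOKEN=$(curl -s -X POST "$ZABBIX_URL" \
-H "Content-Type: application/json-rpc" \
-d '{
  "jsonrpc": "2.0",
  "method": "user.login",
  "params": {
    "username": "automation",
    "password": "password"
  },
  "id": 1
}' | jq -r '.result')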
# Get hosts
curl -X POST $ZABBIX_URL \
-H "Content-Type: application/json-rpc" \
-d '{
"jsonrpc": "2.0",
"method": "host.get",
"params": {
"output": ["hostid", "host", "status"]
},
"auth": "'${AUTH_TOKEN}'",
"id": 1
}'
# Get problems
curl -X POST $ZABBIX_URL \
-H "Content-Type: application/json-rpc" \
-d '{
"jsonrpc": "2.0",
"method": "problem.get",
"params": {
"recent": true
},
"auth": "'${AUTH_TOKEN}'",
"id": 1
}'
```
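Splicing a token into a hand-written JSON string is fragile, because a quote character in the token breaks the payload. One possible alternative is to build the JSON-RPC body with a serializer. This sketch mirrors the request shape used above, including the `auth` field; the helper name `build_request` is an assumption.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC request IDs, incremented per call


def build_request(method, params, auth=None):
    """Serialize a Zabbix JSON-RPC 2.0 request body."""
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": next(_request_ids),
    }
    if auth is not None:
        payload["auth"] = auth
    return json.dumps(payload)


host_request = build_request(
    "host.get", {"output": ["hostid", "host", "status"]}, auth="example-token"
)
problem_request = build_request("problem.get", {"recent": True}, auth='tok"en')
print(host_request)
print(problem_request)
```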
### 6.2 Prometheus API
```bash
# Base URL
PROM_URL="http://prometheus.domain.local:9090"
# Query instant
curl -X GET "${PROM_URL}/api/v1/query?query=up"
# Query range
curl -X GET "${PROM_URL}/api/v1/query_range?query=node_cpu_seconds_total&start=2024-01-01T00:00:00Z&end=2024-01-02T00:00:00Z&step=15s"
# Get targets
curl -X GET "${PROM_URL}/api/v1/targets"
# Get alerts
curl -X GET "${PROM_URL}/api/v1/alerts"
```
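Range queries are easier to get right when the URL is assembled programmatically, since the timestamps need percent-encoding and the step determines how many samples come back. The sketch below assumes a helper named `query_range_url`, and it guards against Prometheus's limit of 11,000 points per series on range queries.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

MAX_POINTS = 11_000  # Prometheus rejects range queries above this resolution


def query_range_url(base_url, query, start, end, step_seconds):
    """Return (url, expected_points) for a /api/v1/query_range call."""
    points = int((end - start).total_seconds() // step_seconds) + 1
    if points > MAX_POINTS:
        raise ValueError(f"{points} points requested; increase the step")
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    params = {
        "query": query,
        "start": start.astimezone(timezone.utc).strftime(fmt),
        "end": end.astimezone(timezone.utc).strftime(fmt),
        "step": f"{step_seconds}s",
    }
    return f"{base_url}/api/v1/query_range?{urlencode(params)}", points


url, points = query_range_url(
    "http://prometheus.domain.local:9090",
    "node_cpu_seconds_total",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 2, tzinfo=timezone.utc),
    15,
)
print(points)
print(url)
```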
### 6.3 Nagios/Icinga API
```bash
# Icinga2 API
ICINGA_URL="https://icinga.domain.local:5665"
# Get hosts
curl -k -u "automation:password" \
"${ICINGA_URL}/v1/objects/hosts"
# Get services
curl -k -u "automation:password" \
"${ICINGA_URL}/v1/objects/services"
# Get problems
curl -k -u "automation:password" \
"${ICINGA_URL}/v1/objects/services?filter=service.state!=0"
```
---
## 7. Backup Systems
### 7.1 Veeam API
```powershell
# Connect to Veeam server
Connect-VBRServer -Server veeam.domain.local -User automation
# Get backup jobs
Get-VBRJob | Select-Object Name, JobType, IsScheduleEnabled, LastResult
# Get backup sessions
Get-VBRBackupSession | Where-Object {$_.CreationTime -gt (Get-Date).AddDays(-7)} | Select-Object Name, JobName, Result, CreationTime
# Get restore points
Get-VBRRestorePoint | Select-Object VMName, CreationTime, Type
# Get repositories
Get-VBRBackupRepository | Select-Object Name, Path, @{N='FreeGB';E={[math]::Round($_.GetContainer().CachedFreeSpace.InGigabytes,2)}}
```
### 7.2 CommVault API
```bash
# Base URL
CV_URL="https://commvault.domain.local/webconsole/api"
# Login
curl -X POST "${CV_URL}/Login" \
-H "Content-Type: application/json" \
-d '{"username":"automation","password":"password"}'
# Get jobs
curl -X GET "${CV_URL}/Job?clientName=${CLIENT}" \
-H "Authtoken: ${TOKEN}"
# Get clients
curl -X GET "${CV_URL}/Client" \
-H "Authtoken: ${TOKEN}"
```
---
## 8. Database Queries
### 8.1 Asset Management DB
```sql
-- MySQL/MariaDB queries for asset database
-- Get all racks
SELECT
rack_id,
location,
total_units,
occupied_units,
(total_units - occupied_units) AS available_units,
max_power_kw,
ROUND(occupied_units * 100.0 / total_units, 2) AS utilization_percent
FROM racks
ORDER BY location, rack_id;
-- Get all servers
SELECT
s.hostname,
s.serial_number,
s.model,
s.cpu_model,
s.cpu_cores,
s.ram_gb,
s.rack_id,
s.rack_unit,
s.status,
s.environment
FROM servers s
ORDER BY s.rack_id, s.rack_unit;
-- Get network devices
SELECT
n.hostname,
n.device_type,
n.vendor,
n.model,
n.management_ip,
n.firmware_version,
n.rack_id,
n.status
FROM network_devices n
ORDER BY n.device_type, n.hostname;
-- Get contracts
SELECT
c.vendor,
c.service_type,
c.contract_type,
c.start_date,
c.end_date,
DATEDIFF(c.end_date, NOW()) AS days_to_expiry,
c.annual_cost
FROM contracts c
WHERE c.end_date > NOW()
ORDER BY c.end_date;
```
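To check the rack utilization arithmetic without a live asset database, the rack query can be tried against an in-memory SQLite table. SQLite here is only a stand-in for MySQL/MariaDB. The column names come from the query above, while the column types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE racks (
           rack_id TEXT PRIMARY KEY,
           location TEXT,
           total_units INTEGER,
           occupied_units INTEGER,
           max_power_kw REAL
       )"""
)
# Hypothetical sample rows
conn.executemany(
    "INSERT INTO racks VALUES (?, ?, ?, ?, ?)",
    [("R01", "DC1-RowA", 42, 21, 8.0), ("R02", "DC1-RowA", 42, 40, 8.0)],
)
rows = conn.execute(
    """SELECT rack_id,
              (total_units - occupied_units) AS available_units,
              ROUND(occupied_units * 100.0 / total_units, 2) AS utilization_percent
       FROM racks
       ORDER BY location, rack_id"""
).fetchall()
for row in rows:
    print(row)
conn.close()
```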
### 8.2 Database Server Queries
```sql
-- MySQL - Database sizes
SELECT
table_schema AS 'Database',
ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 2) AS 'Size_GB'
FROM information_schema.tables
GROUP BY table_schema
ORDER BY SUM(data_length + index_length) DESC;
-- PostgreSQL - Database sizes
SELECT
datname AS database_name,
pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;
-- SQL Server - Database sizes (data files only, summed per database)
SELECT
DB_NAME(database_id) AS DatabaseName,
SUM(CAST(size AS bigint)) * 8.0 / 1024 AS SizeMB
FROM sys.master_files
WHERE type = 0
GROUP BY database_id
ORDER BY SizeMB DESC;
```
---
## 9. Cloud Provider APIs
### 9.1 AWS (Boto3)
```python
import boto3
# EC2 instances
ec2 = boto3.client('ec2')
instances = ec2.describe_instances()
# S3 buckets
s3 = boto3.client('s3')
buckets = s3.list_buckets()
# RDS databases
rds = boto3.client('rds')
databases = rds.describe_db_instances()
# Cost Explorer (the End date is exclusive, so this covers all of January 2024)
ce = boto3.client('ce')
cost = ce.get_cost_and_usage(
    TimePeriod={'Start': '2024-01-01', 'End': '2024-02-01'},
    Granularity='MONTHLY',
    Metrics=['UnblendedCost']
)
```
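`describe_instances` nests instances inside reservations, so the response usually needs flattening before it can be documented. A minimal sketch, operating on a plain dict shaped like the Boto3 response, follows; the function name and sample data are illustrative. For large accounts, `ec2.get_paginator('describe_instances')` would feed this function one page at a time.

```python
def flatten_instances(response):
    """Flatten a describe_instances response into one dict per instance."""
    rows = []
    for reservation in response.get("Reservations", []):
        for inst in reservation.get("Instances", []):
            tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
            rows.append({
                "id": inst["InstanceId"],
                "type": inst.get("InstanceType"),
                "state": inst.get("State", {}).get("Name"),
                "name": tags.get("Name"),  # None when the Name tag is absent
            })
    return rows


# Hypothetical response fragment
sample_response = {"Reservations": [
    {"Instances": [
        {"InstanceId": "i-0aaa", "InstanceType": "t3.micro",
         "State": {"Name": "running"},
         "Tags": [{"Key": "Name", "Value": "web01"}]},
    ]},
    {"Instances": [
        {"InstanceId": "i-0bbb", "InstanceType": "m5.large",
         "State": {"Name": "stopped"}},
    ]},
]}
instances = flatten_instances(sample_response)
for row in instances:
    print(row)
```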
### 9.2 Azure (SDK)
```python
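import os
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.storage import StorageManagementClient
credential = DefaultAzureCredential()
# Subscription ID read from the environment (the variable name is an assumption)
subscription_id = os.environ["AZURE_SUBSCRIPTION_ID"]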
# VMs
compute_client = ComputeManagementClient(credential, subscription_id)
vms = compute_client.virtual_machines.list_all()
# Storage accounts
storage_client = StorageManagementClient(credential, subscription_id)
storage_accounts = storage_client.storage_accounts.list()
```
---
## 10. SNMP OIDs Reference
### 10.1 Common System OIDs
```bash
# System description
.1.3.6.1.2.1.1.1.0 # sysDescr
# System uptime
.1.3.6.1.2.1.1.3.0 # sysUpTime
# System name
.1.3.6.1.2.1.1.5.0 # sysName
# System location
.1.3.6.1.2.1.1.6.0 # sysLocation
```
### 10.2 UPS OIDs (RFC 1628)
```bash
# UPS identity
.1.3.6.1.2.1.33.1.1.1.0 # upsIdentManufacturer
.1.3.6.1.2.1.33.1.1.2.0 # upsIdentModel
# Battery status
.1.3.6.1.2.1.33.1.2.1.0 # upsBatteryStatus
.1.3.6.1.2.1.33.1.2.2.0 # upsSecondsOnBattery
.1.3.6.1.2.1.33.1.2.3.0 # upsEstimatedMinutesRemaining
# Input
.1.3.6.1.2.1.33.1.3.3.1.3 # upsInputVoltage
.1.3.6.1.2.1.33.1.3.3.1.4 # upsInputCurrent
.1.3.6.1.2.1.33.1.3.3.1.5 # upsInputTruePower
# Output
.1.3.6.1.2.1.33.1.4.4.1.2 # upsOutputVoltage
.1.3.6.1.2.1.33.1.4.4.1.3 # upsOutputCurrent
.1.3.6.1.2.1.33.1.4.4.1.4 # upsOutputPower
.1.3.6.1.2.1.33.1.4.4.1.5 # upsOutputPercentLoad
```
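The input and output OIDs above are table columns, so a GET request needs the line index appended (for example `.1` for the first input line). RFC 1628 also reports currents in tenths of an amp. The helpers below are a small sketch of both points; their names are assumptions.

```python
UPS_INPUT_VOLTAGE = "1.3.6.1.2.1.33.1.3.3.1.3"  # upsInputVoltage, RMS volts
UPS_INPUT_CURRENT = "1.3.6.1.2.1.33.1.3.3.1.4"  # upsInputCurrent, 0.1 RMS amps


def column_instance(column_oid, line_index):
    """Return the OID of one table cell: column OID plus the line index."""
    if line_index < 1:
        raise ValueError("UPS line indexes start at 1")
    return f"{column_oid}.{line_index}"


def current_in_amps(raw_tenths):
    """Convert a raw current reading (0.1 A units) to amps."""
    return raw_tenths / 10


voltage_oid = column_instance(UPS_INPUT_VOLTAGE, 1)
amps = current_in_amps(52)  # hypothetical raw reading
print(voltage_oid, amps)
```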
### 10.3 Network Interface OIDs
```bash
# Interface description
.1.3.6.1.2.1.2.2.1.2 # ifDescr
# Interface status
.1.3.6.1.2.1.2.2.1.8 # ifOperStatus
# Interface traffic
.1.3.6.1.2.1.2.2.1.10 # ifInOctets
.1.3.6.1.2.1.2.2.1.16 # ifOutOctets
# Interface errors
.1.3.6.1.2.1.2.2.1.14 # ifInErrors
.1.3.6.1.2.1.2.2.1.20 # ifOutErrors
```
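`ifInOctets` and `ifOutOctets` are 32-bit counters, so throughput comes from the difference between two polls, and the counter can wrap between them. The sketch below handles a single wrap with modular arithmetic; the function name is an assumption. On fast links the 64-bit `ifHCInOctets`/`ifHCOutOctets` counters are usually a better choice, because a 32-bit counter can wrap more than once per polling interval.

```python
COUNTER32_MODULUS = 2 ** 32


def octets_to_bps(prev_octets, curr_octets, interval_seconds,
                  modulus=COUNTER32_MODULUS):
    """Average bits per second between two counter samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Modular subtraction yields the right delta across a single wrap
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_seconds


normal = octets_to_bps(1_000, 2_000, 10)
wrapped = octets_to_bps(2 ** 32 - 500, 500, 10)
print(normal, wrapped)
```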
---
## 11. Example Collection Script
### 11.1 Complete Data Collection
```bash
#!/bin/bash
# collect_all_data.sh - Orchestrate all data collection
set -u  # treat unset variables as errors
OUTPUT_DIR="/tmp/datacenter-collection-$(date +%Y%m%d_%H%M%S)"
mkdir -p "$OUTPUT_DIR"
echo "Starting datacenter data collection..."
# VMware
echo "Collecting VMware data..."
python3 collect_vmware.py > "$OUTPUT_DIR/vmware.json"
# Network devices
echo "Collecting network configurations..."
./collect_network.sh > "$OUTPUT_DIR/network.json"
# Storage
echo "Collecting storage data..."
python3 collect_storage.py > "$OUTPUT_DIR/storage.json"
# Monitoring
echo "Collecting monitoring data..."
./collect_monitoring.sh > "$OUTPUT_DIR/monitoring.json"
# Databases (batch-mode mysql output is tab-separated, hence .tsv)
# Note: a password on the command line is visible in the process list; prefer an option file
echo "Querying databases..."
mysql -h db.local -u reader -pPASS asset_db < queries.sql > "$OUTPUT_DIR/asset_db.tsv"
# SNMP devices
echo "Polling SNMP devices..."
./poll_snmp.sh > "$OUTPUT_DIR/snmp.json"
echo "Collection complete. Data saved to: $OUTPUT_DIR"
# Archive with a relative path so the tarball does not embed /tmp
tar -czf "$OUTPUT_DIR.tar.gz" -C "$(dirname "$OUTPUT_DIR")" "$(basename "$OUTPUT_DIR")"
```
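The shell script above does not record which collectors failed. One possible Python variant keeps an exit code per collector so that a partial run can be reported. The function name, the collector mapping and the demo commands are all hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_collectors(collectors, output_dir):
    """Run each collector command, saving stdout to <name>.json.

    Returns a dict mapping collector name to its exit code.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for name, command in collectors.items():
        with (output_dir / f"{name}.json").open("wb") as out:
            proc = subprocess.run(command, stdout=out, stderr=subprocess.DEVNULL)
        results[name] = proc.returncode
    return results


# Demo with stand-in commands instead of the real collector scripts
demo_collectors = {
    "ok": [sys.executable, "-c", "print('{}')"],
    "broken": [sys.executable, "-c", "import sys; sys.exit(3)"],
}
with tempfile.TemporaryDirectory() as tmp:
    results = run_collectors(demo_collectors, tmp)
    ok_output = (Path(tmp) / "ok.json").read_text().strip()
failed = [name for name, code in results.items() if code != 0]
print(results, failed)
```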
---
## 12. Rate Limiting Reference
### 12.1 Vendor Rate Limits
| Vendor | Endpoint | Limit | Time Window |
|--------|----------|-------|-------------|
| VMware vCenter | REST API | 100 req | per minute |
| Zabbix | API | 300 req | per minute |
| Pure Storage | REST API | 60 req | per minute |
| Cisco DNA Center | API | 10 req | per second |
| AWS | API (varies) | 10-100 req | per second |
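
Limits like those in the table can be respected on the client side by spacing requests evenly. The throttle below is a minimal sketch: the class name is an assumption, and the demo takes the table's 60 requests per minute as given. The clock and sleep functions are injectable so the behaviour can be checked without waiting.

```python
import time


class MinIntervalThrottle:
    """Space calls so that at most max_requests happen per per_seconds."""

    def __init__(self, max_requests, per_seconds,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = per_seconds / max_requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next call is allowed; return the delay applied."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay


class FakeClock:
    """Deterministic clock for the demo: sleeping just advances time."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


fake = FakeClock()
throttle = MinIntervalThrottle(60, 60, clock=fake.time, sleep=fake.sleep)
delays = [throttle.wait() for _ in range(3)]
print(delays)
```

In real use the defaults (`time.monotonic` and `time.sleep`) apply, and `throttle.wait()` is called immediately before each API request.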
### 12.2 Retry Strategy
```python
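import logging
import time
from functools import wraps

logger = logging.getLogger(__name__)

class RateLimitException(Exception):
    """Raised by a client call when the API signals rate limiting (e.g. HTTP 429)."""

def rate_limited_retry(max_retries=3, backoff_factor=2):
    """Retry the wrapped call with exponential backoff when it is rate limited."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except RateLimitException:
                    # Out of attempts: propagate the rate-limit error
                    if attempt == max_retries - 1:
                        raise
                    wait_time = backoff_factor ** attempt
                    logger.warning(f"Rate limited. Waiting {wait_time}s before retry {attempt+1}/{max_retries}")
                    time.sleep(wait_time)
                except Exception as e:
                    # Any other error is not retried
                    logger.error(f"Error: {e}")
                    raise
        return wrapper
    return decorator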
```
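The waits produced by the decorator above follow `backoff_factor ** attempt`, with no sleep after the final attempt. The small sketch below makes that schedule explicit and adds an optional cap. The cap is an addition for illustration, not something the decorator implements.

```python
def backoff_schedule(max_retries=3, backoff_factor=2, cap=None):
    """List the sleep durations between attempts, in seconds."""
    waits = []
    # The last attempt re-raises instead of sleeping, hence max_retries - 1
    for attempt in range(max_retries - 1):
        wait = backoff_factor ** attempt
        if cap is not None:
            wait = min(wait, cap)
        waits.append(wait)
    return waits


default_waits = backoff_schedule()
capped_waits = backoff_schedule(max_retries=6, backoff_factor=2, cap=10)
print(default_waits, capped_waits)
```

With the defaults, a call that is rate limited on every attempt sleeps for 1 s and then 2 s before the third failure is raised.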
---
**Document Version**: 1.0
**Last Updated**: 2025-01-XX
**Maintainer**: Automation Team