# Technical Requirements for the LLM - Datacenter Documentation Generation

## 1. Capabilities Required of the LLM

### 1.1 Core Capabilities

- **Network Access**: SSH, HTTPS, and SNMP connections
- **API Interaction**: REST, SOAP, GraphQL
- **Code Execution**: Python, Bash, PowerShell
- **File Operations**: Reading and writing Markdown files
- **Database Access**: MySQL, PostgreSQL, SQL Server

### 1.2 Required Python Libraries

```bash
# Networking and protocols
pip install paramiko            # SSH connections
pip install pysnmp              # SNMP queries
pip install requests            # HTTP/REST APIs
pip install netmiko             # Network device automation

# Virtualization
pip install pyvmomi             # VMware vSphere API
pip install proxmoxer           # Proxmox API
pip install libvirt-python      # KVM/QEMU

# Storage
pip install pure-storage        # Pure Storage API (verify the exact package name on PyPI)
pip install netapp-ontap        # NetApp API

# Database
pip install mysql-connector-python
pip install psycopg2            # PostgreSQL
pip install pymssql             # Microsoft SQL Server

# Monitoring
pip install zabbix-api          # Zabbix
pip install prometheus-client   # Prometheus

# Cloud providers
pip install boto3               # AWS
pip install azure-mgmt          # Azure (deprecated meta-package; prefer service-specific azure-mgmt-* packages)
pip install google-cloud        # GCP (deprecated meta-package; prefer service-specific google-cloud-* packages)

# Utilities
pip install jinja2              # Template rendering
pip install pyyaml              # YAML parsing
pip install pandas              # Data analysis
pip install markdown            # Markdown generation
```

### 1.3 Required CLI Tools

```bash
# Network tools
apt-get install snmp snmp-mibs-downloader
apt-get install nmap
apt-get install netcat-openbsd

# Virtualization
apt-get install open-vm-tools   # VMware

# Monitoring
apt-get install nagios-plugins

# Storage
apt-get install nfs-common
apt-get install cifs-utils
apt-get install multipath-tools

# Database clients
apt-get install mysql-client
apt-get install postgresql-client
```

---

## 2. Required Access and Credentials

### 2.1 Credential Format

Credentials must be supplied in a secured file (vault-managed or encrypted):

```yaml
# credentials.yaml (encrypted)
datacenter:
  # Network devices
  network:
    cisco_switches:
      username: admin
      password: ${ENCRYPTED}
      enable_password: ${ENCRYPTED}
    firewalls:
      api_key: ${ENCRYPTED}

  # Virtualization
  vmware:
    vcenter_host: vcenter.domain.local
    username: automation@vsphere.local
    password: ${ENCRYPTED}
  proxmox:
    host: proxmox.domain.local
    token_name: automation
    token_value: ${ENCRYPTED}

  # Storage
  storage_arrays:
    - name: SAN-01
      type: pure_storage
      api_token: ${ENCRYPTED}

  # Databases
  databases:
    asset_management:
      host: db.domain.local
      port: 3306
      username: readonly_user
      password: ${ENCRYPTED}
      database: asset_db

  # Monitoring
  monitoring:
    zabbix:
      url: https://zabbix.domain.local
      api_token: ${ENCRYPTED}

  # Backup
  backup:
    veeam:
      server: veeam.domain.local
      username: automation
      password: ${ENCRYPTED}
```

### 2.2 Minimum Required Permissions

**IMPORTANT**: ALWAYS use least-privilege accounts (read-only wherever possible).

| System | Account Type | Required Permissions |
|--------|--------------|----------------------|
| Network Devices | Read-only | show commands, SNMP read |
| VMware vCenter | Read-only | Global > Read-only role |
| Storage Arrays | Read-only | Monitoring/reporting access |
| Databases | SELECT only | Read access to the asset schema |
| Monitoring | Read-only | View dashboards, metrics |
| Backup Software | Read-only | View jobs, reports |

---

## 3. Network Connectivity

### 3.1 Network Requirements

```
The LLM host must be able to reach:

Management Network:
- VLAN 10: 10.0.10.0/24 (Infrastructure Management)
- VLAN 20: 10.0.20.0/24 (Server Management)
- VLAN 30: 10.0.30.0/24 (Storage Management)

Required ports:
- TCP 22 (SSH)
- TCP 443 (HTTPS)
- TCP 3306 (MySQL)
- TCP 5432 (PostgreSQL)
- TCP 1433 (MS SQL Server)
- UDP 161 (SNMP)
- TCP 8006 (Proxmox)
```

### 3.2 Firewall Rules

```
# Allow LLM host to management networks
Source: [LLM_HOST_IP]
Destination: Management Networks
Protocol: SSH, HTTPS, SNMP, Database ports
Action: ALLOW

# Deny all other traffic from LLM host
Source: [LLM_HOST_IP]
Destination: Production Networks
Action: DENY
```

---

## 4. Rate Limiting and Best Practices

### 4.1 API Call Limits

```python
import time
from functools import wraps

# Respect the vendors' rate limits
RATE_LIMITS = {
    'vmware_vcenter': {'calls_per_minute': 100},
    'network_devices': {'calls_per_minute': 10},
    'storage_api': {'calls_per_minute': 60},
    'monitoring_api': {'calls_per_minute': 300}
}


# Implement retry logic with exponential backoff
def retry_with_backoff(max_retries=3, base_delay=1):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # Last attempt: propagate the error to the caller
                    if attempt == max_retries - 1:
                        raise
                    # Wait 1x, 2x, 4x, ... base_delay between attempts
                    delay = base_delay * (2 ** attempt)
                    time.sleep(delay)
        return wrapper
    return decorator
```

### 4.2 Concurrent Operations

```python
# Limit concurrent operations
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 5  # Do not saturate resources

# `query_device` and `devices` are assumed to be defined elsewhere
with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
    futures = [executor.submit(query_device, device) for device in devices]
    results = [f.result() for f in futures]
```

---

## 5. Error Handling and Logging

### 5.1 Logging Configuration

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('/var/log/datacenter-docs/generation.log'),
        logging.StreamHandler()
    ]
)

logger = logging.getLogger('datacenter-docs')
```

### 5.2 Error Handling Strategy

```python
class DataCollectionError(Exception):
    """Custom exception for data collection errors."""


class AuthenticationError(DataCollectionError):
    """Rejected credentials. Not a Python built-in, so it is defined here."""


try:
    data = collect_vmware_data()
except ConnectionError as e:
    logger.error(f"Cannot connect to vCenter: {e}")
    # Fall back to cached data if available
    data = load_cached_data('vmware')
except AuthenticationError as e:
    logger.critical(f"Authentication failed: {e}")
    # Alert the team; no data is available for this source
    send_alert("VMware auth failed")
    data = None
except Exception as e:
    logger.exception(f"Unexpected error: {e}")
    # Continue with partial data
    data = get_partial_data()
```

---

## 6. Caching and Performance

### 6.1 Cache Strategy

```python
import json

import redis

# Set up Redis for caching
cache = redis.Redis(host='localhost', port=6379, db=0)


def get_cached_or_fetch(key, fetch_function, ttl=3600):
    """Get from cache or fetch if not available"""
    cached = cache.get(key)
    if cached is not None:
        logger.info(f"Cache hit for {key}")
        return json.loads(cached)

    logger.info(f"Cache miss for {key}, fetching...")
    data = fetch_function()
    cache.setex(key, ttl, json.dumps(data))
    return data


# Usage example
vmware_inventory = get_cached_or_fetch(
    'vmware_inventory',
    lambda: collect_vmware_inventory(),
    ttl=6 * 3600  # 6 hours, matching the inventory TTL in section 6.2
)
```

### 6.2 Data to Cache

- **1 hour**: Performance metrics, real-time status
- **6 hours**: Inventory, configurations
- **24 hours**: Asset database, ownership info
- **7 days**: Historical trends, capacity planning

---

## 7. Execution Schedule

### 7.1 Recommended Cron Schedule

```cron
# Full documentation update - every 6 hours
0 */6 * * * /usr/local/bin/generate-datacenter-docs.sh --full

# Quick update (metrics only) - every hour, offset by 30 minutes because
# the wrapper's lock file rejects a run that starts while another is active
30 * * * * /usr/local/bin/generate-datacenter-docs.sh --metrics-only

# Weekly comprehensive report - Sunday night
0 2 * * 0 /usr/local/bin/generate-datacenter-docs.sh --full --detailed
```

### 7.2 Example Wrapper Script

```bash
#!/bin/bash
# generate-datacenter-docs.sh

set -e
set -o pipefail  # otherwise `tee` hides a failure of the Python script

LOGFILE="/var/log/datacenter-docs/$(date +%Y%m%d_%H%M%S).log"
LOCKFILE="/var/run/datacenter-docs.lock"

# Prevent concurrent executions
if [ -f "$LOCKFILE" ]; then
    echo "Another instance is running. Exiting."
    exit 1
fi

touch "$LOCKFILE"
# Single quotes defer expansion until the trap fires
trap 'rm -f "$LOCKFILE"' EXIT

# Activate virtual environment
source /opt/datacenter-docs/venv/bin/activate

# Run Python script with parameters
python3 /opt/datacenter-docs/main.py "$@" 2>&1 | tee -a "$LOGFILE"

# Cleanup old logs (keep 30 days)
find /var/log/datacenter-docs/ -name "*.log" -mtime +30 -delete
```

---

## 8. Output and Validation

### 8.1 Post-Generation Checks

```python
import os
import re

# Patterns for unreplaced placeholders. Kept narrow so that ordinary
# Markdown links, checkboxes, and code braces are not flagged.
PLACEHOLDER_PATTERNS = [
    r'\[[A-Z][A-Z0-9_]+\]',   # e.g. [LLM_HOST_IP]
    r'\$\{[A-Za-z0-9_]+\}',   # e.g. ${ENCRYPTED}
    r'\{\{.*?\}\}',           # unrendered Jinja2 expression
    r'\bTODO\b',
    r'\bFIXME\b',
]


def validate_documentation(section_file):
    """Validate the generated document."""
    # Check existence first: os.path.getsize() raises on a missing file
    if not os.path.exists(section_file):
        logger.error(f"✗ {section_file} validation failed: ['file_exists']")
        return False

    # validate_markdown_syntax and count_tokens are assumed to be defined elsewhere
    checks = {
        'not_empty': os.path.getsize(section_file) > 0,
        'valid_markdown': validate_markdown_syntax(section_file),
        'no_placeholders': not contains_placeholders(section_file),
        'token_limit': count_tokens(section_file) < 50000
    }

    if all(checks.values()):
        logger.info(f"✓ {section_file} validation passed")
        return True

    failed = [k for k, v in checks.items() if not v]
    logger.error(f"✗ {section_file} validation failed: {failed}")
    return False


def contains_placeholders(file_path):
    """Check for placeholders that were never replaced."""
    with open(file_path, 'r', encoding='utf-8') as f:
        content = f.read()
    return any(re.search(p, content) for p in PLACEHOLDER_PATTERNS)
```

### 8.2 Notification System

```python
from datetime import datetime


def send_completion_notification(success, sections_updated, errors):
    """Send a notification when generation finishes."""
    # Built outside the f-string: a backslash inside an f-string
    # expression is a syntax error before Python 3.12
    error_details = 'Errors:\n' + '\n'.join(errors) if errors else ''

    message = f"""
    Datacenter Documentation Update

    Status: {'✓ SUCCESS' if success else '✗ FAILED'}
    Sections Updated: {', '.join(sections_updated)}
    Errors: {len(errors)}

    {error_details}

    Timestamp: {datetime.now().isoformat()}
    """

    # Send via multiple channels
    send_email(recipients=['ops-team@company.com'], subject='Doc Update', body=message)
    send_slack(channel='#datacenter-ops', message=message)
    # send_teams / send_webhook as needed
```

---

## 9. Security Considerations

### 9.1 Secrets Management

```python
# NEVER store credentials in plain text; always use a vault
import os

import keyring


def get_credential(service, account):
    """Retrieve credential from OS keyring"""
    return keyring.get_password(service, account)


# Alternatively, HashiCorp Vault
import hvac

# AppRole identifiers come from the environment (variable names are illustrative)
ROLE_ID = os.environ['VAULT_ROLE_ID']
SECRET_ID = os.environ['VAULT_SECRET_ID']

client = hvac.Client(url='https://vault.company.com')
client.auth.approle.login(role_id=ROLE_ID, secret_id=SECRET_ID)
response = client.secrets.kv.v2.read_secret_version(path='datacenter/creds')
credentials = response['data']['data']  # KV v2 nests the secret under data.data
```

### 9.2 Audit Trail

```python
from datetime import datetime

# Log ALL operations for auditing
# (sections_updated, list_of_systems, elapsed_time, and success come from the run)
audit_log = {
    'timestamp': datetime.now().isoformat(),
    'user': 'automation-account',
    'action': 'documentation_generation',
    'sections': sections_updated,
    'systems_accessed': list_of_systems,
    'duration': elapsed_time,
    'success': success  # bool; the literal True/False would raise ZeroDivisionError
}

write_audit_log(audit_log)
```

---

## 10. Troubleshooting

### 10.1 Common Issues

| Problem | Likely Cause | Solution |
|---------|--------------|----------|
| Connection Timeout | Firewall/network | Verify connectivity and firewall rules |
| Authentication Failed | Wrong or expired credentials | Rotate credentials, check the vault |
| API Rate Limit | Too many requests | Implement backoff, reduce request frequency |
| Incomplete Data | Source temporarily down | Use cached data, generate a partial document |
| Token Limit Exceeded | Too much data in a section | Remove historical data, optimize the format |

### 10.2 Debug Mode

```python
import json
import logging
import os

# Enable debug output for troubleshooting
DEBUG = os.getenv('DEBUG', 'False').lower() == 'true'

if DEBUG:
    logging.getLogger().setLevel(logging.DEBUG)
    # Save raw responses for analysis
    # (`timestamp` and `raw_response` come from the surrounding collection code)
    with open(f'debug_{timestamp}.json', 'w') as f:
        json.dump(raw_response, f, indent=2)
```

---

## 11. Testing

### 11.1 Unit Tests

```python
import unittest


class TestDataCollection(unittest.TestCase):
    def test_vmware_connection(self):
        """Test the connection to vCenter"""
        # Calls the module-level helper of the same name, defined elsewhere
        result = test_vmware_connection()
        self.assertTrue(result.success)

    def test_data_validation(self):
        """Test validation of the collected data"""
        sample_data = load_sample_data()
        self.assertTrue(validate_data_structure(sample_data))
```

### 11.2 Integration Tests

```bash
# End-to-end test in the test environment
./run-tests.sh --integration --environment=test

# Verify that all systems are reachable
./check-connectivity.sh

# Dry run without saving
python3 main.py --dry-run --verbose
```

---

## Pre-Deployment Checklist

Before putting the system into production:

- [ ] All libraries installed
- [ ] Credentials configured in a secure vault
- [ ] Connectivity verified to all systems
- [ ] Automation account permissions validated (read-only)
- [ ] Firewall rules approved and configured
- [ ] Logging configured and tested
- [ ] Notification system tested
- [ ] Cron jobs configured
- [ ] Existing documentation backed up
- [ ] Operational runbook completed
- [ ] Escalation path defined
- [ ] DR procedures documented

---

**Document Version**: 1.0
**Last Updated**: 2025-01-XX
**Owner**: Automation Team
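
---

## Appendix: Enforcing the Call Limits from Section 4.1

Section 4.1 declares `RATE_LIMITS` but only shows how to retry after a failure; nothing there keeps the caller under the limit in the first place. The block below is a minimal sketch of one possible approach, a sliding 60-second window per system, and not a vendor-recommended implementation. `SlidingWindowLimiter`, `LIMITERS`, and `rate_limited_call` are hypothetical names introduced for illustration, and the `RATE_LIMITS` values are copied from section 4.1. A lock is included because section 4.2 queries devices from a thread pool.

```python
import threading
import time
from collections import deque

# Per-minute limits copied from section 4.1
RATE_LIMITS = {
    'vmware_vcenter': {'calls_per_minute': 100},
    'network_devices': {'calls_per_minute': 10},
    'storage_api': {'calls_per_minute': 60},
    'monitoring_api': {'calls_per_minute': 300},
}


class SlidingWindowLimiter:
    """Allow at most `calls_per_minute` calls in any `window` seconds.

    Hypothetical helper, not part of any vendor SDK. `clock` and `sleep`
    are injectable so the class can be tested without real waiting.
    """

    def __init__(self, calls_per_minute, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = calls_per_minute
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()          # timestamps still inside the window
        self._lock = threading.Lock()  # serializes callers from a thread pool

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        with self._lock:
            while True:
                now = self._clock()
                # Forget calls that have left the window
                while self._calls and now - self._calls[0] >= self.window:
                    self._calls.popleft()
                if len(self._calls) < self.limit:
                    self._calls.append(now)
                    return
                # Window is full: wait until the oldest call expires
                self._sleep(self.window - (now - self._calls[0]))


# One limiter per system listed in RATE_LIMITS
LIMITERS = {name: SlidingWindowLimiter(cfg['calls_per_minute'])
            for name, cfg in RATE_LIMITS.items()}


def rate_limited_call(system, func, *args, **kwargs):
    """Run `func` once a slot is free under `system`'s limit."""
    LIMITERS[system].acquire()
    return func(*args, **kwargs)
```

A collector would then wrap each outbound request, for example `rate_limited_call('network_devices', query_device, device)`, and could still be decorated with `retry_with_backoff` for transient failures. Holding the lock while sleeping keeps the sketch short at the cost of making waiting threads queue one behind another; a production version might prefer a token bucket or a limiter shared across processes.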