# πŸ“š Automated Infrastructure Documentation System

Automated system for generating and maintaining the technical documentation of the corporate infrastructure, using a local LLM with human validation and GitOps publishing.

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![Kafka](https://img.shields.io/badge/Kafka-3.6+-red.svg)](https://kafka.apache.org/)

## πŸ“‹ Table of Contents

- [Overview](#overview)
- [Architecture](#architecture)
- [Architecture Diagram](#architecture-diagram)
- [Technical Diagram](#technical-diagram)
- [Contacts](#contacts)

## 🎯 Overview

The system is designed to **automate the creation and updating of technical documentation** for complex infrastructure systems (VMware, Kubernetes, Linux, Cisco, etc.) using a local Large Language Model (Qwen).

### Key Features

- βœ… **Asynchronous data collection** from multiple infrastructure systems
- βœ… **Security isolation**: the LLM never accesses live systems
- βœ… **Event-driven architecture** built on Apache Kafka
- βœ… **Local on-premise LLM** (Qwen) accessed through an MCP Server
- βœ… **Human-in-the-loop validation** with a GitOps workflow
- βœ… **Automated CI/CD** for publishing

## πŸ—οΈ Architecture

The system is organized into **3 main flows**:

1. **Data Collection (Background)**: connectors periodically poll the infrastructure systems through their APIs and publish the data to Kafka (see the sketch below)
2. **Documentation Generation (On-Demand)**: the local LLM (Qwen) generates Markdown by querying Kafka/Redis through the MCP Server
3. **Validation and Publishing (GitOps)**: human review on a Pull Request and automatic deployment via CI/CD

> **Security Principle**: the LLM never has direct access to the infrastructure systems. All data flows through Kafka/Redis.
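The snippet below is a minimal sketch of flow 1, assuming the `kafka-python` producer library named in the technical diagram; the broker address and the `collect_vm_inventory()` helper are placeholders for the real collector logic.

```python
"""Sketch of one background collector run: poll an infrastructure API, publish to Kafka."""
import json
import time

from kafka import KafkaProducer  # kafka-python

TOPIC = "vmware.inventory"  # topic name from the technical diagram


def collect_vm_inventory() -> list[dict]:
    """Placeholder for the real vCenter query (e.g. via pyvmomi)."""
    return [{"vm": "example-vm", "cpu": 4, "memory_gb": 16}]


def main() -> None:
    producer = KafkaProducer(
        bootstrap_servers=["kafka-1:9092"],  # placeholder broker address
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    for record in collect_vm_inventory():
        # One JSON event per VM; the */15 cron schedule in the diagram drives each run.
        producer.send(TOPIC, value={"timestamp": time.time(), "data": record})
    producer.flush()


if __name__ == "__main__":
    main()
```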
---

## πŸ“Š Architecture Diagram

### Management View

Simplified view for executive and management presentations.

```mermaid
graph TB
    %% Styling
    classDef infrastructure fill:#e1f5ff,stroke:#01579b,stroke-width:3px,color:#333
    classDef kafka fill:#fff3e0,stroke:#e65100,stroke-width:3px,color:#333
    classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:3px,color:#333
    classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:3px,color:#333
    classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:3px,color:#333
    classDef human fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#333

    %% ========================================
    %% FLOW 1: DATA COLLECTION (Background)
    %% ========================================
    INFRA[("🏒 INFRASTRUCTURE<br/>SYSTEMS<br/><br/>VMware | K8s | Linux | Cisco")]:::infrastructure
    CONN["πŸ”Œ CONNECTORS<br/>Automatic Polling"]:::infrastructure
    KAFKA[("πŸ“¨ APACHE KAFKA<br/>Message Broker<br/>+ Persistence")]:::kafka
    CONSUMER["βš™οΈ KAFKA CONSUMER<br/>Processor Service"]:::kafka
    REDIS[("πŸ’Ύ REDIS CACHE<br/>(Optional)<br/>Performance Layer")]:::cache

    INFRA -->|"Continuous<br/>API Polling"| CONN
    CONN -->|"Publish<br/>Events"| KAFKA
    KAFKA -->|"Consume<br/>Stream"| CONSUMER
    CONSUMER -.->|"Optional<br/>Update"| REDIS

    %% ========================================
    %% FLOW 2: DOCUMENTATION GENERATION
    %% ========================================
    USER["πŸ‘€ USER<br/>Doc Request"]:::human
    LLM["πŸ€– LLM ENGINE<br/>Qwen (local)"]:::llm
    MCP["πŸ”§ MCP SERVER<br/>API Control Platform"]:::llm
    DOC["πŸ“„ DOCUMENT<br/>Generated Markdown"]:::llm

    USER -->|"1. Prompt"| LLM
    LLM -->|"2. Tool Call"| MCP
    MCP -->|"3a. Query"| KAFKA
    MCP -.->|"3b. Fast<br/>Query"| REDIS
    KAFKA -->|"4a. Data"| MCP
    REDIS -.->|"4b. Data"| MCP
    MCP -->|"5. Context"| LLM
    LLM -->|"6. Generate"| DOC

    %% ========================================
    %% FLOW 3: VALIDATION AND PUBLISHING
    %% ========================================
    GIT["πŸ“¦ GITLAB<br/>Repository"]:::git
    PR["πŸ”€ PULL REQUEST<br/>Automated Review"]:::git
    TECH["πŸ‘¨β€πŸ’Ό TECHNICAL TEAM<br/>Human Validation"]:::human
    PIPELINE["⚑ CI/CD PIPELINE<br/>GitLab Runner"]:::git
    MKDOCS["πŸ“š MKDOCS<br/>Static Site Generator"]:::git
    WEB["🌐 DOCUMENTATION<br/>GitLab Pages<br/>(Published)"]:::git

    DOC -->|"Push +<br/>Branch"| GIT
    GIT -->|"Create"| PR
    PR -->|"Notify"| TECH
    TECH -->|"Approve +<br/>Merge"| GIT
    GIT -->|"Trigger"| PIPELINE
    PIPELINE -->|"Build"| MKDOCS
    MKDOCS -->|"Deploy"| WEB

    %% ========================================
    %% SECURITY ANNOTATIONS
    %% ========================================
    SECURITY["πŸ”’ SECURITY<br/>LLM isolated from live systems"]:::human
    PERF["⚑ PERFORMANCE<br/>Optional Redis cache"]:::cache

    LLM -.->|"NO<br/>ACCESS"| INFRA
    SECURITY -.-> LLM
    PERF -.-> REDIS
```
---

## πŸ”§ Technical Diagram

### Implementation View

Detailed view for the technical team, including implementation specifics.

```mermaid
graph TB
    %% Technical styling
    classDef infra fill:#e1f5ff,stroke:#01579b,stroke-width:2px,color:#333,font-size:11px
    classDef connector fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#333,font-size:11px
    classDef kafka fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#333,font-size:11px
    classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#333,font-size:11px
    classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#333,font-size:11px
    classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:2px,color:#333,font-size:11px
    classDef monitor fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px

    %% =====================================
    %% LAYER 1: SOURCE SYSTEMS
    %% =====================================
    subgraph SOURCES["🏒 INFRASTRUCTURE SOURCES"]
        VCENTER["VMware vCenter<br/>API: vSphere REST 7.0+<br/>Port: 443/HTTPS<br/>Auth: API Token"]:::infra
        K8S_API["Kubernetes API<br/>API: v1.28+<br/>Port: 6443/HTTPS<br/>Auth: ServiceAccount + RBAC"]:::infra
        LINUX["Linux Servers<br/>Protocol: SSH/Ansible<br/>Port: 22<br/>Auth: SSH Keys"]:::infra
        CISCO["Cisco Devices<br/>Protocol: NETCONF/RESTCONF<br/>Port: 830/443<br/>Auth: AAA"]:::infra
    end

    %% =====================================
    %% LAYER 2: CONNECTORS
    %% =====================================
    subgraph CONNECTORS["πŸ”Œ DATA COLLECTORS (Python/Go)"]
        CONN_VM["VMware Collector<br/>Lang: Python 3.11<br/>Lib: pyvmomi<br/>Schedule: */15 * * * *<br/>Output: JSON"]:::connector
        CONN_K8S["K8s Collector<br/>Lang: Python 3.11<br/>Lib: kubernetes-client<br/>Schedule: */5 * * * *<br/>Resources: pods,svc,ing,deploy"]:::connector
        CONN_LNX["Linux Collector<br/>Lang: Python 3.11<br/>Lib: paramiko/ansible<br/>Schedule: */30 * * * *<br/>Data: sysinfo,packages,services"]:::connector
        CONN_CSC["Cisco Collector<br/>Lang: Python 3.11<br/>Lib: ncclient<br/>Schedule: */30 * * * *<br/>Data: interfaces,routing,vlans"]:::connector
    end

    VCENTER -->|"GET /api/vcenter/vm"| CONN_VM
    K8S_API -->|"kubectl proxy<br/>API calls"| CONN_K8S
    LINUX -->|"SSH batch<br/>commands"| CONN_LNX
    CISCO -->|"NETCONF<br/>get-config"| CONN_CSC

    %% =====================================
    %% LAYER 3: MESSAGE BROKER
    %% =====================================
    subgraph MESSAGING["πŸ“¨ KAFKA CLUSTER (3 brokers)"]
        KAFKA_TOPICS["Kafka Topics:<br/>β€’ vmware.inventory (P:6, R:3)<br/>β€’ k8s.resources (P:12, R:3)<br/>β€’ linux.systems (P:3, R:3)<br/>β€’ cisco.network (P:3, R:3)<br/>Retention: 7 days<br/>Format: JSON + Schema Registry"]:::kafka
        SCHEMA["Schema Registry<br/>Avro Schemas<br/>Versioning enabled<br/>Port: 8081"]:::kafka
    end

    CONN_VM -->|"Producer<br/>Batch 100 msg"| KAFKA_TOPICS
    CONN_K8S -->|"Producer<br/>Batch 100 msg"| KAFKA_TOPICS
    CONN_LNX -->|"Producer<br/>Batch 50 msg"| KAFKA_TOPICS
    CONN_CSC -->|"Producer<br/>Batch 50 msg"| KAFKA_TOPICS
    KAFKA_TOPICS <--> SCHEMA

    %% =====================================
    %% LAYER 4: PROCESSING & CACHE
    %% =====================================
    subgraph PROCESSING["βš™οΈ STREAM PROCESSING"]
        CONSUMER_GRP["Kafka Consumer Group<br/>Group ID: doc-consumers<br/>Lang: Python 3.11<br/>Lib: kafka-python<br/>Workers: 6<br/>Commit: auto (5s)"]:::kafka
        PROCESSOR["Data Processor<br/>β€’ Validation<br/>β€’ Transformation<br/>β€’ Enrichment<br/>β€’ Deduplication"]:::kafka
    end

    KAFKA_TOPICS -->|"Subscribe<br/>offset management"| CONSUMER_GRP
    CONSUMER_GRP --> PROCESSOR

    subgraph STORAGE["πŸ’Ύ CACHE LAYER (Optional)"]
        REDIS_CLUSTER["Redis Cluster<br/>Mode: Cluster (6 nodes)<br/>Port: 6379<br/>Persistence: RDB + AOF<br/>Memory: 64GB<br/>Eviction: allkeys-lru"]:::cache
        REDIS_KEYS["Key Structure:<br/>β€’ vmware:vcenter-id:vms<br/>β€’ k8s:cluster:namespace:resource<br/>β€’ linux:hostname:info<br/>β€’ cisco:device-id:config<br/>TTL: 1-24h based on type"]:::cache
    end

    PROCESSOR -.->|"SET/HSET<br/>Pipeline batch"| REDIS_CLUSTER
    REDIS_CLUSTER --> REDIS_KEYS

    %% =====================================
    %% LAYER 5: LLM & MCP
    %% =====================================
    subgraph LLM_LAYER["πŸ€– AI GENERATION LAYER"]
        LLM_ENGINE["LLM Engine<br/>Model: Qwen (local, on-premise)<br/>Temp: 0.3<br/>Max Tokens: 4096<br/>Timeout: 120s"]:::llm
        MCP_SERVER["MCP Server<br/>Lang: TypeScript/Node.js<br/>Port: 3000<br/>Protocol: JSON-RPC 2.0<br/>Auth: JWT tokens"]:::llm
        MCP_TOOLS["MCP Tools:<br/>β€’ getVMwareInventory(vcenter)<br/>β€’ getK8sResources(cluster,ns,type)<br/>β€’ getLinuxSystemInfo(hostname)<br/>β€’ getCiscoConfig(device,section)<br/>β€’ queryTimeRange(start,end)<br/>Return: JSON + Metadata"]:::llm
    end

    LLM_ENGINE <-->|"Tool calls<br/>JSON-RPC"| MCP_SERVER
    MCP_SERVER --> MCP_TOOLS
    MCP_TOOLS -->|"1. Query Kafka Consumer API<br/>GET /api/v1/data"| CONSUMER_GRP
    MCP_TOOLS -.->|"2. Fallback Redis<br/>MGET/HGETALL"| REDIS_CLUSTER
    CONSUMER_GRP -->|"JSON Response<br/>+ Timestamps"| MCP_TOOLS
    REDIS_CLUSTER -.->|"Cached JSON<br/>Fast response"| MCP_TOOLS
    MCP_TOOLS -->|"Structured Data<br/>+ Context"| LLM_ENGINE

    subgraph OUTPUT["πŸ“ DOCUMENT GENERATION"]
        TEMPLATE["Template Engine<br/>Format: Jinja2<br/>Templates: markdown/*.j2<br/>Variables: from LLM"]:::llm
        MARKDOWN["Markdown Output<br/>Format: CommonMark<br/>Metadata: YAML frontmatter<br/>Assets: diagrams in mermaid"]:::llm
        VALIDATOR["Doc Validator<br/>β€’ Markdown linting<br/>β€’ Link checking<br/>β€’ Schema validation"]:::llm
    end

    LLM_ENGINE --> TEMPLATE
    TEMPLATE --> MARKDOWN
    MARKDOWN --> VALIDATOR

    %% =====================================
    %% LAYER 6: GITOPS
    %% =====================================
    subgraph GITOPS["πŸ”„ GITOPS WORKFLOW"]
        GIT_REPO["GitLab Repository<br/>URL: gitlab.com/docs/infra<br/>Branch strategy: main + feature/*<br/>Protected: main (require approval)"]:::git
        GIT_API["GitLab API<br/>API: v4<br/>Auth: Project Access Token<br/>Permissions: api, write_repo"]:::git
        PR_AUTO["Automated PR Creator<br/>Lang: Python 3.11<br/>Lib: python-gitlab<br/>Template: .gitlab/merge_request.md"]:::git
    end

    VALIDATOR -->|"git add/commit/push"| GIT_REPO
    GIT_REPO <--> GIT_API
    GIT_API --> PR_AUTO

    REVIEWER["πŸ‘¨β€πŸ’Ό Technical Reviewer<br/>Role: Maintainer/Owner<br/>Review: diff + validation<br/>Approve: required (min 1)"]:::monitor

    PR_AUTO -->|"Notification<br/>Email + Slack"| REVIEWER
    REVIEWER -->|"Merge to main"| GIT_REPO

    %% =====================================
    %% LAYER 7: CI/CD & PUBLISH
    %% =====================================
    subgraph CICD["⚑ CI/CD PIPELINE"]
        GITLAB_CI["GitLab CI/CD<br/>Runner: docker<br/>Image: python:3.11-alpine<br/>Stages: build, test, deploy"]:::git
        PIPELINE_JOBS["Pipeline Jobs:<br/>1. lint (markdownlint-cli)<br/>2. build (mkdocs build)<br/>3. test (link-checker)<br/>4. deploy (rsync/s3)"]:::git
        MKDOCS_CFG["MkDocs Config<br/>Theme: material<br/>Plugins: search, tags, mermaid<br/>Extensions: admonition, codehilite"]:::git
    end

    GIT_REPO -->|"on: push to main<br/>Webhook trigger"| GITLAB_CI
    GITLAB_CI --> PIPELINE_JOBS
    PIPELINE_JOBS --> MKDOCS_CFG

    subgraph PUBLISH["🌐 PUBLICATION"]
        STATIC_SITE["Static Site<br/>Generator: MkDocs<br/>Output: HTML/CSS/JS<br/>Assets: optimized images"]:::git
        CDN["GitLab Pages / S3 + CloudFront<br/>URL: docs.company.com<br/>SSL: Let's Encrypt<br/>Cache: 1h"]:::git
        SEARCH["Search Index<br/>Engine: Algolia/Meilisearch<br/>Update: on publish<br/>API: REST"]:::git
    end

    MKDOCS_CFG -->|"mkdocs build<br/>--strict"| STATIC_SITE
    STATIC_SITE --> CDN
    STATIC_SITE --> SEARCH

    %% =====================================
    %% LAYER 8: MONITORING & OBSERVABILITY
    %% =====================================
    subgraph OBSERVABILITY["πŸ“Š MONITORING & LOGGING"]
        PROMETHEUS["Prometheus<br/>Metrics: collector lag, cache hit/miss<br/>Scrape: 30s<br/>Retention: 15d"]:::monitor
        GRAFANA["Grafana Dashboards<br/>β€’ Kafka metrics<br/>β€’ Redis performance<br/>β€’ LLM response times<br/>β€’ Pipeline success rate"]:::monitor
        ELK["ELK Stack<br/>Logs: all components<br/>Index: daily rotation<br/>Retention: 30d"]:::monitor
        ALERTS["Alerting<br/>β€’ Connector failures<br/>β€’ Kafka lag > 10k<br/>β€’ Redis OOM<br/>β€’ Pipeline failures<br/>Channel: Slack + PagerDuty"]:::monitor
    end

    CONN_VM -.->|"metrics"| PROMETHEUS
    CONN_K8S -.->|"metrics"| PROMETHEUS
    KAFKA_TOPICS -.->|"metrics"| PROMETHEUS
    REDIS_CLUSTER -.->|"metrics"| PROMETHEUS
    MCP_SERVER -.->|"metrics"| PROMETHEUS
    GITLAB_CI -.->|"metrics"| PROMETHEUS
    PROMETHEUS --> GRAFANA

    CONN_VM -.->|"logs"| ELK
    CONSUMER_GRP -.->|"logs"| ELK
    MCP_SERVER -.->|"logs"| ELK
    GITLAB_CI -.->|"logs"| ELK

    GRAFANA --> ALERTS

    %% =====================================
    %% SECURITY ANNOTATIONS
    %% =====================================
    SEC1["πŸ”’ SECURITY:<br/>β€’ All APIs use TLS 1.3<br/>β€’ Secrets in Vault/K8s Secrets<br/>β€’ Network: private VPC<br/>β€’ LLM has NO direct access"]:::monitor
    SEC2["πŸ” AUTHENTICATION:<br/>β€’ API Tokens rotated 90d<br/>β€’ RBAC enforced<br/>β€’ Audit logs enabled<br/>β€’ MFA required for Git"]:::monitor

    SEC1 -.-> MCP_SERVER
    SEC2 -.-> GIT_REPO
```
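The stream-processing layer above (Layers 3-4) is a plain consumer group that processes events and refreshes the optional Redis cache. A minimal sketch using `kafka-python` and `redis-py`, assuming the topic, group ID, and auto-commit interval shown in the diagram; the broker and Redis hostnames and the event fields are placeholders.

```python
"""Sketch of the doc-consumers group: consume events, cache the latest snapshot in Redis."""
import json

import redis
from kafka import KafkaConsumer  # kafka-python, as listed in the diagram

consumer = KafkaConsumer(
    "vmware.inventory",
    bootstrap_servers=["kafka-1:9092"],           # placeholder broker address
    group_id="doc-consumers",                     # group ID from the diagram
    enable_auto_commit=True,
    auto_commit_interval_ms=5000,                 # "Commit: auto (5s)"
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)
cache = redis.Redis(host="redis-cluster", port=6379)

for msg in consumer:
    event = msg.value
    # The real processor also performs validation, transformation,
    # enrichment, and deduplication before touching the cache.
    vcenter_id = event.get("vcenter_id", "default")   # placeholder event field
    cache.set(
        f"vmware:{vcenter_id}:vms",               # key structure from the diagram
        json.dumps(event.get("data", {})),
        ex=6 * 3600,                              # TTL within the documented 1-24h range
    )
```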
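In flow 3, the Automated PR Creator pushes the generated page to a feature branch and opens a merge request via `python-gitlab`, as shown in the GitOps layer. A minimal sketch, assuming a Project Access Token; the file path and branch name are illustrative, and the `.gitlab/merge_request.md` template step is omitted.

```python
"""Sketch of the automated merge-request step (python-gitlab)."""
import gitlab

GITLAB_URL = "https://gitlab.com"
PROJECT_PATH = "docs/infra"          # repository path from the GitOps layer


def open_docs_mr(markdown_text: str, token: str) -> None:
    gl = gitlab.Gitlab(GITLAB_URL, private_token=token)   # Project Access Token
    project = gl.projects.get(PROJECT_PATH)

    branch = "feature/vmware-inventory-update"            # illustrative branch name
    project.branches.create({"branch": branch, "ref": "main"})

    # Commit the generated page to the feature branch (file path is illustrative).
    project.files.create({
        "file_path": "docs/vmware/inventory.md",
        "branch": branch,
        "content": markdown_text,
        "commit_message": "docs: automated VMware inventory update",
    })

    # Open the merge request that notifies the technical reviewer.
    project.mergerequests.create({
        "source_branch": branch,
        "target_branch": "main",
        "title": "Automated docs update: VMware inventory",
        "remove_source_branch": True,
    })
```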
---

## πŸ“§ Contacts

- **Team**: Infrastructure Documentation Team
- **Email**: infra-docs@company.com
- **GitLab**: https://gitlab.com/company/infra-docs-automation

---

**Version**: 1.0.0
**Last updated**: 2025-10-28