# Automated Infrastructure Documentation System
Automated system for generating and maintaining the technical documentation of the company infrastructure using a local LLM, with human validation and GitOps publishing.
[License: MIT](https://opensource.org/licenses/MIT)
[Python](https://www.python.org/downloads/)
[Apache Kafka](https://kafka.apache.org/)
## Table of Contents
- [Overview](#overview)
- [Architecture](#architecture)
- [Architecture Diagram](#architecture-diagram)
- [Technical Diagram](#technical-diagram)
- [Contacts](#contacts)
## Overview
System designed to **automate the creation and updating of technical documentation** for complex infrastructure systems (VMware, Kubernetes, Linux, Cisco, etc.) using a local Large Language Model (Qwen).
### Key Features
- **Asynchronous data collection** from multiple infrastructure systems
- **Security isolation**: the LLM never accesses live systems
- **Event-driven architecture** with Apache Kafka
- **Local on-premise LLM** (Qwen) via MCP Server
- **Human-in-the-loop validation** with a GitOps workflow
- **Automated CI/CD** for publication
## Architecture
The system is divided into **three main flows**:
1. **Data Collection (Background)**: Connectors periodically query the infrastructure systems via API and publish the data to Kafka
2. **Documentation Generation (On-Demand)**: The local LLM (Qwen) generates Markdown by querying Kafka/Redis through the MCP Server
3. **Validation and Publication (GitOps)**: Human review on a Pull Request, then automatic deployment via CI/CD
> **Security Principle**: The LLM never has direct access to the infrastructure systems. All data flows through Kafka/Redis.
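
To make flow 1 more concrete, here is a minimal sketch of how a connector might wrap a collected snapshot in an event and hand it to a producer. It is an illustration under stated assumptions, not the project's code: the envelope fields, `build_event`, `publish_event` and the `Publish` callable (a stand-in for a real Kafka producer) are hypothetical. Only the topic names are taken from the technical diagram in this README.

```python
import json
import time
from typing import Any, Callable

# Topic names as listed in the technical diagram of this README.
TOPICS = {
    "vmware": "vmware.inventory",
    "k8s": "k8s.resources",
    "linux": "linux.systems",
    "cisco": "cisco.network",
}

# Hypothetical stand-in for a Kafka producer: (topic, key, value) -> None.
Publish = Callable[[str, bytes, bytes], None]


def build_event(source: str, entity_id: str, payload: dict[str, Any]) -> dict[str, Any]:
    """Wrap a raw snapshot in an assumed event envelope."""
    if source not in TOPICS:
        raise ValueError(f"unknown source: {source}")
    return {
        "source": source,
        "entity_id": entity_id,
        "collected_at": time.time(),
        "payload": payload,
    }


def publish_event(publish: Publish, event: dict[str, Any]) -> str:
    """Serialize the event, pass it to the producer, and return the topic used."""
    topic = TOPICS[event["source"]]
    key = event["entity_id"].encode("utf-8")
    value = json.dumps(event, sort_keys=True).encode("utf-8")
    publish(topic, key, value)
    return topic


if __name__ == "__main__":
    sent: list[tuple[str, bytes, bytes]] = []  # in-memory substitute for the broker
    event = build_event("linux", "web-01", {"kernel": "6.1", "services": ["nginx"]})
    print(publish_event(lambda t, k, v: sent.append((t, k, v)), event))
```

Injecting the producer as a callable keeps the collector logic testable without a running broker; a real deployment would pass the send method of whichever Kafka client the connector uses.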
---
## Architecture Diagram
### Management View
Simplified diagram for executive and management presentations.
```mermaid
graph TB
%% Styling
classDef infrastructure fill:#e1f5ff,stroke:#01579b,stroke-width:3px,color:#333
classDef kafka fill:#fff3e0,stroke:#e65100,stroke-width:3px,color:#333
classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:3px,color:#333
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:3px,color:#333
classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:3px,color:#333
classDef human fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#333
%% ========================================
%% FLOW 1: DATA COLLECTION (Background)
%% ========================================
INFRA[("INFRASTRUCTURE
SYSTEMS
VMware | K8s | Linux | Cisco")]:::infrastructure
CONN["CONNECTORS
Automatic Polling"]:::infrastructure
KAFKA[("APACHE KAFKA
Message Broker
+ Persistence")]:::kafka
CONSUMER["KAFKA CONSUMER
Processor Service"]:::kafka
REDIS[("REDIS CACHE
(Optional)
Performance Layer")]:::cache
INFRA -->|"Continuous
API Polling"| CONN
CONN -->|"Publish
Events"| KAFKA
KAFKA -->|"Consume
Stream"| CONSUMER
CONSUMER -.->|"Optional
Update"| REDIS
%% ========================================
%% FLOW 2: DOCUMENTATION GENERATION
%% ========================================
USER["USER
Doc Request"]:::human
LLM["LLM ENGINE
Qwen (local)"]:::llm
MCP["MCP SERVER
Model Context Protocol"]:::llm
DOC["DOCUMENT
Generated Markdown"]:::llm
USER -->|"1. Prompt"| LLM
LLM -->|"2. Tool Call"| MCP
MCP -->|"3a. Query"| KAFKA
MCP -.->|"3b. Fast
Query"| REDIS
KAFKA -->|"4a. Data"| MCP
REDIS -.->|"4b. Data"| MCP
MCP -->|"5. Context"| LLM
LLM -->|"6. Generate"| DOC
%% ========================================
%% FLOW 3: VALIDATION AND PUBLICATION
%% ========================================
GIT["GITLAB
Repository"]:::git
PR["PULL REQUEST
Automated Review"]:::git
TECH["TECHNICAL TEAM
Human Validation"]:::human
PIPELINE["CI/CD PIPELINE
GitLab Runner"]:::git
MKDOCS["MKDOCS
Static Site Generator"]:::git
WEB["DOCUMENTATION
GitLab Pages
(Published)"]:::git
DOC -->|"Push +
Branch"| GIT
GIT -->|"Create"| PR
PR -->|"Notify"| TECH
TECH -->|"Approve +
Merge"| GIT
GIT -->|"Trigger"| PIPELINE
PIPELINE -->|"Build"| MKDOCS
MKDOCS -->|"Deploy"| WEB
%% ========================================
%% SECURITY ANNOTATIONS
%% ========================================
SECURITY["SECURITY
LLM isolated from live systems"]:::human
PERF["PERFORMANCE
Optional Redis cache"]:::cache
LLM -.->|"NO
ACCESS"| INFRA
SECURITY -.-> LLM
PERF -.-> REDIS
```
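
The read path of flow 2 can be sketched as a tool that is only ever given readers for the two data stores, which is one way to enforce that the LLM side holds no client for the live systems. This is a hedged illustration: `make_tool`, `ToolResult`, the `Reader` type and the lookup order (consumer service first, Redis as fallback) are assumptions made for the example, and plain dictionaries stand in for Kafka and Redis.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# A reader takes a lookup key and returns the stored record, or None if absent.
Reader = Callable[[str], Optional[dict[str, Any]]]


@dataclass
class ToolResult:
    data: Optional[dict[str, Any]]
    served_by: str  # "kafka", "redis" or "none"


def make_tool(read_kafka: Reader, read_redis: Reader) -> Callable[[str], ToolResult]:
    """Build a read-only tool that can reach the two stores and nothing else."""

    def tool(key: str) -> ToolResult:
        try:
            record = read_kafka(key)
        except ConnectionError:
            record = None  # treat an unreachable consumer service as a miss
        if record is not None:
            return ToolResult(record, "kafka")
        cached = read_redis(key)
        if cached is not None:
            return ToolResult(cached, "redis")
        return ToolResult(None, "none")

    return tool


if __name__ == "__main__":
    # Dictionaries stand in for the Kafka-backed view and the Redis cache.
    kafka_view = {"linux:web-01:info": {"kernel": "6.1"}}
    redis_view = {"cisco:sw-01:config": {"vlans": [10, 20]}}
    tool = make_tool(kafka_view.get, redis_view.get)
    print(tool("linux:web-01:info").served_by)      # kafka
    print(tool("cisco:sw-01:config").served_by)     # redis
    print(tool("k8s:prod:default:pods").served_by)  # none
```

Because the tool closes over the readers it was built with, there is no code path from a prompt to vCenter, the Kubernetes API, SSH or NETCONF.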
---
## Technical Diagram
### Implementation View
Detailed diagram for the technical team, with implementation specifics.
```mermaid
graph TB
%% Technical styling
classDef infra fill:#e1f5ff,stroke:#01579b,stroke-width:2px,color:#333,font-size:11px
classDef connector fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#333,font-size:11px
classDef kafka fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#333,font-size:11px
classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#333,font-size:11px
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#333,font-size:11px
classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:2px,color:#333,font-size:11px
classDef monitor fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px
%% =====================================
%% LAYER 1: SOURCE SYSTEMS
%% =====================================
subgraph SOURCES["INFRASTRUCTURE SOURCES"]
VCENTER["VMware vCenter
API: vSphere REST 7.0+
Port: 443/HTTPS
Auth: API Token"]:::infra
K8S_API["Kubernetes API
API: v1.28+
Port: 6443/HTTPS
Auth: ServiceAccount + RBAC"]:::infra
LINUX["Linux Servers
Protocol: SSH/Ansible
Port: 22
Auth: SSH Keys"]:::infra
CISCO["Cisco Devices
Protocol: NETCONF/RESTCONF
Port: 830/443
Auth: AAA"]:::infra
end
%% =====================================
%% LAYER 2: CONNECTORS
%% =====================================
subgraph CONNECTORS["DATA COLLECTORS (Python/Go)"]
CONN_VM["VMware Collector
Lang: Python 3.11
Lib: pyvmomi
Schedule: */15 * * * *
Output: JSON"]:::connector
CONN_K8S["K8s Collector
Lang: Python 3.11
Lib: kubernetes-client
Schedule: */5 * * * *
Resources: pods,svc,ing,deploy"]:::connector
CONN_LNX["Linux Collector
Lang: Python 3.11
Lib: paramiko/ansible
Schedule: */30 * * * *
Data: sysinfo,packages,services"]:::connector
CONN_CSC["Cisco Collector
Lang: Python 3.11
Lib: ncclient
Schedule: */30 * * * *
Data: interfaces,routing,vlans"]:::connector
end
VCENTER -->|"GET /api/vcenter/vm"| CONN_VM
K8S_API -->|"kubectl proxy
API calls"| CONN_K8S
LINUX -->|"SSH batch
commands"| CONN_LNX
CISCO -->|"NETCONF
get-config"| CONN_CSC
%% =====================================
%% LAYER 3: MESSAGE BROKER
%% =====================================
subgraph MESSAGING["KAFKA CLUSTER (3 brokers)"]
KAFKA_TOPICS["Kafka Topics:
• vmware.inventory (P:6, R:3)
• k8s.resources (P:12, R:3)
• linux.systems (P:3, R:3)
• cisco.network (P:3, R:3)
Retention: 7 days
Format: JSON + Schema Registry"]:::kafka
SCHEMA["Schema Registry
Avro Schemas
Versioning enabled
Port: 8081"]:::kafka
end
CONN_VM -->|"Producer
Batch 100 msg"| KAFKA_TOPICS
CONN_K8S -->|"Producer
Batch 100 msg"| KAFKA_TOPICS
CONN_LNX -->|"Producer
Batch 50 msg"| KAFKA_TOPICS
CONN_CSC -->|"Producer
Batch 50 msg"| KAFKA_TOPICS
KAFKA_TOPICS <--> SCHEMA
%% =====================================
%% LAYER 4: PROCESSING & CACHE
%% =====================================
subgraph PROCESSING["STREAM PROCESSING"]
CONSUMER_GRP["Kafka Consumer Group
Group ID: doc-consumers
Lang: Python 3.11
Lib: kafka-python
Workers: 6
Commit: auto (5s)"]:::kafka
PROCESSOR["Data Processor
• Validation
• Transformation
• Enrichment
• Deduplication"]:::kafka
end
KAFKA_TOPICS -->|"Subscribe
offset management"| CONSUMER_GRP
CONSUMER_GRP --> PROCESSOR
subgraph STORAGE["CACHE LAYER (Optional)"]
REDIS_CLUSTER["Redis Cluster
Mode: Cluster (6 nodes)
Port: 6379
Persistence: RDB + AOF
Memory: 64GB
Eviction: allkeys-lru"]:::cache
REDIS_KEYS["Key Structure:
• vmware:vcenter-id:vms
• k8s:cluster:namespace:resource
• linux:hostname:info
• cisco:device-id:config
TTL: 1-24h based on type"]:::cache
end
PROCESSOR -.->|"SET/HSET
Pipeline batch"| REDIS_CLUSTER
REDIS_CLUSTER --> REDIS_KEYS
%% =====================================
%% LAYER 5: LLM & MCP
%% =====================================
subgraph LLM_LAYER["AI GENERATION LAYER"]
LLM_ENGINE["LLM Engine
Model: Qwen (local)
Hosting: on-premise
Temp: 0.3
Max Tokens: 4096
Timeout: 120s"]:::llm
MCP_SERVER["MCP Server
Lang: TypeScript/Node.js
Port: 3000
Protocol: JSON-RPC 2.0
Auth: JWT tokens"]:::llm
MCP_TOOLS["MCP Tools:
• getVMwareInventory(vcenter)
• getK8sResources(cluster,ns,type)
• getLinuxSystemInfo(hostname)
• getCiscoConfig(device,section)
• queryTimeRange(start,end)
Return: JSON + Metadata"]:::llm
end
LLM_ENGINE <-->|"Tool calls
JSON-RPC"| MCP_SERVER
MCP_SERVER --> MCP_TOOLS
MCP_TOOLS -->|"1. Query Kafka Consumer API
GET /api/v1/data"| CONSUMER_GRP
MCP_TOOLS -.->|"2. Fallback Redis
MGET/HGETALL"| REDIS_CLUSTER
CONSUMER_GRP -->|"JSON Response
+ Timestamps"| MCP_TOOLS
REDIS_CLUSTER -.->|"Cached JSON
Fast response"| MCP_TOOLS
MCP_TOOLS -->|"Structured Data
+ Context"| LLM_ENGINE
subgraph OUTPUT["DOCUMENT GENERATION"]
TEMPLATE["Template Engine
Format: Jinja2
Templates: markdown/*.j2
Variables: from LLM"]:::llm
MARKDOWN["Markdown Output
Format: CommonMark
Metadata: YAML frontmatter
Assets: diagrams in mermaid"]:::llm
VALIDATOR["Doc Validator
• Markdown linting
• Link checking
• Schema validation"]:::llm
end
LLM_ENGINE --> TEMPLATE
TEMPLATE --> MARKDOWN
MARKDOWN --> VALIDATOR
%% =====================================
%% LAYER 6: GITOPS
%% =====================================
subgraph GITOPS["GITOPS WORKFLOW"]
GIT_REPO["GitLab Repository
URL: gitlab.com/docs/infra
Branch strategy: main + feature/*
Protected: main (require approval)"]:::git
GIT_API["GitLab API
API: v4
Auth: Project Access Token
Permissions: api, write_repository"]:::git
PR_AUTO["Automated PR Creator
Lang: Python 3.11
Lib: python-gitlab
Template: .gitlab/merge_request.md"]:::git
end
VALIDATOR -->|"git add/commit/push"| GIT_REPO
GIT_REPO <--> GIT_API
GIT_API --> PR_AUTO
REVIEWER["Technical Reviewer
Role: Maintainer/Owner
Review: diff + validation
Approve: required (min 1)"]:::monitor
PR_AUTO -->|"Notification
Email + Slack"| REVIEWER
REVIEWER -->|"Merge to main"| GIT_REPO
%% =====================================
%% LAYER 7: CI/CD & PUBLISH
%% =====================================
subgraph CICD["CI/CD PIPELINE"]
GITLAB_CI["GitLab CI/CD
Runner: docker
Image: python:3.11-alpine
Stages: build, test, deploy"]:::git
PIPELINE_JOBS["Pipeline Jobs:
1. lint (markdownlint-cli)
2. build (mkdocs build)
3. test (link-checker)
4. deploy (rsync/s3)"]:::git
MKDOCS_CFG["MkDocs Config
Theme: material
Plugins: search, tags, mermaid
Extensions: admonition, codehilite"]:::git
end
GIT_REPO -->|"on: push to main
Webhook trigger"| GITLAB_CI
GITLAB_CI --> PIPELINE_JOBS
PIPELINE_JOBS --> MKDOCS_CFG
subgraph PUBLISH["PUBLICATION"]
STATIC_SITE["Static Site
Generator: MkDocs
Output: HTML/CSS/JS
Assets: optimized images"]:::git
CDN["GitLab Pages / S3 + CloudFront
URL: docs.company.com
SSL: Let's Encrypt
Cache: 1h"]:::git
SEARCH["Search Index
Engine: Algolia/Meilisearch
Update: on publish
API: REST"]:::git
end
MKDOCS_CFG -->|"mkdocs build
--strict"| STATIC_SITE
STATIC_SITE --> CDN
STATIC_SITE --> SEARCH
%% =====================================
%% LAYER 8: MONITORING & OBSERVABILITY
%% =====================================
subgraph OBSERVABILITY["MONITORING & LOGGING"]
PROMETHEUS["Prometheus
Metrics: collector lag, cache hit/miss
Scrape: 30s
Retention: 15d"]:::monitor
GRAFANA["Grafana Dashboards
• Kafka metrics
• Redis performance
• LLM response times
• Pipeline success rate"]:::monitor
ELK["ELK Stack
Logs: all components
Index: daily rotation
Retention: 30d"]:::monitor
ALERTS["Alerting
• Connector failures
• Kafka lag > 10k
• Redis OOM
• Pipeline failures
Channel: Slack + PagerDuty"]:::monitor
end
CONN_VM -.->|"metrics"| PROMETHEUS
CONN_K8S -.->|"metrics"| PROMETHEUS
KAFKA_TOPICS -.->|"metrics"| PROMETHEUS
REDIS_CLUSTER -.->|"metrics"| PROMETHEUS
MCP_SERVER -.->|"metrics"| PROMETHEUS
GITLAB_CI -.->|"metrics"| PROMETHEUS
PROMETHEUS --> GRAFANA
CONN_VM -.->|"logs"| ELK
CONSUMER_GRP -.->|"logs"| ELK
MCP_SERVER -.->|"logs"| ELK
GITLAB_CI -.->|"logs"| ELK
GRAFANA --> ALERTS
%% =====================================
%% SECURITY ANNOTATIONS
%% =====================================
SEC1["SECURITY:
• All APIs use TLS 1.3
• Secrets in Vault/K8s Secrets
• Network: private VPC
• LLM has NO direct access"]:::monitor
SEC2["AUTHENTICATION:
• API Tokens rotated 90d
• RBAC enforced
• Audit logs enabled
• MFA required for Git"]:::monitor
SEC1 -.-> MCP_SERVER
SEC2 -.-> GIT_REPO
```
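
The Redis key structure shown in the cache layer can be expressed as a small helper, which also helps keep producers and MCP tools in agreement on key names. This is a sketch under assumptions: `cache_key`, `ttl_seconds` and the per-type TTL values are hypothetical (the diagram only states a 1-24h range), and rejecting `:` inside a part is a design choice made here to keep keys unambiguous.

```python
# Key layouts copied from the "Key Structure" node of the technical diagram.
KEY_TEMPLATES = {
    "vmware": "vmware:{vcenter_id}:vms",
    "k8s": "k8s:{cluster}:{namespace}:{resource}",
    "linux": "linux:{hostname}:info",
    "cisco": "cisco:{device_id}:config",
}

# Hypothetical TTLs in hours; the diagram only gives the range 1-24h.
TTL_HOURS = {"vmware": 6, "k8s": 1, "linux": 24, "cisco": 24}


def cache_key(kind: str, **parts: str) -> str:
    """Fill the key template for `kind`, rejecting parts that would break the layout.

    Raises KeyError for an unknown kind or a missing part,
    and ValueError for an empty part or one containing ':'.
    """
    template = KEY_TEMPLATES[kind]
    for name, value in parts.items():
        if not value or ":" in value:
            raise ValueError(f"invalid value for {name!r}: {value!r}")
    return template.format(**parts)


def ttl_seconds(kind: str) -> int:
    """Return the assumed time-to-live for a data type, in seconds."""
    return TTL_HOURS[kind] * 3600


if __name__ == "__main__":
    key = cache_key("k8s", cluster="prod", namespace="default", resource="pods")
    print(key, ttl_seconds("k8s"))  # k8s:prod:default:pods 3600
```

The shorter TTL for Kubernetes in this example mirrors its more frequent collection schedule in the diagram; actual values would need to be tuned per environment.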
---
## Contacts
- **Team**: Infrastructure Documentation Team
- **Email**: infra-docs@company.com
- **GitLab**: https://gitlab.com/company/infra-docs-automation
---
**Version**: 1.0.0
**Last updated**: 2025-10-28