From 5b94e0a0469bf11dc73a3faf13164ef3915d4d52 Mon Sep 17 00:00:00 2001 From: Daniele Viti Date: Tue, 28 Oct 2025 11:16:45 +0000 Subject: [PATCH] Add scheme.md --- scheme.md | 374 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 374 insertions(+) create mode 100644 scheme.md diff --git a/scheme.md b/scheme.md new file mode 100644 index 0000000..9864a1e --- /dev/null +++ b/scheme.md @@ -0,0 +1,374 @@ +# πŸ“š Automated Infrastructure Documentation System + +An automated system for generating and maintaining the technical documentation of the company infrastructure, using a local LLM with human validation and GitOps-based publishing. + +[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) +[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/) +[![Kafka](https://img.shields.io/badge/Kafka-3.6+-red.svg)](https://kafka.apache.org/) + +## πŸ“‹ Table of Contents + +- [Overview](#overview) +- [Architecture](#architecture) +- [Architecture Diagram](#architecture-diagram) +- [Technical Diagram](#technical-diagram) +- [Contacts](#contacts) + +## 🎯 Overview + +The system is designed to **automate the creation and updating of technical documentation** for complex infrastructure systems (VMware, Kubernetes, Linux, Cisco, etc.) using a local Large Language Model (Qwen). + +### Key Features + +- βœ… **Asynchronous data collection** from multiple infrastructure systems +- βœ… **Security isolation**: the LLM never accesses live systems +- βœ… **Event-driven architecture** with Apache Kafka +- βœ… **Local on-premise LLM** (Qwen) via MCP Server +- βœ… **Human-in-the-loop validation** with a GitOps workflow +- βœ… **Automated CI/CD** for publishing + +## πŸ—οΈ Architecture + +The system is split into **3 main flows**: + +1. 
**Data Collection (Background)**: connectors periodically poll the infrastructure systems via API and publish the data to Kafka +2. **Documentation Generation (On-Demand)**: the local LLM (Qwen) generates Markdown by querying Kafka/Redis through the MCP Server +3. **Validation and Publishing (GitOps)**: human review on a Pull Request and automatic deployment via CI/CD + +> **Security Principle**: the LLM never has direct access to the infrastructure systems. All data flows through Kafka/Redis. + +--- + +## πŸ“Š Architecture Diagram + +### Management View + +A simplified view for executive and management presentations. + + + +```mermaid +graph TB + %% Styling + classDef infrastructure fill:#e1f5ff,stroke:#01579b,stroke-width:3px,color:#333 + classDef kafka fill:#fff3e0,stroke:#e65100,stroke-width:3px,color:#333 + classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:3px,color:#333 + classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:3px,color:#333 + classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:3px,color:#333 + classDef human fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#333 + + %% ======================================== + %% FLOW 1: DATA COLLECTION (Background) + %% ======================================== + + INFRA[("🏒 INFRASTRUCTURE
SYSTEMS

VMware | K8s | Linux | Cisco")]:::infrastructure + + CONN["πŸ”Œ CONNECTORS
Automatic Polling"]:::infrastructure + + KAFKA[("πŸ“¨ APACHE KAFKA
Message Broker
+ Persistence")]:::kafka + + CONSUMER["βš™οΈ KAFKA CONSUMER
Processor Service"]:::kafka + + REDIS[("πŸ’Ύ REDIS CACHE
(Optional)
Performance Layer")]:::cache + + INFRA -->|"API Polling
(Continuous)"| CONN + CONN -->|"Publish
Events"| KAFKA + KAFKA -->|"Consume
Stream"| CONSUMER + CONSUMER -.->|"Update
(Optional)"| REDIS + + %% ======================================== + %% FLOW 2: DOCUMENTATION GENERATION + %% ======================================== + + USER["πŸ‘€ USER
Doc Request"]:::human + + LLM["πŸ€– LLM ENGINE
Qwen (Local)"]:::llm + + MCP["πŸ”§ MCP SERVER
Model Context Protocol"]:::llm + + DOC["πŸ“„ DOCUMENT
Generated Markdown"]:::llm + + USER -->|"1. Prompt"| LLM + LLM -->|"2. Tool Call"| MCP + MCP -->|"3a. Query"| KAFKA + MCP -.->|"3b. Query
Fast"| REDIS + KAFKA -->|"4a. Data"| MCP + REDIS -.->|"4b. Data"| MCP + MCP -->|"5. Context"| LLM + LLM -->|"6. Generate"| DOC + + %% ======================================== + %% FLOW 3: VALIDATION AND PUBLISHING + %% ======================================== + + GIT["πŸ“¦ GITLAB
Repository"]:::git + + PR["πŸ”€ PULL REQUEST
Automated Review"]:::git + + TECH["πŸ‘¨β€πŸ’Ό TECHNICAL TEAM
Human Validation"]:::human + + PIPELINE["⚑ CI/CD PIPELINE
GitLab Runner"]:::git + + MKDOCS["πŸ“š MKDOCS
Static Site Generator"]:::git + + WEB["🌐 DOCUMENTATION
GitLab Pages
(Published)"]:::git + + DOC -->|"Push +
Branch"| GIT + GIT -->|"Create"| PR + PR -->|"Notify"| TECH + TECH -->|"Approve +
Merge"| GIT + GIT -->|"Trigger"| PIPELINE + PIPELINE -->|"Build"| MKDOCS + MKDOCS -->|"Deploy"| WEB + + %% ======================================== + %% SECURITY ANNOTATIONS + %% ======================================== + + SECURITY["πŸ”’ SECURITY
LLM isolated from live systems"]:::human + PERF["⚑ PERFORMANCE
Optional Redis cache"]:::cache + + LLM -.->|"NO
ACCESS"| INFRA + + SECURITY -.-> LLM + PERF -.-> REDIS +``` + +--- + +## πŸ”§ Technical Diagram + +### Implementation View + +A detailed view for the technical team, with implementation specifics. + + + +```mermaid +graph TB + %% Technical styling + classDef infra fill:#e1f5ff,stroke:#01579b,stroke-width:2px,color:#333,font-size:11px + classDef connector fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#333,font-size:11px + classDef kafka fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#333,font-size:11px + classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#333,font-size:11px + classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#333,font-size:11px + classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:2px,color:#333,font-size:11px + classDef monitor fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px + + %% ===================================== + %% LAYER 1: SOURCE SYSTEMS + %% ===================================== + + subgraph SOURCES["🏒 INFRASTRUCTURE SOURCES"] + VCENTER["VMware vCenter
API: vSphere REST 7.0+
Port: 443/HTTPS
Auth: API Token"]:::infra + K8S_API["Kubernetes API
API: v1.28+
Port: 6443/HTTPS
Auth: ServiceAccount + RBAC"]:::infra + LINUX["Linux Servers
Protocol: SSH/Ansible
Port: 22
Auth: SSH Keys"]:::infra + CISCO["Cisco Devices
Protocol: NETCONF/RESTCONF
Port: 830/443
Auth: AAA"]:::infra + end + + %% ===================================== + %% LAYER 2: CONNETTORI + %% ===================================== + + subgraph CONNECTORS["πŸ”Œ DATA COLLECTORS (Python/Go)"] + CONN_VM["VMware Collector
Lang: Python 3.11
Lib: pyvmomi
Schedule: */15 * * * *
Output: JSON"]:::connector + + CONN_K8S["K8s Collector
Lang: Python 3.11
Lib: kubernetes-client
Schedule: */5 * * * *
Resources: pods,svc,ing,deploy"]:::connector + + CONN_LNX["Linux Collector
Lang: Python 3.11
Lib: paramiko/ansible
Schedule: */30 * * * *
Data: sysinfo,packages,services"]:::connector + + CONN_CSC["Cisco Collector
Lang: Python 3.11
Lib: ncclient
Schedule: */30 * * * *
Data: interfaces,routing,vlans"]:::connector + end + + VCENTER -->|"GET /api/vcenter/vm"| CONN_VM + K8S_API -->|"kubectl proxy
API calls"| CONN_K8S + LINUX -->|"SSH batch
commands"| CONN_LNX + CISCO -->|"NETCONF
get-config"| CONN_CSC + + %% ===================================== + %% LAYER 3: MESSAGE BROKER + %% ===================================== + + subgraph MESSAGING["πŸ“¨ KAFKA CLUSTER (3 brokers)"] + KAFKA_TOPICS["Kafka Topics:
β€’ vmware.inventory (P:6, R:3)
β€’ k8s.resources (P:12, R:3)
β€’ linux.systems (P:3, R:3)
β€’ cisco.network (P:3, R:3)
Retention: 7 days
Format: JSON + Schema Registry"]:::kafka + + SCHEMA["Schema Registry
Avro Schemas
Versioning enabled
Port: 8081"]:::kafka + end + + CONN_VM -->|"Producer
Batch 100 msg"| KAFKA_TOPICS + CONN_K8S -->|"Producer
Batch 100 msg"| KAFKA_TOPICS + CONN_LNX -->|"Producer
Batch 50 msg"| KAFKA_TOPICS + CONN_CSC -->|"Producer
Batch 50 msg"| KAFKA_TOPICS + + KAFKA_TOPICS <--> SCHEMA + + %% ===================================== + %% LAYER 4: PROCESSING & CACHE + %% ===================================== + + subgraph PROCESSING["βš™οΈ STREAM PROCESSING"] + CONSUMER_GRP["Kafka Consumer Group
Group ID: doc-consumers
Lang: Python 3.11
Lib: kafka-python
Workers: 6
Commit: auto (5s)"]:::kafka + + PROCESSOR["Data Processor
β€’ Validation
β€’ Transformation
β€’ Enrichment
β€’ Deduplication"]:::kafka + end + + KAFKA_TOPICS -->|"Subscribe
offset management"| CONSUMER_GRP + CONSUMER_GRP --> PROCESSOR + + subgraph STORAGE["πŸ’Ύ CACHE LAYER (Optional)"] + REDIS_CLUSTER["Redis Cluster
Mode: Cluster (6 nodes)
Port: 6379
Persistence: RDB + AOF
Memory: 64GB
Eviction: allkeys-lru"]:::cache + + REDIS_KEYS["Key Structure:
β€’ vmware:vcenter-id:vms
β€’ k8s:cluster:namespace:resource
β€’ linux:hostname:info
β€’ cisco:device-id:config
TTL: 1-24h based on type"]:::cache + end + + PROCESSOR -.->|"SET/HSET
Pipeline batch"| REDIS_CLUSTER + REDIS_CLUSTER --> REDIS_KEYS + + %% ===================================== + %% LAYER 5: LLM & MCP + %% ===================================== + + subgraph LLM_LAYER["πŸ€– AI GENERATION LAYER"] + LLM_ENGINE["LLM Engine
Model: Qwen (local, on-prem)
API: local inference endpoint
Temp: 0.3
Max Tokens: 4096
Timeout: 120s"]:::llm + + MCP_SERVER["MCP Server
Lang: TypeScript/Node.js
Port: 3000
Protocol: JSON-RPC 2.0
Auth: JWT tokens"]:::llm + + MCP_TOOLS["MCP Tools:
β€’ getVMwareInventory(vcenter)
β€’ getK8sResources(cluster,ns,type)
β€’ getLinuxSystemInfo(hostname)
β€’ getCiscoConfig(device,section)
β€’ queryTimeRange(start,end)
Return: JSON + Metadata"]:::llm + end + + LLM_ENGINE <-->|"Tool calls
JSON-RPC"| MCP_SERVER + MCP_SERVER --> MCP_TOOLS + + MCP_TOOLS -->|"1. Query Kafka Consumer API
GET /api/v1/data"| CONSUMER_GRP + MCP_TOOLS -.->|"2. Fallback Redis
MGET/HGETALL"| REDIS_CLUSTER + + CONSUMER_GRP -->|"JSON Response
+ Timestamps"| MCP_TOOLS + REDIS_CLUSTER -.->|"Cached JSON
Fast response"| MCP_TOOLS + + MCP_TOOLS -->|"Structured Data
+ Context"| LLM_ENGINE + + subgraph OUTPUT["πŸ“ DOCUMENT GENERATION"] + TEMPLATE["Template Engine
Format: Jinja2
Templates: markdown/*.j2
Variables: from LLM"]:::llm + + MARKDOWN["Markdown Output
Format: CommonMark
Metadata: YAML frontmatter
Assets: diagrams in mermaid"]:::llm + + VALIDATOR["Doc Validator
β€’ Markdown linting
β€’ Link checking
β€’ Schema validation"]:::llm + end + + LLM_ENGINE --> TEMPLATE + TEMPLATE --> MARKDOWN + MARKDOWN --> VALIDATOR + + %% ===================================== + %% LAYER 6: GITOPS + %% ===================================== + + subgraph GITOPS["πŸ”„ GITOPS WORKFLOW"] + GIT_REPO["GitLab Repository
URL: gitlab.com/docs/infra
Branch strategy: main + feature/*
Protected: main (require approval)"]:::git + + GIT_API["GitLab API
API: v4
Auth: Project Access Token
Permissions: api, write_repo"]:::git + + PR_AUTO["Automated PR Creator
Lang: Python 3.11
Lib: python-gitlab
Template: .gitlab/merge_request.md"]:::git + end + + VALIDATOR -->|"git add/commit/push"| GIT_REPO + GIT_REPO <--> GIT_API + GIT_API --> PR_AUTO + + REVIEWER["πŸ‘¨β€πŸ’Ό Technical Reviewer
Role: Maintainer/Owner
Review: diff + validation
Approve: required (min 1)"]:::monitor + + PR_AUTO -->|"Notification
Email + Slack"| REVIEWER + REVIEWER -->|"Merge to main"| GIT_REPO + + %% ===================================== + %% LAYER 7: CI/CD & PUBLISH + %% ===================================== + + subgraph CICD["⚑ CI/CD PIPELINE"] + GITLAB_CI["GitLab CI/CD
Runner: docker
Image: python:3.11-alpine
Stages: build, test, deploy"]:::git + + PIPELINE_JOBS["Pipeline Jobs:
1. lint (markdownlint-cli)
2. build (mkdocs build)
3. test (link-checker)
4. deploy (rsync/s3)"]:::git + + MKDOCS_CFG["MkDocs Config
Theme: material
Plugins: search, tags, mermaid
Extensions: admonition, codehilite"]:::git + end + + GIT_REPO -->|"on: push to main
Webhook trigger"| GITLAB_CI + GITLAB_CI --> PIPELINE_JOBS + PIPELINE_JOBS --> MKDOCS_CFG + + subgraph PUBLISH["🌐 PUBLICATION"] + STATIC_SITE["Static Site
Generator: MkDocs
Output: HTML/CSS/JS
Assets: optimized images"]:::git + + CDN["GitLab Pages / S3 + CloudFront
URL: docs.company.com
SSL: Let's Encrypt
Cache: 1h"]:::git + + SEARCH["Search Index
Engine: Algolia/Meilisearch
Update: on publish
API: REST"]:::git + end + + MKDOCS_CFG -->|"mkdocs build
--strict"| STATIC_SITE + STATIC_SITE --> CDN + STATIC_SITE --> SEARCH + + %% ===================================== + %% LAYER 8: MONITORING & OBSERVABILITY + %% ===================================== + + subgraph OBSERVABILITY["πŸ“Š MONITORING & LOGGING"] + PROMETHEUS["Prometheus
Metrics: collector lag, cache hit/miss
Scrape: 30s
Retention: 15d"]:::monitor + + GRAFANA["Grafana Dashboards
β€’ Kafka metrics
β€’ Redis performance
β€’ LLM response times
β€’ Pipeline success rate"]:::monitor + + ELK["ELK Stack
Logs: all components
Index: daily rotation
Retention: 30d"]:::monitor + + ALERTS["Alerting
β€’ Connector failures
β€’ Kafka lag > 10k
β€’ Redis OOM
β€’ Pipeline failures
Channel: Slack + PagerDuty"]:::monitor + end + + CONN_VM -.->|"metrics"| PROMETHEUS + CONN_K8S -.->|"metrics"| PROMETHEUS + KAFKA_TOPICS -.->|"metrics"| PROMETHEUS + REDIS_CLUSTER -.->|"metrics"| PROMETHEUS + MCP_SERVER -.->|"metrics"| PROMETHEUS + GITLAB_CI -.->|"metrics"| PROMETHEUS + + PROMETHEUS --> GRAFANA + + CONN_VM -.->|"logs"| ELK + CONSUMER_GRP -.->|"logs"| ELK + MCP_SERVER -.->|"logs"| ELK + GITLAB_CI -.->|"logs"| ELK + + GRAFANA --> ALERTS + + %% ===================================== + %% SECURITY ANNOTATIONS + %% ===================================== + + SEC1["πŸ”’ SECURITY:
β€’ All APIs use TLS 1.3
β€’ Secrets in Vault/K8s Secrets
β€’ Network: private VPC
β€’ LLM has NO direct access"]:::monitor + + SEC2["πŸ” AUTHENTICATION:
β€’ API Tokens rotated 90d
β€’ RBAC enforced
β€’ Audit logs enabled
β€’ MFA required for Git"]:::monitor + + SEC1 -.-> MCP_SERVER + SEC2 -.-> GIT_REPO +``` + +--- + +## πŸ“§ Contacts + +- **Team**: Infrastructure Documentation Team +- **Email**: infra-docs@company.com +- **GitLab**: https://gitlab.com/company/infra-docs-automation + +--- + +**Version**: 1.0.0 +**Last updated**: 2025-10-28 \ No newline at end of file
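---

## πŸ“Ž Appendix: Frontmatter Sketch

The Document Generation layer specifies Markdown in CommonMark format with YAML frontmatter metadata. Below is a minimal sketch of what the template step could emit; the field names (`title`, `source`, `generated`, `reviewed`) are illustrative assumptions, not the pipeline's actual schema.

```python
from datetime import date

def render_doc(title: str, source_topic: str, body_md: str, today: date) -> str:
    """Assemble a Markdown document with YAML frontmatter.

    The frontmatter fields are hypothetical examples of metadata the
    doc validator could check before the document is pushed to Git.
    """
    frontmatter = "\n".join([
        "---",
        f"title: {title}",
        f"source: {source_topic}",         # Kafka topic the data came from
        f"generated: {today.isoformat()}",
        "reviewed: false",                 # flipped by the human approval step
        "---",
    ])
    return f"{frontmatter}\n\n# {title}\n\n{body_md}\n"

doc = render_doc("VMware Inventory", "vmware.inventory",
                 "Generated body text.", date(2025, 10, 28))
```

In the real pipeline this string would come from the Jinja2 templates and then pass through the doc validator (linting, link checking) before the Git push.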
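---

## πŸ“Ž Appendix: Cache Key Sketch

The Redis "Key Structure" node defines keys such as `vmware:vcenter-id:vms` with a TTL of 1-24h depending on data type. A small sketch of that key scheme follows; since the diagram only states the TTL range, the concrete per-source values here are illustrative assumptions.

```python
# TTL per source type, in seconds. The diagram says "1-24h based on type",
# so these specific values are assumptions for illustration only.
TTL_BY_SOURCE = {
    "vmware": 4 * 3600,    # inventory changes slowly
    "k8s": 1 * 3600,       # most dynamic source
    "linux": 12 * 3600,
    "cisco": 24 * 3600,    # network config changes least often
}

def cache_key(source: str, *parts: str) -> str:
    """Build a Redis key such as 'vmware:vcenter-01:vms'."""
    if source not in TTL_BY_SOURCE:
        raise ValueError(f"unknown source type: {source}")
    return ":".join((source, *parts))

def cache_ttl(source: str) -> int:
    """Seconds a key of this source type should live in the cache."""
    return TTL_BY_SOURCE[source]

# A processor writing to the cache would then call something like:
#   redis_client.set(key, payload_json, ex=cache_ttl(source))
key = cache_key("k8s", "prod-cluster", "default", "deployments")
```

Keeping key construction and TTL policy in one module makes the processor's `SET ... EX` calls and the MCP server's `MGET`/`HGETALL` lookups agree on naming.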