diff --git a/scheme.md b/scheme.md
index fb3fce9..c8a4fdb 100644
--- a/scheme.md
+++ b/scheme.md
@@ -48,84 +48,72 @@ The system is divided into **3 main flows**:
Simplified diagram for executive and management presentations.
-
-
```mermaid
graph TB
%% Styling
classDef infrastructure fill:#e1f5ff,stroke:#01579b,stroke-width:3px,color:#333
+ classDef kafka fill:#fff3e0,stroke:#e65100,stroke-width:3px,color:#333
classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:3px,color:#333
- classDef change fill:#fff3e0,stroke:#e65100,stroke-width:3px,color:#333
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:3px,color:#333
classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:3px,color:#333
classDef human fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#333
-
+
%% ========================================
%% FLOW 1: DATA COLLECTION (Background)
%% ========================================
-
+
INFRA[("INFRASTRUCTURE
SYSTEMS
VMware | K8s | Linux | Cisco")]:::infrastructure
-
+
CONN["CONNECTORS
Automatic Polling"]:::infrastructure
-
- REDIS[("REDIS CACHE
Infrastructure
Configuration")]:::cache
-
+
+ KAFKA[("APACHE KAFKA
Message Broker
+ Persistence")]:::kafka
+
+ CONSUMER["KAFKA CONSUMER
Processor Service"]:::kafka
+
+ REDIS[("REDIS CACHE
(Optional)
Performance Layer")]:::cache
+
INFRA -->|"Continuous
API Polling"| CONN
- CONN -->|"Configuration
Update"| REDIS
-
+ CONN -->|"Publish
Events"| KAFKA
+ KAFKA -->|"Consume
Stream"| CONSUMER
+ CONSUMER -.->|"Optional
Update"| REDIS
+
%% ========================================
- %% CHANGE DETECTION
+ %% FLOW 2: DOCUMENTATION GENERATION
%% ========================================
-
- CHANGE["CHANGE DETECTOR
Detects Configuration
Changes"]:::change
-
- REDIS -->|"Monitor
Changes"| CHANGE
-
- %% ========================================
- %% FLOW 2: DOCUMENTATION GENERATION (Triggered)
- %% ========================================
-
- TRIGGER["TRIGGER
Only on changes"]:::change
-
- USER["USER
Manual Request"]:::human
-
- LLM["LLM ENGINE
Qwen (Local)"]:::llm
-
+
+ USER["USER
Doc Request"]:::human
+
+ LLM["LLM ENGINE
Claude / GPT"]:::llm
+
MCP["MCP SERVER
API Control Platform"]:::llm
-
+
DOC["DOCUMENT
Generated Markdown"]:::llm
-
- CHANGE -->|"Changes
Detected"| TRIGGER
- USER -.->|"Optional"| TRIGGER
-
- TRIGGER -->|"Start
Generation"| LLM
- LLM -->|"Tool Call"| MCP
- MCP -->|"Query"| REDIS
- REDIS -->|"Config Data"| MCP
- MCP -->|"Context"| LLM
- LLM -->|"Generate"| DOC
-
+
+ USER -->|"1. Prompt"| LLM
+ LLM -->|"2. Tool Call"| MCP
+ MCP -->|"3a. Query"| KAFKA
+ MCP -.->|"3b. Query
Fast"| REDIS
+ KAFKA -->|"4a. Data"| MCP
+ REDIS -.->|"4b. Data"| MCP
+ MCP -->|"5. Context"| LLM
+ LLM -->|"6. Generate"| DOC
+
%% ========================================
%% FLOW 3: VALIDATION AND PUBLICATION
%% ========================================
-
+
GIT["GITLAB
Repository"]:::git
-
+
PR["PULL REQUEST
Automated Review"]:::git
-
+
TECH["TECHNICAL TEAM
Human Validation"]:::human
-
+
PIPELINE["CI/CD PIPELINE
GitLab Runner"]:::git
-
+
MKDOCS["MKDOCS
Static Site Generator"]:::git
-
+
WEB["DOCUMENTATION
GitLab Pages
(Published)"]:::git
-
+
DOC -->|"Push +
Branch"| GIT
GIT -->|"Create"| PR
PR -->|"Notify"| TECH
@@ -133,18 +121,18 @@ graph TB
GIT -->|"Trigger"| PIPELINE
PIPELINE -->|"Build"| MKDOCS
MKDOCS -->|"Deploy"| WEB
-
+
%% ========================================
- %% ANNOTATIONS
+ %% SECURITY ANNOTATIONS
%% ========================================
-
+
SECURITY["SECURITY
LLM isolated from live systems"]:::human
- EFFICIENCY["EFFICIENCY
Docs generated only
on changes"]:::change
-
+ PERF["PERFORMANCE
Optional Redis cache"]:::cache
+
LLM -.->|"NO
ACCESS"| INFRA
-
+
SECURITY -.-> LLM
- EFFICIENCY -.-> CHANGE
+ PERF -.-> REDIS
```
---
@@ -155,229 +143,580 @@ graph TB
Detailed diagram for the technical team, with implementation specifics.
-
-
```mermaid
graph TB
%% Technical styling
classDef infra fill:#e1f5ff,stroke:#01579b,stroke-width:2px,color:#333,font-size:11px
classDef connector fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#333,font-size:11px
+ classDef kafka fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#333,font-size:11px
classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#333,font-size:11px
- classDef change fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#333,font-size:11px
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#333,font-size:11px
classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:2px,color:#333,font-size:11px
classDef monitor fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px
-
+
%% =====================================
%% LAYER 1: SOURCE SYSTEMS
%% =====================================
-
+
subgraph SOURCES["INFRASTRUCTURE SOURCES"]
VCENTER["VMware vCenter
API: vSphere REST 7.0+
Port: 443/HTTPS
Auth: API Token"]:::infra
K8S_API["Kubernetes API
API: v1.28+
Port: 6443/HTTPS
Auth: ServiceAccount + RBAC"]:::infra
LINUX["Linux Servers
Protocol: SSH/Ansible
Port: 22
Auth: SSH Keys"]:::infra
CISCO["Cisco Devices
Protocol: NETCONF/RESTCONF
Port: 830/443
Auth: AAA"]:::infra
end
-
+
%% =====================================
%% LAYER 2: CONNECTORS
%% =====================================
-
+
subgraph CONNECTORS["DATA COLLECTORS (Python/Go)"]
- CONN_VM["VMware Collector
Lang: Python 3.11
Lib: pyvmomi
Schedule: */15 * * * *
Output: JSON → Redis"]:::connector
-
+ CONN_VM["VMware Collector
Lang: Python 3.11
Lib: pyvmomi
Schedule: */15 * * * *
Output: JSON"]:::connector
+
CONN_K8S["K8s Collector
Lang: Python 3.11
Lib: kubernetes-client
Schedule: */5 * * * *
Resources: pods,svc,ing,deploy"]:::connector
-
+
CONN_LNX["Linux Collector
Lang: Python 3.11
Lib: paramiko/ansible
Schedule: */30 * * * *
Data: sysinfo,packages,services"]:::connector
-
+
CONN_CSC["Cisco Collector
Lang: Python 3.11
Lib: ncclient
Schedule: */30 * * * *
Data: interfaces,routing,vlans"]:::connector
end
-
+
VCENTER -->|"GET /api/vcenter/vm"| CONN_VM
K8S_API -->|"kubectl proxy
API calls"| CONN_K8S
LINUX -->|"SSH batch
commands"| CONN_LNX
CISCO -->|"NETCONF
get-config"| CONN_CSC
-
+
%% =====================================
- %% LAYER 3: REDIS STORAGE
+ %% LAYER 3: MESSAGE BROKER
%% =====================================
-
- subgraph STORAGE["REDIS CLUSTER"]
+
+ subgraph MESSAGING["KAFKA CLUSTER (3 brokers)"]
+ KAFKA_TOPICS["Kafka Topics:
• vmware.inventory (P:6, R:3)
• k8s.resources (P:12, R:3)
• linux.systems (P:3, R:3)
• cisco.network (P:3, R:3)
Retention: 7 days
Format: JSON + Schema Registry"]:::kafka
+
+ SCHEMA["Schema Registry
Avro Schemas
Versioning enabled
Port: 8081"]:::kafka
+ end
+
+ CONN_VM -->|"Producer
Batch 100 msg"| KAFKA_TOPICS
+ CONN_K8S -->|"Producer
Batch 100 msg"| KAFKA_TOPICS
+ CONN_LNX -->|"Producer
Batch 50 msg"| KAFKA_TOPICS
+ CONN_CSC -->|"Producer
Batch 50 msg"| KAFKA_TOPICS
+
+ KAFKA_TOPICS <--> SCHEMA
+
+ %% =====================================
+ %% LAYER 4: PROCESSING & CACHE
+ %% =====================================
+
+ subgraph PROCESSING["STREAM PROCESSING"]
+ CONSUMER_GRP["Kafka Consumer Group
Group ID: doc-consumers
Lang: Python 3.11
Lib: kafka-python
Workers: 6
Commit: auto (5s)"]:::kafka
+
+ PROCESSOR["Data Processor
• Validation
• Transformation
• Enrichment
• Deduplication"]:::kafka
+ end
+
+ KAFKA_TOPICS -->|"Subscribe
offset management"| CONSUMER_GRP
+ CONSUMER_GRP --> PROCESSOR
+
+ subgraph STORAGE["CACHE LAYER (Optional)"]
REDIS_CLUSTER["Redis Cluster
Mode: Cluster (6 nodes)
Port: 6379
Persistence: RDB + AOF
Memory: 64GB
Eviction: allkeys-lru"]:::cache
-
- REDIS_KEYS["Key Structure:
• vmware:vcenter-id:vms:hash
• k8s:cluster:namespace:resource:hash
• linux:hostname:info:hash
• cisco:device-id:config:hash
• changelog:timestamp:diff
TTL: 30d for data, 90d for changelog"]:::cache
+
+ REDIS_KEYS["Key Structure:
• vmware:vcenter-id:vms
• k8s:cluster:namespace:resource
• linux:hostname:info
• cisco:device-id:config
TTL: 1-24h based on type"]:::cache
end
-
- CONN_VM -->|"HSET/HMSET
+ Hash Storage"| REDIS_CLUSTER
- CONN_K8S -->|"HSET/HMSET
+ Hash Storage"| REDIS_CLUSTER
- CONN_LNX -->|"HSET/HMSET
+ Hash Storage"| REDIS_CLUSTER
- CONN_CSC -->|"HSET/HMSET
+ Hash Storage"| REDIS_CLUSTER
-
+
+ PROCESSOR -.->|"SET/HSET
Pipeline batch"| REDIS_CLUSTER
REDIS_CLUSTER --> REDIS_KEYS
-
+
%% =====================================
- %% LAYER 4: CHANGE DETECTION
+ %% LAYER 5: LLM & MCP
%% =====================================
-
- subgraph CHANGE_DETECTION["CHANGE DETECTION SYSTEM"]
- DETECTOR["Change Detector Service
Lang: Python 3.11
Lib: redis-py
Algorithm: Hash comparison
Check interval: */5 * * * *"]:::change
-
- DIFF_ENGINE["Diff Engine
• Deep object comparison
• JSON diff generation
• Change classification
• Severity assessment"]:::change
-
- CHANGE_LOG["Change Log Store
Key: changelog:*
Data: diff JSON + metadata
Indexed by: timestamp, resource"]:::change
-
- NOTIFIER["Change Notifier
• Webhook triggers
• Slack notifications
• Event emission
Target: LLM trigger"]:::change
- end
-
- REDIS_CLUSTER -->|"Monitor
key changes"| DETECTOR
- DETECTOR --> DIFF_ENGINE
- DIFF_ENGINE -->|"Store diff"| CHANGE_LOG
- CHANGE_LOG --> REDIS_CLUSTER
- DIFF_ENGINE -->|"Notify if
significant"| NOTIFIER
-
- %% =====================================
- %% LAYER 5: LLM TRIGGER & GENERATION
- %% =====================================
-
- subgraph TRIGGER_SYSTEM["TRIGGER SYSTEM"]
- TRIGGER_SVC["Trigger Service
Lang: Python 3.11
Listen: Webhook + Redis Pub/Sub
Debounce: 5 min
Batch: multiple changes"]:::change
-
- QUEUE["Generation Queue
Type: Redis List
Priority: High/Medium/Low
Processing: FIFO"]:::change
- end
-
- NOTIFIER -->|"Trigger event"| TRIGGER_SVC
- TRIGGER_SVC -->|"Enqueue
generation task"| QUEUE
-
+
subgraph LLM_LAYER["AI GENERATION LAYER"]
- LLM_ENGINE["LLM Engine
Model: Qwen (Locale)
API: Ollama/vLLM/LM Studio
Port: 11434
Temp: 0.3
Max Tokens: 4096
Timeout: 120s"]:::llm
-
+ LLM_ENGINE["LLM Engine
Model: Claude Sonnet 4 / GPT-4
API: Anthropic/OpenAI
Temp: 0.3
Max Tokens: 4096
Timeout: 120s"]:::llm
+
MCP_SERVER["MCP Server
Lang: TypeScript/Node.js
Port: 3000
Protocol: JSON-RPC 2.0
Auth: JWT tokens"]:::llm
-
- MCP_TOOLS["MCP Tools:
• getVMwareInventory(vcenter)
• getK8sResources(cluster,ns,type)
• getLinuxSystemInfo(hostname)
• getCiscoConfig(device,section)
• getChangelog(start,end,resource)
Return: JSON + Metadata"]:::llm
+
+ MCP_TOOLS["MCP Tools:
• getVMwareInventory(vcenter)
• getK8sResources(cluster,ns,type)
• getLinuxSystemInfo(hostname)
• getCiscoConfig(device,section)
• queryTimeRange(start,end)
Return: JSON + Metadata"]:::llm
end
-
- QUEUE -->|"Dequeue
task"| LLM_ENGINE
-
+
LLM_ENGINE <-->|"Tool calls
JSON-RPC"| MCP_SERVER
MCP_SERVER --> MCP_TOOLS
-
- MCP_TOOLS -->|"HGETALL/MGET
Read data"| REDIS_CLUSTER
- REDIS_CLUSTER -->|"Config data
+ Changelog"| MCP_TOOLS
+
+ MCP_TOOLS -->|"1. Query Kafka Consumer API
GET /api/v1/data"| CONSUMER_GRP
+ MCP_TOOLS -.->|"2. Fallback Redis
MGET/HGETALL"| REDIS_CLUSTER
+
+ CONSUMER_GRP -->|"JSON Response
+ Timestamps"| MCP_TOOLS
+ REDIS_CLUSTER -.->|"Cached JSON
Fast response"| MCP_TOOLS
+
MCP_TOOLS -->|"Structured Data
+ Context"| LLM_ENGINE
-
+
subgraph OUTPUT["DOCUMENT GENERATION"]
TEMPLATE["Template Engine
Format: Jinja2
Templates: markdown/*.j2
Variables: from LLM"]:::llm
-
- MARKDOWN["Markdown Output
Format: CommonMark
Metadata: YAML frontmatter
Change summary included
Assets: diagrams in mermaid"]:::llm
-
- VALIDATOR["Doc Validator
• Markdown linting
• Link checking
• Schema validation
• Change verification"]:::llm
+
+ MARKDOWN["Markdown Output
Format: CommonMark
Metadata: YAML frontmatter
Assets: diagrams in mermaid"]:::llm
+
+ VALIDATOR["Doc Validator
• Markdown linting
• Link checking
• Schema validation"]:::llm
end
-
+
LLM_ENGINE --> TEMPLATE
TEMPLATE --> MARKDOWN
MARKDOWN --> VALIDATOR
-
+
%% =====================================
%% LAYER 6: GITOPS
%% =====================================
-
+
subgraph GITOPS["GITOPS WORKFLOW"]
GIT_REPO["GitLab Repository
URL: gitlab.com/docs/infra
Branch strategy: main + feature/*
Protected: main (require approval)"]:::git
-
+
GIT_API["GitLab API
API: v4
Auth: Project Access Token
Permissions: api, write_repository"]:::git
-
- PR_AUTO["Automated PR Creator
Lang: Python 3.11
Lib: python-gitlab
Template: .gitlab/merge_request.md
Include: change summary"]:::git
+
+ PR_AUTO["Automated PR Creator
Lang: Python 3.11
Lib: python-gitlab
Template: .gitlab/merge_request.md"]:::git
end
-
+
VALIDATOR -->|"git add/commit/push"| GIT_REPO
GIT_REPO <--> GIT_API
GIT_API --> PR_AUTO
-
- REVIEWER["Technical Reviewer
Role: Maintainer/Owner
Review: diff + validation
Check: change correlation
Approve: required (min 1)"]:::monitor
-
+
+ REVIEWER["Technical Reviewer
Role: Maintainer/Owner
Review: diff + validation
Approve: required (min 1)"]:::monitor
+
PR_AUTO -->|"Notification
Email + Slack"| REVIEWER
REVIEWER -->|"Merge to main"| GIT_REPO
-
+
%% =====================================
%% LAYER 7: CI/CD & PUBLISH
%% =====================================
-
+
subgraph CICD["CI/CD PIPELINE"]
GITLAB_CI["GitLab CI/CD
Runner: docker
Image: python:3.11-alpine
Stages: build, test, deploy"]:::git
-
+
PIPELINE_JOBS["Pipeline Jobs:
1. lint (markdownlint-cli)
2. build (mkdocs build)
3. test (link-checker)
4. deploy (rsync/s3)"]:::git
-
+
MKDOCS_CFG["MkDocs Config
Theme: material
Plugins: search, tags, mermaid
Extensions: admonition, codehilite"]:::git
end
-
+
GIT_REPO -->|"on: push to main
Webhook trigger"| GITLAB_CI
GITLAB_CI --> PIPELINE_JOBS
PIPELINE_JOBS --> MKDOCS_CFG
-
+
subgraph PUBLISH["PUBLICATION"]
STATIC_SITE["Static Site
Generator: MkDocs
Output: HTML/CSS/JS
Assets: optimized images"]:::git
-
+
CDN["GitLab Pages / S3 + CloudFront
URL: docs.company.com
SSL: Let's Encrypt
Cache: 1h"]:::git
-
+
SEARCH["Search Index
Engine: Algolia/Meilisearch
Update: on publish
API: REST"]:::git
end
-
+
MKDOCS_CFG -->|"mkdocs build
--strict"| STATIC_SITE
STATIC_SITE --> CDN
STATIC_SITE --> SEARCH
-
+
%% =====================================
%% LAYER 8: MONITORING & OBSERVABILITY
%% =====================================
-
+
subgraph OBSERVABILITY["MONITORING & LOGGING"]
- PROMETHEUS["Prometheus
Metrics: collector updates, changes detected
Scrape: 30s
Retention: 15d"]:::monitor
-
- GRAFANA["Grafana Dashboards
• Collector status
• Redis performance
• Change detection rate
• LLM response times
• Pipeline success rate"]:::monitor
-
+ PROMETHEUS["Prometheus
Metrics: collector lag, cache hit/miss
Scrape: 30s
Retention: 15d"]:::monitor
+
+ GRAFANA["Grafana Dashboards
• Kafka metrics
• Redis performance
• LLM response times
• Pipeline success rate"]:::monitor
+
ELK["ELK Stack
Logs: all components
Index: daily rotation
Retention: 30d"]:::monitor
-
- ALERTS["Alerting
• Collector failures
• Redis issues
• Change detection errors
• Pipeline failures
Channel: Slack + PagerDuty"]:::monitor
+
+ ALERTS["Alerting
• Connector failures
• Kafka lag > 10k
• Redis OOM
• Pipeline failures
Channel: Slack + PagerDuty"]:::monitor
end
-
+
CONN_VM -.->|"metrics"| PROMETHEUS
CONN_K8S -.->|"metrics"| PROMETHEUS
+ KAFKA_TOPICS -.->|"metrics"| PROMETHEUS
REDIS_CLUSTER -.->|"metrics"| PROMETHEUS
- DETECTOR -.->|"metrics"| PROMETHEUS
MCP_SERVER -.->|"metrics"| PROMETHEUS
GITLAB_CI -.->|"metrics"| PROMETHEUS
-
+
PROMETHEUS --> GRAFANA
-
+
CONN_VM -.->|"logs"| ELK
- DETECTOR -.->|"logs"| ELK
+ CONSUMER_GRP -.->|"logs"| ELK
MCP_SERVER -.->|"logs"| ELK
GITLAB_CI -.->|"logs"| ELK
-
+
GRAFANA --> ALERTS
-
+
%% =====================================
- %% SECURITY & EFFICIENCY ANNOTATIONS
+ %% SECURITY ANNOTATIONS
%% =====================================
-
+
SEC1["SECURITY:
• All APIs use TLS 1.3
• Secrets in Vault/K8s Secrets
• Network: private VPC
• LLM has NO direct access"]:::monitor
-
+
SEC2["AUTHENTICATION:
• API Tokens rotated 90d
• RBAC enforced
• Audit logs enabled
• MFA required for Git"]:::monitor
-
- EFF1["EFFICIENCY:
• Doc generation only on changes
• Debounce prevents spam
• Hash-based change detection
• Batch processing"]:::change
-
+
SEC1 -.-> MCP_SERVER
SEC2 -.-> GIT_REPO
- EFF1 -.-> DETECTOR
```
---
+## Conversational RAG System
+
+### Querying the Documentation with AI
+
+A system for "talking" to the documentation using Retrieval-Augmented Generation (RAG). Users ask questions in natural language and receive answers grounded in the documentation, with citations of the sources.
+
+#### Key Features
+
+- ✅ **Semantic Search**: vector search to capture the intent of the query
+- ✅ **Scalability**: handles large volumes of documentation (100k+ documents)
+- ✅ **Performance**: answers in under 3 seconds with intelligent caching
+- ✅ **Accuracy**: re-ranking and source attribution for precise answers
+- ✅ **Local LLM**: on-premise Qwen for privacy and control
+
+### RAG Diagram - Management View
+
+```mermaid
+graph TB
+ %% Styling
+ classDef docs fill:#e3f2fd,stroke:#1565c0,stroke-width:3px,color:#333
+ classDef process fill:#f3e5f5,stroke:#4a148c,stroke-width:3px,color:#333
+ classDef vector fill:#fff3e0,stroke:#e65100,stroke-width:3px,color:#333
+ classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:3px,color:#333
+ classDef user fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#333
+ classDef cache fill:#fce4ec,stroke:#880e4f,stroke-width:3px,color:#333
+
+ %% ========================================
+ %% INGESTION PIPELINE (Offline)
+ %% ========================================
+
+ subgraph INGESTION["INGESTION PIPELINE (Offline Process)"]
+ DOCS["DOCUMENTATION
MkDocs Output
Markdown Files"]:::docs
+
+ CHUNKER["DOCUMENT CHUNKER
Split & Overlap
Metadata Extraction"]:::process
+
+ EMBEDDER["EMBEDDING MODEL
Text → Vectors
Dimension: 768/1024"]:::process
+
+ VECTORDB[("VECTOR DATABASE
Qdrant/Milvus
Sharded & Replicated")]:::vector
+ end
+
+ DOCS -->|"Parse
Markdown"| CHUNKER
+ CHUNKER -->|"Text Chunks
+ Metadata"| EMBEDDER
+ EMBEDDER -->|"Store
Embeddings"| VECTORDB
+
+ %% ========================================
+ %% QUERY PIPELINE (Real-time)
+ %% ========================================
+
+ subgraph QUERY["QUERY PIPELINE (Real-time)"]
+ USER["USER
Question/Query"]:::user
+
+ QUERY_EMBED["QUERY EMBEDDING
Query → Vector"]:::process
+
+ SEARCH["SEMANTIC SEARCH
Vector Similarity
Top-K Results"]:::vector
+
+ RERANK["RE-RANKING
Context Scoring
Relevance Filter"]:::process
+
+ CONTEXT["CONTEXT BUILDER
Assemble Chunks
Add Metadata"]:::process
+ end
+
+ USER -->|"Natural Language
Question"| QUERY_EMBED
+ QUERY_EMBED -->|"Query Vector"| SEARCH
+ SEARCH -->|"Search"| VECTORDB
+ VECTORDB -->|"Top-K Chunks
+ Scores"| SEARCH
+ SEARCH -->|"Initial Results"| RERANK
+ RERANK -->|"Filtered
Chunks"| CONTEXT
+
+ %% ========================================
+ %% GENERATION (LLM)
+ %% ========================================
+
+ subgraph GENERATION["ANSWER GENERATION"]
+ LLM_RAG["LLM ENGINE
Qwen (Local)
+ RAG Context"]:::llm
+
+ ANSWER["ANSWER
Generated Answer
+ Source Citations"]:::llm
+ end
+
+ CONTEXT -->|"Context
+ Sources"| LLM_RAG
+ LLM_RAG -->|"Generate"| ANSWER
+ ANSWER -->|"Display"| USER
+
+ %% ========================================
+ %% CACHING & OPTIMIZATION
+ %% ========================================
+
+ CACHE[("REDIS CACHE
Query Cache
Embedding Cache")]:::cache
+
+ QUERY_EMBED -.->|"Check Cache"| CACHE
+ CACHE -.->|"Cached
Embedding"| SEARCH
+
+ SEARCH -.->|"Cache
Results"| CACHE
+
+ %% ========================================
+ %% SCALING & UPDATE
+ %% ========================================
+
+ UPDATE["INCREMENTAL UPDATE
On Doc Changes
Auto Re-index"]:::docs
+
+ DOCS -.->|"Doc Updated"| UPDATE
+ UPDATE -.->|"Re-process
Changed Docs"| CHUNKER
+
+ %% ========================================
+ %% ANNOTATIONS
+ %% ========================================
+
+ SCALE["SCALABILITY
• Vector DB sharding
• Horizontal scaling
• Load balancing"]:::vector
+
+ PERF["PERFORMANCE
• Query cache
• Embedding cache
• Async processing"]:::cache
+
+ QUALITY["✅ QUALITY
• Re-ranking
• Relevance scoring
• Source citations"]:::process
+
+ SCALE -.-> VECTORDB
+ PERF -.-> CACHE
+ QUALITY -.-> RERANK
+```
+
+### RAG Diagram - Technical View
+
+```mermaid
+graph TB
+ %% Styling
+ classDef docs fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#333,font-size:11px
+ classDef process fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#333,font-size:11px
+ classDef vector fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#333,font-size:11px
+ classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#333,font-size:11px
+ classDef user fill:#fff9c4,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px
+ classDef cache fill:#fce4ec,stroke:#880e4f,stroke-width:2px,color:#333,font-size:11px
+ classDef monitor fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px
+
+ %% =====================================
+ %% LAYER 1: DOCUMENTATION SOURCE
+ %% =====================================
+
+ subgraph DOCSOURCE["DOCUMENTATION SOURCE"]
+ MKDOCS_OUT["MkDocs Static Site
Path: /site/
Format: HTML + Markdown
Assets: images, diagrams
Update: on Git merge"]:::docs
+
+ DOC_WATCHER["Document Watcher
Lang: Python 3.11
Lib: watchdog
Trigger: file system events
Debounce: 30s"]:::docs
+
+ DOC_PARSER["Document Parser
HTML → Plain Text
Preserve structure
Extract metadata
Clean formatting"]:::docs
+ end
+
+ MKDOCS_OUT --> DOC_WATCHER
+ DOC_WATCHER -->|"New/Modified
Docs"| DOC_PARSER
+
+ %% =====================================
+ %% LAYER 2: CHUNKING STRATEGY
+ %% =====================================
+
+ subgraph CHUNKING["INTELLIGENT CHUNKING"]
+ CHUNK_ENGINE["Chunking Engine
Lang: Python 3.11
Lib: langchain/llama-index
Strategy: Recursive Character"]:::process
+
+ CHUNK_CONFIG["Chunking Config:
• Chunk Size: 512 tokens
• Overlap: 128 tokens
• Separators: \\n\\n, \\n, . , ' '
• Min chunk: 100 tokens
• Max chunk: 1024 tokens"]:::process
+
+ METADATA_EXTRACTOR["Metadata Extractor
Extract:
• Document title
• Section headers
• Tags/keywords
• Creation date
• File path
• Doc type"]:::process
+ end
+
+ DOC_PARSER -->|"Parsed Text"| CHUNK_ENGINE
+ CHUNK_ENGINE --> CHUNK_CONFIG
+ CHUNK_ENGINE --> METADATA_EXTRACTOR
+
+ %% =====================================
+ %% LAYER 3: EMBEDDING GENERATION
+ %% =====================================
+
+ subgraph EMBEDDING["EMBEDDING GENERATION"]
+ EMBED_MODEL["Embedding Model
Model: all-MiniLM-L6-v2 / BGE-M3
Dim: 384/768/1024
API: sentence-transformers
Batch size: 32
GPU: CUDA acceleration"]:::process
+
+ EMBED_CACHE["Embedding Cache
Type: Redis Hash
Key: hash(text)
TTL: 30d
Hit rate target: >80%"]:::cache
+
+ EMBED_QUEUE["Processing Queue
Type: Redis List
Workers: 4-8
Rate: 100 chunks/s
Retry: 3 attempts"]:::process
+ end
+
+ METADATA_EXTRACTOR -->|"Chunks
+ Metadata"| EMBED_QUEUE
+ EMBED_QUEUE --> EMBED_MODEL
+ EMBED_MODEL <-.->|"Cache
Check/Store"| EMBED_CACHE
+
+ %% =====================================
+ %% LAYER 4: VECTOR DATABASE
+ %% =====================================
+
+ subgraph VECTORDB["VECTOR DATABASE CLUSTER"]
+ QDRANT["Qdrant Cluster
Version: 1.7+
Nodes: 3-6 (replicated)
Shards: auto per collection
Port: 6333/6334"]:::vector
+
+ COLLECTIONS["Collections:
• docs_main (dim: 768)
• docs_code (dim: 768)
• docs_api (dim: 768)
Distance: Cosine
Index: HNSW (M=16, ef=100)"]:::vector
+
+ SHARD_STRATEGY["Sharding Strategy:
• Auto-sharding enabled
• Shard size: 100k vectors
• Replication factor: 2
• Load balancing: Round-robin"]:::vector
+ end
+
+ EMBED_MODEL -->|"Store
Vectors"| QDRANT
+ QDRANT --> COLLECTIONS
+ QDRANT --> SHARD_STRATEGY
+
+ %% =====================================
+ %% LAYER 5: QUERY PROCESSING
+ %% =====================================
+
+ subgraph QUERYPROC["QUERY PROCESSING PIPELINE"]
+ USER_INPUT["User Input
Interface: Web UI / API
Auth: JWT tokens
Rate limit: 20 req/min
Timeout: 30s"]:::user
+
+ QUERY_PREPROCESS["Query Preprocessor
• Spelling correction
• Intent detection
• Query expansion
• Language detection"]:::process
+
+ QUERY_EMBEDDER["Query Embedder
Same model as docs
Cache: Redis
Latency: <50ms"]:::process
+
+ HYBRID_SEARCH["Hybrid Search
1. Vector search (semantic)
2. Keyword search (BM25)
3. Fusion: RRF algorithm
Top-K: 20 initial results"]:::vector
+ end
+
+ USER_INPUT -->|"Natural
Language"| QUERY_PREPROCESS
+ QUERY_PREPROCESS --> QUERY_EMBEDDER
+ QUERY_EMBEDDER <-.->|"Cache"| EMBED_CACHE
+ QUERY_EMBEDDER -->|"Query
Vector"| HYBRID_SEARCH
+ HYBRID_SEARCH -->|"Search"| QDRANT
+
+ %% =====================================
+ %% LAYER 6: RE-RANKING & FILTERING
+ %% =====================================
+
+ subgraph RERANK["RE-RANKING & FILTERING"]
+ RERANKER["Cross-Encoder Re-ranker
Model: ms-marco-MiniLM
Purpose: Fine-grained relevance
Process: Top-20 → Top-5
Latency: 100-200ms"]:::process
+
+ FILTER_ENGINE["Filter Engine
• Relevance threshold: >0.7
• Deduplication
• Diversity scoring
• Metadata filtering"]:::process
+
+ CONTEXT_BUILDER["Context Builder
• Assemble top chunks
• Add source citations
• Format for LLM
• Max context: 4k tokens"]:::process
+ end
+
+ QDRANT -->|"Top-K
Results"| RERANKER
+ RERANKER --> FILTER_ENGINE
+ FILTER_ENGINE --> CONTEXT_BUILDER
+
+ %% =====================================
+ %% LAYER 7: LLM GENERATION
+ %% =====================================
+
+ subgraph LLMGEN["LLM ANSWER GENERATION"]
+ RAG_PROMPT["RAG Prompt Template
Structure:
• System: You are a helpful assistant
• Context: Retrieved chunks
• Question: User query
• Instruction: Answer using context"]:::llm
+
+ LLM_ENGINE["LLM Engine
Model: Qwen 2.5 (14B/32B)
API: Ollama/vLLM
Port: 11434
Temp: 0.2 (factual)
Max tokens: 2048
Stream: enabled"]:::llm
+
+ ANSWER_POST["Answer Post-processor
• Citation formatting
• Source links
• Confidence scoring
• Fallback handling"]:::llm
+ end
+
+ CONTEXT_BUILDER -->|"Context
+ Sources"| RAG_PROMPT
+ QUERY_PREPROCESS -->|"Original
Question"| RAG_PROMPT
+ RAG_PROMPT --> LLM_ENGINE
+ LLM_ENGINE --> ANSWER_POST
+ ANSWER_POST -->|"Final
Answer"| USER_INPUT
+
+ %% =====================================
+ %% LAYER 8: CACHING LAYER
+ %% =====================================
+
+ subgraph CACHING["MULTI-LEVEL CACHE"]
+ REDIS_CACHE["Redis Cluster
Mode: Cluster
Nodes: 3
Memory: 16GB
Persistence: AOF"]:::cache
+
+ CACHE_TYPES["Cache Types:
• Query embeddings (TTL: 7d)
• Search results (TTL: 1h)
• LLM responses (TTL: 24h)
• Popular queries (no TTL)
Eviction: LRU"]:::cache
+
+ CACHE_WARMING["Cache Warming
Pre-compute:
• Top 100 queries
• Common patterns
Schedule: daily
Update: on doc changes"]:::cache
+ end
+
+ REDIS_CACHE --> CACHE_TYPES
+ CACHE_TYPES --> CACHE_WARMING
+
+ QUERY_EMBEDDER <-.-> REDIS_CACHE
+ HYBRID_SEARCH <-.-> REDIS_CACHE
+ LLM_ENGINE <-.-> REDIS_CACHE
+
+ %% =====================================
+ %% LAYER 9: SCALING & LOAD BALANCING
+ %% =====================================
+
+ subgraph SCALING["SCALING INFRASTRUCTURE"]
+ LOAD_BALANCER["Load Balancer
Type: Nginx / HAProxy
Algorithm: Least connections
Health checks: /health
Timeout: 30s"]:::monitor
+
+ QUERY_API["Query API Instances
Replicas: 3-10 (auto-scale)
Lang: FastAPI
Container: Docker
Orchestration: K8s"]:::user
+
+ EMBED_WORKERS["Embedding Workers
Replicas: 4-8
GPU: Optional
Queue: Redis
Auto-scale: based on queue depth"]:::process
+ end
+
+ LOAD_BALANCER --> QUERY_API
+ QUERY_API --> USER_INPUT
+
+ %% =====================================
+ %% LAYER 10: MONITORING & OBSERVABILITY
+ %% =====================================
+
+ subgraph MONITORING["MONITORING & ANALYTICS"]
+ METRICS["Prometheus Metrics
• Query latency (p50, p95, p99)
• Vector search time
• LLM response time
• Cache hit rate
• Embedding generation rate
Scrape: 15s"]:::monitor
+
+ DASHBOARDS["Grafana Dashboards
• RAG Performance
• Query analytics
• Resource utilization
• Error tracking
Refresh: real-time"]:::monitor
+
+ ANALYTICS["Query Analytics
Track:
• Popular queries
• Failed queries
• Avg relevance scores
• User satisfaction
Storage: TimescaleDB"]:::monitor
+
+ ALERTS["Alerting Rules
• Latency > 5s
• Error rate > 5%
• Cache hit < 70%
• Vector DB down
Channel: Slack + Email"]:::monitor
+ end
+
+ METRICS --> DASHBOARDS
+ DASHBOARDS --> ANALYTICS
+ ANALYTICS --> ALERTS
+
+ QUERY_API -.->|"metrics"| METRICS
+ HYBRID_SEARCH -.->|"metrics"| METRICS
+ LLM_ENGINE -.->|"metrics"| METRICS
+ QDRANT -.->|"metrics"| METRICS
+
+ %% =====================================
+ %% LAYER 11: FEEDBACK LOOP
+ %% =====================================
+
+ subgraph FEEDBACK["FEEDBACK & IMPROVEMENT"]
+ USER_FEEDBACK["User Feedback
• Thumbs up/down
• Relevance rating
• Comments
Storage: PostgreSQL"]:::user
+
+ FEEDBACK_ANALYSIS["Feedback Analysis
• Identify bad answers
• Track improvement areas
• A/B testing results
Schedule: weekly"]:::monitor
+
+ MODEL_TUNING["Model Fine-tuning
• Re-rank model updates
• Prompt optimization
• Chunk size tuning
Cycle: monthly"]:::process
+ end
+
+ USER_INPUT -->|"Rate
Answer"| USER_FEEDBACK
+ USER_FEEDBACK --> FEEDBACK_ANALYSIS
+ FEEDBACK_ANALYSIS --> MODEL_TUNING
+ MODEL_TUNING -.->|"Improve"| RERANKER
+
+ %% =====================================
+ %% ANNOTATIONS
+ %% =====================================
+
+ SCALE_NOTE["SCALABILITY:
• Vector DB: Horizontal sharding
• API: K8s auto-scaling (HPA)
• Workers: Queue-based scaling
• Cache: Redis cluster
Target: 100k+ docs, 1k+ QPS"]:::monitor
+
+ PERF_NOTE["PERFORMANCE TARGETS:
• Query latency: <3s (p95)
• Vector search: <100ms
• LLM generation: <2s
• Cache hit rate: >80%
• Throughput: 1000 QPS"]:::cache
+
+ QUALITY_NOTE["✅ QUALITY ASSURANCE:
• Re-ranking for precision
• Source attribution
• Confidence scoring
• Fallback responses
• Human feedback loop"]:::process
+
+ SCALE_NOTE -.-> QDRANT
+ PERF_NOTE -.-> REDIS_CACHE
+ QUALITY_NOTE -.-> RERANKER
+```
+
+### RAG Pipeline
+
+**1. Ingestion Pipeline (Offline)**
+
+- Parsing of the MkDocs documentation
+- Intelligent chunking (512-token chunks, 128-token overlap)
+- Embedding generation (all-MiniLM-L6-v2)
+- Storage in the vector database (Qdrant cluster)
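
The chunking step above (512-token chunks with a 128-token overlap) can be illustrated with a minimal sliding-window sketch. It is a simplification, not the pipeline's implementation: the diagram names a recursive character splitter from langchain/llama-index, whereas this version uses a fixed window, and whitespace splitting stands in for the real tokenizer. The function names `chunk_tokens` and `chunk_text` are hypothetical.

```python
from typing import List


def chunk_tokens(tokens: List[str], size: int = 512, overlap: int = 128) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the current window already reaches the end of the input
    return chunks


def chunk_text(text: str, size: int = 512, overlap: int = 128) -> List[str]:
    """Assumption: whitespace-separated words approximate model tokens."""
    return [" ".join(chunk) for chunk in chunk_tokens(text.split(), size, overlap)]
```

With the default settings each window starts 384 tokens after the previous one, so the last 128 tokens of a chunk reappear at the start of the next.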
+
+**2. Query Pipeline (Real-time)**
+
+- Embedding of the user query
+- Hybrid search (semantic + keyword)
+- Re-ranking with a cross-encoder
+- Context assembly for the LLM
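
The technical diagram names RRF (Reciprocal Rank Fusion) as the way the vector and keyword (BM25) result lists are merged. A minimal sketch of that fusion follows. The function name, the constant `k = 60` (the value commonly used for RRF, not stated in this document), and the tie-breaking rule are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60, top_k: int = 20
) -> List[str]:
    """Fuse ranked lists of document IDs: each list contributes 1 / (k + rank) per document."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top_k]


# Hypothetical result lists from the two retrievers.
vector_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Documents that appear in both lists accumulate two contributions, so they tend to outrank documents found by only one retriever.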
+
+**3. Generation**
+
+- Local LLM (Qwen) with RAG context
+- Automatic source attribution
+- Streaming of responses
+
+**4. Scaling Strategy**
+
+- Automatic vector DB sharding
+- API instances with K8s auto-scaling
+- Redis cluster for multi-level caching
+- Load balancing with Nginx
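
The multi-level caching idea can be sketched with a small in-memory stand-in for Redis. The TTLs (7 days for query embeddings, 1 hour for search results, 24 hours for LLM responses) are taken from the technical diagram. The key layout `rag:<level>:<sha256>`, the query normalization and the `TTLCache` class are assumptions made for this example.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Optional, Tuple

# TTLs per cache level, in seconds (values from the technical diagram).
TTL_SECONDS = {"embedding": 7 * 24 * 3600, "search": 3600, "answer": 24 * 3600}


def cache_key(level: str, query: str) -> str:
    """Normalize case and whitespace so trivially different queries share one entry."""
    normalized = " ".join(query.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"rag:{level}:{digest}"


class TTLCache:
    """In-memory stand-in for Redis; a real deployment would let Redis expire the keys."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, level: str, query: str, value: Any) -> None:
        expires_at = self._clock() + TTL_SECONDS[level]
        self._store[cache_key(level, query)] = (expires_at, value)

    def get(self, level: str, query: str) -> Optional[Any]:
        key = cache_key(level, query)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the expired entry lazily on read
            return None
        return value
```

Injecting the clock keeps the expiry logic testable without waiting for real time to pass.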
+
+---
+
## Contacts
- **Team**: Infrastructure Documentation Team
@@ -386,5 +725,5 @@ graph TB
---
-**Version**: 1.0.0
-**Last updated**: 2025-10-28
\ No newline at end of file
+**Version**: 1.0.0
+**Last updated**: 2025-10-28