Update scheme.md

2025-10-28 11:38:04 +00:00
parent 40824d991f
commit 6519a4856b

scheme.md

@@ -48,84 +48,72 @@ The system is divided into **3 main flows**:
Simplified schema for executive and management presentations.
```mermaid
graph TB
%% Styling
classDef infrastructure fill:#e1f5ff,stroke:#01579b,stroke-width:3px,color:#333
classDef kafka fill:#fff3e0,stroke:#e65100,stroke-width:3px,color:#333
classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:3px,color:#333
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:3px,color:#333
classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:3px,color:#333
classDef human fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#333
%% ========================================
%% FLUSSO 1: RACCOLTA DATI (Background)
%% ========================================
INFRA[("🏢 SISTEMI<br/>INFRASTRUTTURALI<br/><br/>VMware | K8s | Linux | Cisco")]:::infrastructure
CONN["🔌 CONNETTORI<br/>Polling Automatico"]:::infrastructure
KAFKA[("📨 APACHE KAFKA<br/>Message Broker<br/>+ Persistenza")]:::kafka
CONSUMER["⚙️ KAFKA CONSUMER<br/>Processor Service"]:::kafka
REDIS[("💾 REDIS CACHE<br/>(Opzionale)<br/>Performance Layer")]:::cache
INFRA -->|"API Polling<br/>Continuo"| CONN
CONN -->|"Publish<br/>Eventi"| KAFKA
KAFKA -->|"Consume<br/>Stream"| CONSUMER
CONSUMER -.->|"Update<br/>Opzionale"| REDIS
%% ========================================
%% FLUSSO 2: GENERAZIONE DOCUMENTAZIONE
%% ========================================
USER["👤 UTENTE<br/>Richiesta Doc"]:::human
LLM["🤖 LLM ENGINE<br/>Claude / GPT"]:::llm
MCP["🔧 MCP SERVER<br/>API Control Platform"]:::llm
DOC["📄 DOCUMENTO<br/>Markdown Generato"]:::llm
USER -->|"1. Prompt"| LLM
LLM -->|"2. Tool Call"| MCP
MCP -->|"3a. Query"| KAFKA
MCP -.->|"3b. Query<br/>Fast"| REDIS
KAFKA -->|"4a. Dati"| MCP
REDIS -.->|"4b. Dati"| MCP
MCP -->|"5. Context"| LLM
LLM -->|"6. Genera"| DOC
%% ========================================
%% FLUSSO 3: VALIDAZIONE E PUBBLICAZIONE
%% ========================================
GIT["📦 GITLAB<br/>Repository"]:::git
PR["🔀 PULL REQUEST<br/>Review Automatica"]:::git
TECH["👨‍💼 TEAM TECNICO<br/>Validazione Umana"]:::human
PIPELINE["⚡ CI/CD PIPELINE<br/>GitLab Runner"]:::git
MKDOCS["📚 MKDOCS<br/>Static Site Generator"]:::git
WEB["🌐 DOCUMENTAZIONE<br/>GitLab Pages<br/>(Pubblicata)"]:::git
DOC -->|"Push +<br/>Branch"| GIT
GIT -->|"Crea"| PR
PR -->|"Notifica"| TECH
@@ -133,18 +121,18 @@ graph TB
GIT -->|"Trigger"| PIPELINE
PIPELINE -->|"Build"| MKDOCS
MKDOCS -->|"Deploy"| WEB
%% ========================================
%% ANNOTAZIONI SICUREZZA
%% ========================================
SECURITY["🔒 SICUREZZA<br/>LLM isolato dai sistemi live"]:::human
PERF["⚡ PERFORMANCE<br/>Cache Redis opzionale"]:::cache
LLM -.->|"NESSUN<br/>ACCESSO"| INFRA
SECURITY -.-> LLM
PERF -.-> REDIS
```
---
@@ -155,229 +143,580 @@ graph TB
Detailed schema for the technical team, with implementation specifics.
```mermaid
graph TB
%% Styling tecnico
classDef infra fill:#e1f5ff,stroke:#01579b,stroke-width:2px,color:#333,font-size:11px
classDef connector fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#333,font-size:11px
classDef kafka fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#333,font-size:11px
classDef cache fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#333,font-size:11px
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#333,font-size:11px
classDef git fill:#fce4ec,stroke:#880e4f,stroke-width:2px,color:#333,font-size:11px
classDef monitor fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px
%% =====================================
%% LAYER 1: SISTEMI SORGENTE
%% =====================================
subgraph SOURCES["🏢 INFRASTRUCTURE SOURCES"]
VCENTER["VMware vCenter<br/>API: vSphere REST 7.0+<br/>Port: 443/HTTPS<br/>Auth: API Token"]:::infra
K8S_API["Kubernetes API<br/>API: v1.28+<br/>Port: 6443/HTTPS<br/>Auth: ServiceAccount + RBAC"]:::infra
LINUX["Linux Servers<br/>Protocol: SSH/Ansible<br/>Port: 22<br/>Auth: SSH Keys"]:::infra
CISCO["Cisco Devices<br/>Protocol: NETCONF/RESTCONF<br/>Port: 830/443<br/>Auth: AAA"]:::infra
end
%% =====================================
%% LAYER 2: CONNETTORI
%% =====================================
subgraph CONNECTORS["🔌 DATA COLLECTORS (Python/Go)"]
CONN_VM["VMware Collector<br/>Lang: Python 3.11<br/>Lib: pyvmomi<br/>Schedule: */15 * * * *<br/>Output: JSON"]:::connector
CONN_K8S["K8s Collector<br/>Lang: Python 3.11<br/>Lib: kubernetes-client<br/>Schedule: */5 * * * *<br/>Resources: pods,svc,ing,deploy"]:::connector
CONN_LNX["Linux Collector<br/>Lang: Python 3.11<br/>Lib: paramiko/ansible<br/>Schedule: */30 * * * *<br/>Data: sysinfo,packages,services"]:::connector
CONN_CSC["Cisco Collector<br/>Lang: Python 3.11<br/>Lib: ncclient<br/>Schedule: */30 * * * *<br/>Data: interfaces,routing,vlans"]:::connector
end
VCENTER -->|"GET /api/vcenter/vm"| CONN_VM
K8S_API -->|"kubectl proxy<br/>API calls"| CONN_K8S
LINUX -->|"SSH batch<br/>commands"| CONN_LNX
CISCO -->|"NETCONF<br/>get-config"| CONN_CSC
%% =====================================
%% LAYER 3: MESSAGE BROKER
%% =====================================
subgraph MESSAGING["📨 KAFKA CLUSTER (3 brokers)"]
KAFKA_TOPICS["Kafka Topics:<br/>• vmware.inventory (P:6, R:3)<br/>• k8s.resources (P:12, R:3)<br/>• linux.systems (P:3, R:3)<br/>• cisco.network (P:3, R:3)<br/>Retention: 7 days<br/>Format: JSON + Schema Registry"]:::kafka
SCHEMA["Schema Registry<br/>Avro Schemas<br/>Versioning enabled<br/>Port: 8081"]:::kafka
end
CONN_VM -->|"Producer<br/>Batch 100 msg"| KAFKA_TOPICS
CONN_K8S -->|"Producer<br/>Batch 100 msg"| KAFKA_TOPICS
CONN_LNX -->|"Producer<br/>Batch 50 msg"| KAFKA_TOPICS
CONN_CSC -->|"Producer<br/>Batch 50 msg"| KAFKA_TOPICS
KAFKA_TOPICS <--> SCHEMA
%% =====================================
%% LAYER 4: PROCESSING & CACHE
%% =====================================
subgraph PROCESSING["⚙️ STREAM PROCESSING"]
CONSUMER_GRP["Kafka Consumer Group<br/>Group ID: doc-consumers<br/>Lang: Python 3.11<br/>Lib: kafka-python<br/>Workers: 6<br/>Commit: auto (5s)"]:::kafka
PROCESSOR["Data Processor<br/>• Validation<br/>• Transformation<br/>• Enrichment<br/>• Deduplication"]:::kafka
end
KAFKA_TOPICS -->|"Subscribe<br/>offset management"| CONSUMER_GRP
CONSUMER_GRP --> PROCESSOR
subgraph STORAGE["💾 CACHE LAYER (Optional)"]
REDIS_CLUSTER["Redis Cluster<br/>Mode: Cluster (6 nodes)<br/>Port: 6379<br/>Persistence: RDB + AOF<br/>Memory: 64GB<br/>Eviction: allkeys-lru"]:::cache
REDIS_KEYS["Key Structure:<br/>• vmware:vcenter-id:vms<br/>• k8s:cluster:namespace:resource<br/>• linux:hostname:info<br/>• cisco:device-id:config<br/>TTL: 1-24h based on type"]:::cache
end
PROCESSOR -.->|"SET/HSET<br/>Pipeline batch"| REDIS_CLUSTER
REDIS_CLUSTER --> REDIS_KEYS
%% =====================================
%% LAYER 5: LLM & MCP
%% =====================================
subgraph LLM_LAYER["🤖 AI GENERATION LAYER"]
LLM_ENGINE["LLM Engine<br/>Model: Claude Sonnet 4 / GPT-4<br/>API: Anthropic/OpenAI<br/>Temp: 0.3<br/>Max Tokens: 4096<br/>Timeout: 120s"]:::llm
MCP_SERVER["MCP Server<br/>Lang: TypeScript/Node.js<br/>Port: 3000<br/>Protocol: JSON-RPC 2.0<br/>Auth: JWT tokens"]:::llm
MCP_TOOLS["MCP Tools:<br/>• getVMwareInventory(vcenter)<br/>• getK8sResources(cluster,ns,type)<br/>• getLinuxSystemInfo(hostname)<br/>• getCiscoConfig(device,section)<br/>• queryTimeRange(start,end)<br/>Return: JSON + Metadata"]:::llm
end
LLM_ENGINE <-->|"Tool calls<br/>JSON-RPC"| MCP_SERVER
MCP_SERVER --> MCP_TOOLS
MCP_TOOLS -->|"1. Query Kafka Consumer API<br/>GET /api/v1/data"| CONSUMER_GRP
MCP_TOOLS -.->|"2. Fallback Redis<br/>MGET/HGETALL"| REDIS_CLUSTER
CONSUMER_GRP -->|"JSON Response<br/>+ Timestamps"| MCP_TOOLS
REDIS_CLUSTER -.->|"Cached JSON<br/>Fast response"| MCP_TOOLS
MCP_TOOLS -->|"Structured Data<br/>+ Context"| LLM_ENGINE
subgraph OUTPUT["📝 DOCUMENT GENERATION"]
TEMPLATE["Template Engine<br/>Format: Jinja2<br/>Templates: markdown/*.j2<br/>Variables: from LLM"]:::llm
MARKDOWN["Markdown Output<br/>Format: CommonMark<br/>Metadata: YAML frontmatter<br/>Assets: diagrams in mermaid"]:::llm
VALIDATOR["Doc Validator<br/>• Markdown linting<br/>• Link checking<br/>• Schema validation"]:::llm
end
LLM_ENGINE --> TEMPLATE
TEMPLATE --> MARKDOWN
MARKDOWN --> VALIDATOR
%% =====================================
%% LAYER 6: GITOPS
%% =====================================
subgraph GITOPS["🔄 GITOPS WORKFLOW"]
GIT_REPO["GitLab Repository<br/>URL: gitlab.com/docs/infra<br/>Branch strategy: main + feature/*<br/>Protected: main (require approval)"]:::git
GIT_API["GitLab API<br/>API: v4<br/>Auth: Project Access Token<br/>Permissions: api, write_repo"]:::git
PR_AUTO["Automated PR Creator<br/>Lang: Python 3.11<br/>Lib: python-gitlab<br/>Template: .gitlab/merge_request.md"]:::git
end
VALIDATOR -->|"git add/commit/push"| GIT_REPO
GIT_REPO <--> GIT_API
GIT_API --> PR_AUTO
REVIEWER["👨‍💼 Technical Reviewer<br/>Role: Maintainer/Owner<br/>Review: diff + validation<br/>Approve: required (min 1)"]:::monitor
PR_AUTO -->|"Notification<br/>Email + Slack"| REVIEWER
REVIEWER -->|"Merge to main"| GIT_REPO
%% =====================================
%% LAYER 7: CI/CD & PUBLISH
%% =====================================
subgraph CICD["⚡ CI/CD PIPELINE"]
GITLAB_CI["GitLab CI/CD<br/>Runner: docker<br/>Image: python:3.11-alpine<br/>Stages: build, test, deploy"]:::git
PIPELINE_JOBS["Pipeline Jobs:<br/>1. lint (markdownlint-cli)<br/>2. build (mkdocs build)<br/>3. test (link-checker)<br/>4. deploy (rsync/s3)"]:::git
MKDOCS_CFG["MkDocs Config<br/>Theme: material<br/>Plugins: search, tags, mermaid<br/>Extensions: admonition, codehilite"]:::git
end
GIT_REPO -->|"on: push to main<br/>Webhook trigger"| GITLAB_CI
GITLAB_CI --> PIPELINE_JOBS
PIPELINE_JOBS --> MKDOCS_CFG
subgraph PUBLISH["🌐 PUBLICATION"]
STATIC_SITE["Static Site<br/>Generator: MkDocs<br/>Output: HTML/CSS/JS<br/>Assets: optimized images"]:::git
CDN["GitLab Pages / S3 + CloudFront<br/>URL: docs.company.com<br/>SSL: Let's Encrypt<br/>Cache: 1h"]:::git
SEARCH["Search Index<br/>Engine: Algolia/Meilisearch<br/>Update: on publish<br/>API: REST"]:::git
end
MKDOCS_CFG -->|"mkdocs build<br/>--strict"| STATIC_SITE
STATIC_SITE --> CDN
STATIC_SITE --> SEARCH
%% =====================================
%% LAYER 8: MONITORING & OBSERVABILITY
%% =====================================
subgraph OBSERVABILITY["📊 MONITORING & LOGGING"]
PROMETHEUS["Prometheus<br/>Metrics: collector lag, cache hit/miss<br/>Scrape: 30s<br/>Retention: 15d"]:::monitor
GRAFANA["Grafana Dashboards<br/>• Kafka metrics<br/>• Redis performance<br/>• LLM response times<br/>• Pipeline success rate"]:::monitor
ELK["ELK Stack<br/>Logs: all components<br/>Index: daily rotation<br/>Retention: 30d"]:::monitor
ALERTS["Alerting<br/>• Connector failures<br/>• Kafka lag > 10k<br/>• Redis OOM<br/>• Pipeline failures<br/>Channel: Slack + PagerDuty"]:::monitor
end
CONN_VM -.->|"metrics"| PROMETHEUS
CONN_K8S -.->|"metrics"| PROMETHEUS
KAFKA_TOPICS -.->|"metrics"| PROMETHEUS
REDIS_CLUSTER -.->|"metrics"| PROMETHEUS
MCP_SERVER -.->|"metrics"| PROMETHEUS
GITLAB_CI -.->|"metrics"| PROMETHEUS
PROMETHEUS --> GRAFANA
CONN_VM -.->|"logs"| ELK
CONSUMER_GRP -.->|"logs"| ELK
MCP_SERVER -.->|"logs"| ELK
GITLAB_CI -.->|"logs"| ELK
GRAFANA --> ALERTS
%% =====================================
%% SECURITY ANNOTATIONS
%% =====================================
SEC1["🔒 SECURITY:<br/>• All APIs use TLS 1.3<br/>• Secrets in Vault/K8s Secrets<br/>• Network: private VPC<br/>• LLM has NO direct access"]:::monitor
SEC2["🔐 AUTHENTICATION:<br/>• API Tokens rotated 90d<br/>• RBAC enforced<br/>• Audit logs enabled<br/>• MFA required for Git"]:::monitor
SEC1 -.-> MCP_SERVER
SEC2 -.-> GIT_REPO
```
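The MCP data path above queries the Kafka consumer service first and only falls back to the optional Redis cache. A minimal sketch of that fallback logic in Python, assuming a hypothetical consumer REST endpoint (`/api/v1/data`, the path labeled in the diagram) and the key structure from the cache layer; hostnames and response shapes are illustrative, not the actual implementation:

```python
import json

import redis
import requests

# Assumed service address; the diagram only fixes the path GET /api/v1/data.
CONSUMER_API = "http://doc-consumers:8080/api/v1/data"
cache = redis.Redis(host="redis-cluster", port=6379, decode_responses=True)

def get_vmware_inventory(vcenter_id: str) -> dict:
    """Prefer fresh data from the consumer API; fall back to cached JSON."""
    try:
        resp = requests.get(
            CONSUMER_API,
            params={"source": "vmware", "id": vcenter_id},
            timeout=5,
        )
        resp.raise_for_status()
        return resp.json()
    except requests.RequestException:
        # Fallback path: HGETALL on the key layout shown in the cache layer.
        cached = cache.hgetall(f"vmware:{vcenter_id}:vms")
        if cached:
            return {field: json.loads(value) for field, value in cached.items()}
        raise  # no fresh data and no cache entry: propagate the error
```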
---
## 💬 Conversational RAG System
### Querying the Documentation with AI
A system for "talking" to the documentation using Retrieval Augmented Generation (RAG). Users ask questions in natural language and receive accurate answers grounded in the documentation, with source citations.
#### Key Features
- **Semantic Search**: vector search that captures query intent
- **Scalability**: handles large documentation volumes (100k+ documents)
- **Performance**: answers in <3 seconds thanks to intelligent caching (see the sketch below)
- **Accuracy**: re-ranking and source attribution for precise answers
- **Local LLM**: on-premise Qwen for privacy and control
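The caching behind the Performance bullet can be made concrete with a small sketch: query embeddings are cached in Redis under a hash of the text (Key: hash(text), as in the technical view below), so repeated questions skip the embedding model entirely. The model and the 7-day TTL follow the technical view; the key prefix is an assumption:

```python
import hashlib
import json

import redis
from sentence_transformers import SentenceTransformer

cache = redis.Redis(host="localhost", port=6379)
model = SentenceTransformer("all-MiniLM-L6-v2")  # same model as ingestion

def embed_query(text: str) -> list[float]:
    # Key: hash(text), mirroring the embedding-cache node in the technical view.
    key = "embed:" + hashlib.sha256(text.encode()).hexdigest()
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)  # cache hit: no model call at all
    vector = model.encode(text).tolist()
    cache.set(key, json.dumps(vector), ex=7 * 24 * 3600)  # TTL: 7d for query embeddings
    return vector
```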
### Schema RAG - Management View
```mermaid
graph TB
%% Styling
classDef docs fill:#e3f2fd,stroke:#1565c0,stroke-width:3px,color:#333
classDef process fill:#f3e5f5,stroke:#4a148c,stroke-width:3px,color:#333
classDef vector fill:#fff3e0,stroke:#e65100,stroke-width:3px,color:#333
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:3px,color:#333
classDef user fill:#fff9c4,stroke:#f57f17,stroke-width:3px,color:#333
classDef cache fill:#fce4ec,stroke:#880e4f,stroke-width:3px,color:#333
%% ========================================
%% INGESTION PIPELINE (Offline)
%% ========================================
subgraph INGESTION["📚 INGESTION PIPELINE (Offline Process)"]
DOCS["📄 DOCUMENTAZIONE<br/>MkDocs Output<br/>Markdown Files"]:::docs
CHUNKER["✂️ DOCUMENT CHUNKER<br/>Split & Overlap<br/>Metadata Extraction"]:::process
EMBEDDER["🧠 EMBEDDING MODEL<br/>Text → Vectors<br/>Dimensione: 768/1024"]:::process
VECTORDB[("🗄️ VECTOR DATABASE<br/>Qdrant/Milvus<br/>Sharded & Replicated")]:::vector
end
DOCS -->|"Parse<br/>Markdown"| CHUNKER
CHUNKER -->|"Text Chunks<br/>+ Metadata"| EMBEDDER
EMBEDDER -->|"Store<br/>Embeddings"| VECTORDB
%% ========================================
%% QUERY PIPELINE (Real-time)
%% ========================================
subgraph QUERY["💬 QUERY PIPELINE (Real-time)"]
USER["👤 UTENTE<br/>Domanda/Query"]:::user
QUERY_EMBED["🧠 QUERY EMBEDDING<br/>Query → Vector"]:::process
SEARCH["🔍 SEMANTIC SEARCH<br/>Vector Similarity<br/>Top-K Results"]:::vector
RERANK["📊 RE-RANKING<br/>Context Scoring<br/>Relevance Filter"]:::process
CONTEXT["📋 CONTEXT BUILDER<br/>Assemble Chunks<br/>Add Metadata"]:::process
end
USER -->|"Natural Language<br/>Question"| QUERY_EMBED
QUERY_EMBED -->|"Query Vector"| SEARCH
SEARCH -->|"Search"| VECTORDB
VECTORDB -->|"Top-K Chunks<br/>+ Scores"| SEARCH
SEARCH -->|"Initial Results"| RERANK
RERANK -->|"Filtered<br/>Chunks"| CONTEXT
%% ========================================
%% GENERATION (LLM)
%% ========================================
subgraph GENERATION["🤖 ANSWER GENERATION"]
LLM_RAG["🤖 LLM ENGINE<br/>Qwen (Locale)<br/>+ RAG Context"]:::llm
ANSWER["💡 RISPOSTA<br/>Generated Answer<br/>+ Source Citations"]:::llm
end
CONTEXT -->|"Context<br/>+ Sources"| LLM_RAG
LLM_RAG -->|"Generate"| ANSWER
ANSWER -->|"Display"| USER
%% ========================================
%% CACHING & OPTIMIZATION
%% ========================================
CACHE[("💾 REDIS CACHE<br/>Query Cache<br/>Embedding Cache")]:::cache
QUERY_EMBED -.->|"Check Cache"| CACHE
CACHE -.->|"Cached<br/>Embedding"| SEARCH
SEARCH -.->|"Cache<br/>Results"| CACHE
%% ========================================
%% SCALING & UPDATE
%% ========================================
UPDATE["🔄 INCREMENTAL UPDATE<br/>On Doc Changes<br/>Auto Re-index"]:::docs
DOCS -.->|"Doc Updated"| UPDATE
UPDATE -.->|"Re-process<br/>Changed Docs"| CHUNKER
%% ========================================
%% ANNOTATIONS
%% ========================================
SCALE["📈 SCALABILITÀ<br/>• Vector DB sharding<br/>• Horizontal scaling<br/>• Load balancing"]:::vector
PERF["⚡ PERFORMANCE<br/>• Query cache<br/>• Embedding cache<br/>• Async processing"]:::cache
QUALITY["✅ QUALITY<br/>• Re-ranking<br/>• Relevance scoring<br/>• Source citations"]:::process
SCALE -.-> VECTORDB
PERF -.-> CACHE
QUALITY -.-> RERANK
```
### Schema RAG - Technical View
```mermaid
graph TB
%% Styling
classDef docs fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#333,font-size:11px
classDef process fill:#f3e5f5,stroke:#4a148c,stroke-width:2px,color:#333,font-size:11px
classDef vector fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#333,font-size:11px
classDef llm fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#333,font-size:11px
classDef user fill:#fff9c4,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px
classDef cache fill:#fce4ec,stroke:#880e4f,stroke-width:2px,color:#333,font-size:11px
classDef monitor fill:#fff8e1,stroke:#f57f17,stroke-width:2px,color:#333,font-size:11px
%% =====================================
%% LAYER 1: DOCUMENTATION SOURCE
%% =====================================
subgraph DOCSOURCE["📚 DOCUMENTATION SOURCE"]
MKDOCS_OUT["MkDocs Static Site<br/>Path: /site/<br/>Format: HTML + Markdown<br/>Assets: images, diagrams<br/>Update: on Git merge"]:::docs
DOC_WATCHER["Document Watcher<br/>Lang: Python 3.11<br/>Lib: watchdog<br/>Trigger: file system events<br/>Debounce: 30s"]:::docs
DOC_PARSER["Document Parser<br/>HTML → Plain Text<br/>Preserve structure<br/>Extract metadata<br/>Clean formatting"]:::docs
end
MKDOCS_OUT --> DOC_WATCHER
DOC_WATCHER -->|"New/Modified<br/>Docs"| DOC_PARSER
%% =====================================
%% LAYER 2: CHUNKING STRATEGY
%% =====================================
subgraph CHUNKING["✂️ INTELLIGENT CHUNKING"]
CHUNK_ENGINE["Chunking Engine<br/>Lang: Python 3.11<br/>Lib: langchain/llama-index<br/>Strategy: Recursive Character"]:::process
CHUNK_CONFIG["Chunking Config:<br/>• Chunk Size: 512 tokens<br/>• Overlap: 128 tokens<br/>• Separators: \\n\\n, \\n, . , ' '<br/>• Min chunk: 100 tokens<br/>• Max chunk: 1024 tokens"]:::process
METADATA_EXTRACTOR["Metadata Extractor<br/>Extract:<br/>• Document title<br/>• Section headers<br/>• Tags/keywords<br/>• Creation date<br/>• File path<br/>• Doc type"]:::process
end
DOC_PARSER -->|"Parsed Text"| CHUNK_ENGINE
CHUNK_ENGINE --> CHUNK_CONFIG
CHUNK_ENGINE --> METADATA_EXTRACTOR
%% =====================================
%% LAYER 3: EMBEDDING GENERATION
%% =====================================
subgraph EMBEDDING["🧠 EMBEDDING GENERATION"]
EMBED_MODEL["Embedding Model<br/>Model: all-MiniLM-L6-v2 / BGE-M3<br/>Dim: 384/768/1024<br/>API: sentence-transformers<br/>Batch size: 32<br/>GPU: CUDA acceleration"]:::process
EMBED_CACHE["Embedding Cache<br/>Type: Redis Hash<br/>Key: hash(text)<br/>TTL: 30d<br/>Hit rate target: >80%"]:::cache
EMBED_QUEUE["Processing Queue<br/>Type: Redis List<br/>Workers: 4-8<br/>Rate: 100 chunks/s<br/>Retry: 3 attempts"]:::process
end
METADATA_EXTRACTOR -->|"Chunks<br/>+ Metadata"| EMBED_QUEUE
EMBED_QUEUE --> EMBED_MODEL
EMBED_MODEL <-.->|"Cache<br/>Check/Store"| EMBED_CACHE
%% =====================================
%% LAYER 4: VECTOR DATABASE
%% =====================================
subgraph VECTORDB["🗄️ VECTOR DATABASE CLUSTER"]
QDRANT["Qdrant Cluster<br/>Version: 1.7+<br/>Nodes: 3-6 (replicated)<br/>Shards: auto per collection<br/>Port: 6333/6334"]:::vector
COLLECTIONS["Collections:<br/>• docs_main (dim: 768)<br/>• docs_code (dim: 768)<br/>• docs_api (dim: 768)<br/>Distance: Cosine<br/>Index: HNSW (M=16, ef=100)"]:::vector
SHARD_STRATEGY["Sharding Strategy:<br/>• Auto-sharding enabled<br/>• Shard size: 100k vectors<br/>• Replication factor: 2<br/>• Load balancing: Round-robin"]:::vector
end
EMBED_MODEL -->|"Store<br/>Vectors"| QDRANT
QDRANT --> COLLECTIONS
QDRANT --> SHARD_STRATEGY
%% =====================================
%% LAYER 5: QUERY PROCESSING
%% =====================================
subgraph QUERYPROC["💬 QUERY PROCESSING PIPELINE"]
USER_INPUT["User Input<br/>Interface: Web UI / API<br/>Auth: JWT tokens<br/>Rate limit: 20 req/min<br/>Timeout: 30s"]:::user
QUERY_PREPROCESS["Query Preprocessor<br/>• Spelling correction<br/>• Intent detection<br/>• Query expansion<br/>• Language detection"]:::process
QUERY_EMBEDDER["Query Embedder<br/>Same model as docs<br/>Cache: Redis<br/>Latency: <50ms"]:::process
HYBRID_SEARCH["Hybrid Search<br/>1. Vector search (semantic)<br/>2. Keyword search (BM25)<br/>3. Fusion: RRF algorithm<br/>Top-K: 20 initial results"]:::vector
end
USER_INPUT -->|"Natural<br/>Language"| QUERY_PREPROCESS
QUERY_PREPROCESS --> QUERY_EMBEDDER
QUERY_EMBEDDER <-.->|"Cache"| EMBED_CACHE
QUERY_EMBEDDER -->|"Query<br/>Vector"| HYBRID_SEARCH
HYBRID_SEARCH -->|"Search"| QDRANT
%% =====================================
%% LAYER 6: RE-RANKING & FILTERING
%% =====================================
subgraph RERANK["📊 RE-RANKING & FILTERING"]
RERANKER["Cross-Encoder Re-ranker<br/>Model: ms-marco-MiniLM<br/>Purpose: Fine-grained relevance<br/>Process: Top-20 → Top-5<br/>Latency: 100-200ms"]:::process
FILTER_ENGINE["Filter Engine<br/>• Relevance threshold: >0.7<br/>• Deduplication<br/>• Diversity scoring<br/>• Metadata filtering"]:::process
CONTEXT_BUILDER["Context Builder<br/>• Assemble top chunks<br/>• Add source citations<br/>• Format for LLM<br/>• Max context: 4k tokens"]:::process
end
QDRANT -->|"Top-K<br/>Results"| RERANKER
RERANKER --> FILTER_ENGINE
FILTER_ENGINE --> CONTEXT_BUILDER
%% =====================================
%% LAYER 7: LLM GENERATION
%% =====================================
subgraph LLMGEN["🤖 LLM ANSWER GENERATION"]
RAG_PROMPT["RAG Prompt Template<br/>Structure:<br/>• System: You are a helpful assistant<br/>• Context: Retrieved chunks<br/>• Question: User query<br/>• Instruction: Answer using context"]:::llm
LLM_ENGINE["LLM Engine<br/>Model: Qwen 2.5 (14B/32B)<br/>API: Ollama/vLLM<br/>Port: 11434<br/>Temp: 0.2 (factual)<br/>Max tokens: 2048<br/>Stream: enabled"]:::llm
ANSWER_POST["Answer Post-processor<br/>• Citation formatting<br/>• Source links<br/>• Confidence scoring<br/>• Fallback handling"]:::llm
end
CONTEXT_BUILDER -->|"Context<br/>+ Sources"| RAG_PROMPT
QUERY_PREPROCESS -->|"Original<br/>Question"| RAG_PROMPT
RAG_PROMPT --> LLM_ENGINE
LLM_ENGINE --> ANSWER_POST
ANSWER_POST -->|"Final<br/>Answer"| USER_INPUT
%% =====================================
%% LAYER 8: CACHING LAYER
%% =====================================
subgraph CACHING["💾 MULTI-LEVEL CACHE"]
REDIS_CACHE["Redis Cluster<br/>Mode: Cluster<br/>Nodes: 3<br/>Memory: 16GB<br/>Persistence: AOF"]:::cache
CACHE_TYPES["Cache Types:<br/>• Query embeddings (TTL: 7d)<br/>• Search results (TTL: 1h)<br/>• LLM responses (TTL: 24h)<br/>• Popular queries (no TTL)<br/>Eviction: LRU"]:::cache
CACHE_WARMING["Cache Warming<br/>Pre-compute:<br/>• Top 100 queries<br/>• Common patterns<br/>Schedule: daily<br/>Update: on doc changes"]:::cache
end
REDIS_CACHE --> CACHE_TYPES
CACHE_TYPES --> CACHE_WARMING
QUERY_EMBEDDER <-.-> REDIS_CACHE
HYBRID_SEARCH <-.-> REDIS_CACHE
LLM_ENGINE <-.-> REDIS_CACHE
%% =====================================
%% LAYER 9: SCALING & LOAD BALANCING
%% =====================================
subgraph SCALING["📈 SCALING INFRASTRUCTURE"]
LOAD_BALANCER["Load Balancer<br/>Type: Nginx / HAProxy<br/>Algorithm: Least connections<br/>Health checks: /health<br/>Timeout: 30s"]:::monitor
QUERY_API["Query API Instances<br/>Replicas: 3-10 (auto-scale)<br/>Lang: FastAPI<br/>Container: Docker<br/>Orchestration: K8s"]:::user
EMBED_WORKERS["Embedding Workers<br/>Replicas: 4-8<br/>GPU: Optional<br/>Queue: Redis<br/>Auto-scale: based on queue depth"]:::process
end
LOAD_BALANCER --> QUERY_API
QUERY_API --> USER_INPUT
%% =====================================
%% LAYER 10: MONITORING & OBSERVABILITY
%% =====================================
subgraph MONITORING["📊 MONITORING & ANALYTICS"]
METRICS["Prometheus Metrics<br/>• Query latency (p50, p95, p99)<br/>• Vector search time<br/>• LLM response time<br/>• Cache hit rate<br/>• Embedding generation rate<br/>Scrape: 15s"]:::monitor
DASHBOARDS["Grafana Dashboards<br/>• RAG Performance<br/>• Query analytics<br/>• Resource utilization<br/>• Error tracking<br/>Refresh: real-time"]:::monitor
ANALYTICS["Query Analytics<br/>Track:<br/>• Popular queries<br/>• Failed queries<br/>• Avg relevance scores<br/>• User satisfaction<br/>Storage: TimescaleDB"]:::monitor
ALERTS["Alerting Rules<br/>• Latency > 5s<br/>• Error rate > 5%<br/>• Cache hit < 70%<br/>• Vector DB down<br/>Channel: Slack + Email"]:::monitor
end
METRICS --> DASHBOARDS
DASHBOARDS --> ANALYTICS
ANALYTICS --> ALERTS
QUERY_API -.->|"metrics"| METRICS
HYBRID_SEARCH -.->|"metrics"| METRICS
LLM_ENGINE -.->|"metrics"| METRICS
QDRANT -.->|"metrics"| METRICS
%% =====================================
%% LAYER 11: FEEDBACK LOOP
%% =====================================
subgraph FEEDBACK["🔄 FEEDBACK & IMPROVEMENT"]
USER_FEEDBACK["User Feedback<br/>• Thumbs up/down<br/>• Relevance rating<br/>• Comments<br/>Storage: PostgreSQL"]:::user
FEEDBACK_ANALYSIS["Feedback Analysis<br/>• Identify bad answers<br/>• Track improvement areas<br/>• A/B testing results<br/>Schedule: weekly"]:::monitor
MODEL_TUNING["Model Fine-tuning<br/>• Re-rank model updates<br/>• Prompt optimization<br/>• Chunk size tuning<br/>Cycle: monthly"]:::process
end
USER_INPUT -->|"Rate<br/>Answer"| USER_FEEDBACK
USER_FEEDBACK --> FEEDBACK_ANALYSIS
FEEDBACK_ANALYSIS --> MODEL_TUNING
MODEL_TUNING -.->|"Improve"| RERANKER
%% =====================================
%% ANNOTATIONS
%% =====================================
SCALE_NOTE["📈 SCALABILITY:<br/>• Vector DB: Horizontal sharding<br/>• API: K8s auto-scaling (HPA)<br/>• Workers: Queue-based scaling<br/>• Cache: Redis cluster<br/>Target: 100k+ docs, 1k+ QPS"]:::monitor
PERF_NOTE["⚡ PERFORMANCE TARGETS:<br/>• Query latency: <3s (p95)<br/>• Vector search: <100ms<br/>• LLM generation: <2s<br/>• Cache hit rate: >80%<br/>• Throughput: 1000 QPS"]:::cache
QUALITY_NOTE["✅ QUALITY ASSURANCE:<br/>• Re-ranking for precision<br/>• Source attribution<br/>• Confidence scoring<br/>• Fallback responses<br/>• Human feedback loop"]:::process
SCALE_NOTE -.-> QDRANT
PERF_NOTE -.-> REDIS_CACHE
QUALITY_NOTE -.-> RERANKER
```
### RAG Pipeline
**1. Ingestion Pipeline (Offline)**
- Parse the MkDocs documentation output
- Intelligent chunking (512 tokens, overlap of 128)
- Embedding generation (all-MiniLM-L6-v2)
- Storage in the vector database (Qdrant cluster)
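A minimal sketch of this offline pipeline, assuming the `langchain` text splitter and `qdrant-client`; the `docs_main` collection name comes from the technical view, everything else is illustrative. Note the splitter below measures characters, not tokens, unless a token-based length function is supplied:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=128)
model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
client = QdrantClient(host="localhost", port=6333)

client.recreate_collection(
    collection_name="docs_main",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

def ingest(doc_path: str, text: str) -> None:
    """Chunk one parsed document, embed the chunks, and upsert into Qdrant."""
    chunks = splitter.split_text(text)
    vectors = model.encode(chunks)
    points = [
        PointStruct(id=i, vector=v.tolist(), payload={"source": doc_path, "text": c})
        for i, (c, v) in enumerate(zip(chunks, vectors))
    ]
    client.upsert(collection_name="docs_main", points=points)
```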
**2. Query Pipeline (Real-time)**
- Embed the user query
- Hybrid search (semantic + keyword)
- Re-ranking with a cross-encoder
- Context assembly for the LLM
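The fusion step named in the technical view (RRF) is simple enough to show directly; a sketch that merges the semantic and keyword rankings by reciprocal rank, using the conventional constant k = 60:

```python
def rrf_fuse(semantic: list[str], keyword: list[str],
             k: int = 60, top_k: int = 20) -> list[str]:
    """Reciprocal Rank Fusion: each list contributes 1/(k + rank) per doc."""
    scores: dict[str, float] = {}
    for ranking in (semantic, keyword):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# A document ranked highly by both retrievers rises to the top:
print(rrf_fuse(["d3", "d1", "d7"], ["d3", "d9", "d1"]))  # ['d3', 'd1', 'd9', 'd7']
```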
**3. Generation**
- Local LLM (Qwen) with RAG context
- Automatic source attribution
- Streamed responses
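A sketch of the generation step against a local Qwen served by Ollama on port 11434 (temperature and token budget as in the technical view); the prompt wording and the chunk format are illustrative assumptions:

```python
import json

import requests

def answer(question: str, chunks: list[dict]) -> str:
    """Build the RAG prompt from retrieved chunks and stream a local answer."""
    context = "\n\n".join(f"[{c['source']}]\n{c['text']}" for c in chunks)
    prompt = (
        "Answer using ONLY the context below and cite sources in [brackets].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "qwen2.5:14b", "prompt": prompt, "stream": True,
              "options": {"temperature": 0.2, "num_predict": 2048}},
        stream=True,
        timeout=120,
    )
    # Ollama streams newline-delimited JSON; collect the "response" fragments.
    parts = []
    for line in resp.iter_lines():
        if line:
            parts.append(json.loads(line).get("response", ""))
    return "".join(parts)
```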
**4. Scaling Strategy**
- Automatic vector DB sharding
- API instances with K8s auto-scaling
- Redis cluster for multi-level caching
- Load balancing with Nginx
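A sketch of one query-API instance behind the balancer, assuming FastAPI as named in the technical view; the `/health` route matches the load balancer's health check, and the `/ask` handler is a stub where the pipeline sketches above would be wired in:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Query(BaseModel):
    question: str

@app.get("/health")
def health() -> dict:
    return {"status": "ok"}  # probed by the Nginx/HAProxy health check

@app.post("/ask")
def ask(q: Query) -> dict:
    # Replicas of this app scale via K8s HPA; wire the RAG pipeline here.
    return {"answer": f"(stub) received: {q.question}"}
```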
---
## 📧 Contacts
- **Team**: Infrastructure Documentation Team
@@ -386,5 +725,5 @@ graph TB
---
**Version**: 1.0.0
**Last updated**: 2025-10-28