API Service - FastAPI Backend
A RESTful API service built with FastAPI featuring automatic Swagger documentation, CRUD operations, and integrated LLM endpoints with OpenAI-compatible API support.
📋 Overview
This API backend provides:
- RESTful CRUD API: Items and Users management
- LLM Integration: OpenAI-compatible chat endpoints for AI-powered features
- Automatic Documentation: Swagger UI and ReDoc
- Data Validation: Pydantic models with type checking
- Health Monitoring: Health check endpoints
- Production Ready: Designed for Kubernetes deployment with API7 Gateway
✨ Features
REST API
- Items Management: Create, read, update, delete items
- Users Management: User CRUD operations
- Pagination Support: Query parameters for data filtering
- Validation: Automatic request/response validation with Pydantic
LLM Integration
- Chat Endpoint: OpenAI-compatible chat completions API
- Model Management: List available LLM models
- Token Tracking: Returns token usage per request
- Configurable: Supports custom OpenAI-compatible backends (Open WebUI, Ollama, etc.)
- Rate Limited: Designed to work with API7's AI rate limiting (100 tokens/60s)
Documentation
- Swagger UI: Interactive API documentation at /docs
- ReDoc: Alternative documentation at /redoc
- OpenAPI Schema: JSON schema at /openapi.json
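These documentation endpoints come from FastAPI's built-in OpenAPI support. Below is a minimal sketch of how the application might be declared so that Swagger UI, ReDoc, and the schema are served automatically; the title and version mirror the root endpoint's response and are assumptions about main.py, not its actual contents.
from fastapi import FastAPI

# Hypothetical app declaration; the real one lives in main.py.
# FastAPI serves /docs, /redoc, and /openapi.json automatically.
app = FastAPI(
    title="API Demo",
    version="1.0.0",
    description="CRUD + LLM demo API"
)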
🚀 Quick Start
Local Development
Prerequisites
python >= 3.8
pip
Install Dependencies
cd api
pip install -r requirements.txt
Run the Application
# Basic run
python main.py
# Or use uvicorn directly
uvicorn main:app --reload --host 0.0.0.0 --port 8001
# With custom port
uvicorn main:app --reload --port 8080
Access the API
- Root: http://localhost:8001/
- Swagger UI: http://localhost:8001/docs
- ReDoc: http://localhost:8001/redoc
- Health Check: http://localhost:8001/health
Docker
Build Image
docker build -t api-service .
Run Container
# Basic run
docker run -p 8001:8001 api-service
# With environment variables
docker run -p 8001:8001 \
-e OPENAI_API_BASE="http://host.docker.internal:11434/api" \
-e OPENAI_API_KEY="your-api-key" \
-e DEFAULT_LLM_MODEL="videogame-expert" \
api-service
🔌 API Endpoints
Information Endpoints
GET /
Root endpoint with API information.
Response:
{
"message": "Welcome to API Demo",
"version": "1.0.0",
"docs": "/docs",
"timestamp": "2025-10-07T10:00:00"
}
GET /health
Health check endpoint.
Response:
{
"status": "healthy",
"service": "api",
"timestamp": "2025-10-07T10:00:00"
}
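A minimal sketch of how a health endpoint like this can be defined; the route and payload match the response above, but the actual implementation in main.py may differ.
from datetime import datetime
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
async def health():
    # Matches the payload documented above; this is the path used by the
    # Kubernetes liveness/readiness probes shown later in this README.
    return {
        "status": "healthy",
        "service": "api",
        "timestamp": datetime.now().isoformat()
    }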
Items Endpoints
GET /items
Get all items.
Query Parameters:
- None (returns all items)
Response:
[
{
"id": 1,
"name": "Laptop",
"description": "High-performance laptop",
"price": 999.99,
"in_stock": true
},
{
"id": 2,
"name": "Mouse",
"description": "Wireless mouse",
"price": 29.99,
"in_stock": true
}
]
GET /items/{item_id}
Get a specific item by ID.
Response:
{
"id": 1,
"name": "Laptop",
"description": "High-performance laptop",
"price": 999.99,
"in_stock": true
}
POST /items
Create a new item.
Request Body:
{
"name": "Monitor",
"description": "4K Display",
"price": 299.99,
"in_stock": true
}
Response:
{
"id": 4,
"name": "Monitor",
"description": "4K Display",
"price": 299.99,
"in_stock": true
}
PUT /items/{item_id}
Update an existing item.
Request Body:
{
"name": "Monitor",
"description": "Updated 4K Display",
"price": 279.99,
"in_stock": true
}
DELETE /items/{item_id}
Delete an item.
Response:
{
"message": "Item deleted successfully"
}
Users Endpoints
GET /users
Get all users.
Response:
[
{
"id": 1,
"username": "john_doe",
"email": "john@example.com",
"active": true
}
]
GET /users/{user_id}
Get a specific user by ID.
POST /users
Create a new user.
Request Body:
{
"username": "jane_doe",
"email": "jane@example.com",
"active": true
}
LLM Endpoints
POST /llm/chat
Send a chat message to the LLM.
Request Body:
{
"prompt": "What is The Legend of Zelda?",
"max_tokens": 150,
"temperature": 0.7,
"model": "videogame-expert"
}
Response:
{
"response": "The Legend of Zelda is a high-fantasy action-adventure video game franchise created by Japanese game designers Shigeru Miyamoto and Takashi Tezuka...",
"tokens_used": 85,
"model": "videogame-expert",
"timestamp": "2025-10-07T10:00:00"
}
Rate Limiting: When deployed with API7 Gateway, this endpoint is limited to 100 tokens per 60 seconds using AI rate limiting.
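Internally the service forwards requests to an OpenAI-compatible backend (the Troubleshooting section tests $OPENAI_API_BASE/chat/completions directly). Below is a hedged sketch of what that forwarding could look like with httpx; the backend response fields follow the standard OpenAI chat completions schema, and the helper name and error handling are assumptions, not the service's actual code.
import os
import httpx

OPENAI_API_BASE = os.getenv("OPENAI_API_BASE", "http://localhost/api")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "your-api-key")

async def chat_completion(prompt: str, model: str, max_tokens: int = 150,
                          temperature: float = 0.7) -> dict:
    # Forward the prompt to the OpenAI-compatible backend and map the
    # reply onto the fields returned by POST /llm/chat.
    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.post(
            f"{OPENAI_API_BASE}/chat/completions",
            headers={"Authorization": f"Bearer {OPENAI_API_KEY}"},
            json={
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "max_tokens": max_tokens,
                "temperature": temperature,
            },
        )
        resp.raise_for_status()
        data = resp.json()
        return {
            "response": data["choices"][0]["message"]["content"],
            "tokens_used": data.get("usage", {}).get("total_tokens", 0),
            "model": model,
        }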
GET /llm/models
List available LLM models.
Response:
{
"models": [
{
"id": "videogame-expert",
"name": "Videogame Expert",
"max_tokens": 4096,
"provider": "Open WebUI"
}
],
"default_model": "videogame-expert",
"timestamp": "2025-10-07T10:00:00"
}
GET /llm/health
LLM service health check.
Response:
{
"status": "healthy",
"service": "llm-api",
"provider": "Open WebUI",
"endpoint": "http://localhost/api",
"default_model": "videogame-expert",
"rate_limit": "ai-rate-limiting enabled (100 tokens/60s)",
"timestamp": "2025-10-07T10:00:00"
}
🔧 Configuration
Environment Variables
Configure the API service using environment variables:
| Variable | Description | Default | Required |
|---|---|---|---|
| OPENAI_API_BASE | OpenAI-compatible API endpoint URL | http://localhost/api | No |
| OPENAI_API_KEY | API key for LLM service | your-api-key | No |
| DEFAULT_LLM_MODEL | Default LLM model ID | your-model-id | No |
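These variables are typically read once at startup. A small sketch of how the service could load them, using the defaults from the table above; the exact loading code in main.py is an assumption.
import os

# Defaults mirror the table above.
OPENAI_API_BASE = os.getenv("OPENAI_API_BASE", "http://localhost/api")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "your-api-key")
DEFAULT_LLM_MODEL = os.getenv("DEFAULT_LLM_MODEL", "your-model-id")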
Example Configuration
Development:
export OPENAI_API_BASE="http://localhost:11434/api"
export OPENAI_API_KEY="not-required-for-ollama"
export DEFAULT_LLM_MODEL="llama2"
Production (Open WebUI):
export OPENAI_API_BASE="https://openwebui.example.com/api"
export OPENAI_API_KEY="sk-xxxxxxxxxxxxxxxx"
export DEFAULT_LLM_MODEL="videogame-expert"
Kubernetes Deployment:
env:
  - name: OPENAI_API_BASE
    value: "http://openwebui.ai:8080/api"
  - name: OPENAI_API_KEY
    valueFrom:
      secretKeyRef:
        name: llm-secrets
        key: api-key
  - name: DEFAULT_LLM_MODEL
    value: "videogame-expert"
📊 Data Models
Item Model
class Item(BaseModel):
    id: Optional[int] = None
    name: str
    description: Optional[str] = None
    price: float
    in_stock: bool = True
User Model
class User(BaseModel):
    id: Optional[int] = None
    username: str
    email: str
    active: bool = True
LLM Request Model
class LLMRequest(BaseModel):
    prompt: str
    max_tokens: Optional[int] = 150
    temperature: Optional[float] = 0.7
    model: Optional[str] = DEFAULT_MODEL
LLM Response Model
class LLMResponse(BaseModel):
    response: str
    tokens_used: int
    model: str
    timestamp: str
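To illustrate how these models plug into the CRUD routes, here is a hedged sketch of the /items handlers backed by an in-memory list (the actual service appears to seed its data from db.json per the Dockerfile; the ID assignment, status codes, and storage details here are assumptions).
from typing import List, Optional
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class Item(BaseModel):
    id: Optional[int] = None
    name: str
    description: Optional[str] = None
    price: float
    in_stock: bool = True

items: List[Item] = []  # in-memory store for this sketch only

@app.get("/items", response_model=List[Item])
async def list_items():
    return items

@app.post("/items", response_model=Item)
async def create_item(item: Item):
    # Assign the next available ID and store the item.
    item.id = max((i.id or 0 for i in items), default=0) + 1
    items.append(item)
    return item

@app.get("/items/{item_id}", response_model=Item)
async def get_item(item_id: int):
    for i in items:
        if i.id == item_id:
            return i
    raise HTTPException(status_code=404, detail="Item not found")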
🧪 Testing
cURL Examples
Create Item:
curl -X POST "http://localhost:8001/items" \
-H "Content-Type: application/json" \
-d '{
"name": "Keyboard",
"description": "Mechanical keyboard",
"price": 79.99,
"in_stock": true
}'
Get Items:
curl "http://localhost:8001/items"
Update Item:
curl -X PUT "http://localhost:8001/items/1" \
-H "Content-Type: application/json" \
-d '{
"name": "Laptop Pro",
"description": "Updated laptop",
"price": 1099.99,
"in_stock": true
}'
Delete Item:
curl -X DELETE "http://localhost:8001/items/3"
LLM Chat:
curl -X POST "http://localhost:8001/llm/chat" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Tell me about Mario Bros",
"max_tokens": 100,
"temperature": 0.7,
"model": "videogame-expert"
}'
Health Checks:
curl "http://localhost:8001/health"
curl "http://localhost:8001/llm/health"
Python Testing
import asyncio
import httpx

async def main():
    async with httpx.AsyncClient() as client:
        # Test item creation
        response = await client.post(
            "http://localhost:8001/items",
            json={
                "name": "Monitor",
                "description": "4K Display",
                "price": 299.99,
                "in_stock": True
            }
        )
        print(response.json())

        # Test LLM chat
        response = await client.post(
            "http://localhost:8001/llm/chat",
            json={
                "prompt": "What is Minecraft?",
                "max_tokens": 150,
                "model": "videogame-expert"
            }
        )
        print(response.json())

asyncio.run(main())
🐳 Docker
Dockerfile
The Dockerfile uses the Python 3.11 slim base image and installs dependencies directly:
FROM python:3.11-slim
WORKDIR /app
# Install dependencies
RUN pip install --no-cache-dir fastapi uvicorn[standard] pydantic
# Copy application
COPY db.json .
COPY main.py .
EXPOSE 8001
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8001"]
Build and Run
# Build
docker build -t api-service:latest .
# Run with environment variables
docker run -d \
--name api-service \
-p 8001:8001 \
-e OPENAI_API_BASE="http://host.docker.internal:11434/api" \
-e OPENAI_API_KEY="your-key" \
api-service:latest
# View logs
docker logs -f api-service
# Stop and remove
docker stop api-service
docker rm api-service
☸️ Kubernetes Deployment
Basic Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
  namespace: api7ee
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: git.commandware.com/demos/api7-demo/api:main
          ports:
            - containerPort: 8001
              name: http
          env:
            - name: OPENAI_API_BASE
              value: "http://openwebui.ai:8080/api"
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: llm-secrets
                  key: api-key
            - name: DEFAULT_LLM_MODEL
              value: "videogame-expert"
          livenessProbe:
            httpGet:
              path: /health
              port: 8001
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8001
            initialDelaySeconds: 10
            periodSeconds: 5
          resources:
            limits:
              cpu: 1000m
              memory: 1Gi
            requests:
              cpu: 500m
              memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
  name: api-service
  namespace: api7ee
spec:
  type: ClusterIP
  ports:
    - port: 8080
      targetPort: 8001
      name: http
  selector:
    app: api
With Helm
See the Helm chart README for full deployment options.
helm install api7ee-demo ./helm/api7ee-demo-k8s \
--set api.image.tag=v1.0.0 \
--set api.replicaCount=5 \
--namespace api7ee
🔒 Security
Best Practices
- Environment Variables: Store sensitive data (API keys) in Kubernetes Secrets
- Non-root User: Container runs as non-root user (UID 1000)
- Read-only Filesystem: Root filesystem is read-only
- Input Validation: All requests validated with Pydantic
- CORS: Configure CORS policies in API7 Gateway
- Rate Limiting: API7 Gateway enforces rate limits
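The non-root and read-only filesystem points are usually enforced at the pod level. Below is a hedged sketch of a corresponding container securityContext; UID 1000 matches the list above, while the remaining fields are an assumed hardening baseline rather than the chart's actual values.
# Assumed securityContext for the api container; adjust to the actual chart.
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]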
Example Secret
kubectl create secret generic llm-secrets \
--from-literal=api-key='your-openai-api-key' \
-n api7ee
📦 Dependencies
Python Requirements
fastapi==0.104.1
uvicorn[standard]==0.24.0
pydantic==2.5.0
httpx==0.26.0
Install
pip install -r requirements.txt
🚀 Production Deployment
Recommended Configuration
api:
  replicaCount: 5
  autoscaling:
    enabled: true
    minReplicas: 5
    maxReplicas: 30
    targetCPUUtilizationPercentage: 70
  resources:
    limits:
      cpu: 2000m
      memory: 2Gi
    requests:
      cpu: 1000m
      memory: 1Gi
  env:
    - name: LOG_LEVEL
      value: "warn"
    - name: ENVIRONMENT
      value: "production"
API7 Gateway Integration
When deployed behind API7 Gateway:
Rate Limiting:
- /api/*: 100 requests/60s per IP (standard rate limiting)
- /api/llm/*: 100 tokens/60s (AI rate limiting)
Routing:
- Priority 10: /api/* routes
- Priority 20: /api/llm/* routes (higher priority)
Plugins:
- redirect: HTTP → HTTPS
- limit-count: IP-based rate limiting
- ai-rate-limiting: Token-based LLM rate limiting
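Clients should expect HTTP 429 when the rate limits above are exceeded. A minimal, hedged sketch of client-side retry handling; the endpoint and fallback wait time are illustrative, and the gateway may or may not send a Retry-After header.
import time
import httpx

def chat_with_retry(prompt: str, retries: int = 3) -> dict:
    # Retry the LLM chat call when the gateway's rate limiting returns 429.
    for attempt in range(retries):
        resp = httpx.post(
            "http://localhost:8001/llm/chat",
            json={"prompt": prompt, "max_tokens": 100},
            timeout=30.0,
        )
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Honor Retry-After if present, otherwise back off for the window.
        wait = int(resp.headers.get("Retry-After", 60))
        time.sleep(wait)
    raise RuntimeError("Rate limit still in effect after retries")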
📚 Resources
Documentation
- FastAPI: https://fastapi.tiangolo.com/
- Pydantic: https://docs.pydantic.dev/
- Uvicorn: https://www.uvicorn.org/
- OpenAI API: https://platform.openai.com/docs/api-reference
🐛 Troubleshooting
Common Issues
Issue: LLM endpoints return errors
# Check environment variables
echo $OPENAI_API_BASE
echo $OPENAI_API_KEY
# Test LLM backend directly
curl -X POST "$OPENAI_API_BASE/chat/completions" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"videogame-expert","messages":[{"role":"user","content":"test"}]}'
Issue: Rate limiting triggered
# Check API7 Gateway logs
kubectl logs -n api7ee -l app=api7-gateway
# Response: HTTP 429
# Cause: Exceeded 100 tokens/60s or 100 req/60s
# Solution: Wait for rate limit window to reset
Issue: Health check fails
# Check if service is running
curl http://localhost:8001/health
# Check logs
docker logs api-service
# or
kubectl logs -n api7ee -l app=api
Version: 1.0.0 | Port: 8001 | Framework: FastAPI 0.104.1