CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Brief Bench FastAPI is a FastAPI backend for a RAG (Retrieval-Augmented Generation) testing system with multi-user support. The system allows users to test RAG backends across three environments (IFT, PSI, PROD) in two modes: Bench mode (batch testing) and Backend mode (bot simulation).

Key Concepts

Multi-Environment Architecture

Three environments: IFT, PSI, PROD (each with separate RAG backend hosts)
Per-user settings: Each user has individual settings for each environment (stored in external DB API)
Server configuration: RAG hosts, ports, endpoints, and mTLS cert paths are defined in .env (not user-editable)

Two Query Modes

Bench Mode: Send all questions in a single batch request to RAG backend
- Endpoint: POST /api/v1/query/bench
- Uses headers: Request-Id, System-Id, Authorization, System-Platform
Backend Mode: Send questions one-by-one, simulating bot behavior
- Endpoint: POST /api/v1/query/backend
- Can reset session after each question (controlled by resetSessionMode setting)
- Uses headers: Platform-User-Id, Platform-Id, Authorization
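To make the difference between the two header sets concrete, here is a minimal sketch that builds each one from a user's settings. The header names come from the lists above. The function names, the mapping of setting keys to headers, the "Bearer " prefix, the fresh UUID used for Request-Id, and passing System-Id as an argument are all assumptions for illustration and may not match the real RagService code.

```python
import uuid


def build_bench_headers(settings: dict, system_id: str) -> dict:
    """Headers for bench mode (one batch request). Hypothetical helper."""
    return {
        "Request-Id": str(uuid.uuid4()),  # assumption: a fresh UUID per request
        "System-Id": system_id,           # assumption: supplied by the caller
        "Authorization": f"Bearer {settings['bearerToken']}",  # assumption: Bearer scheme
        "System-Platform": settings["systemPlatform"],
    }


def build_backend_headers(settings: dict) -> dict:
    """Headers for backend mode (one request per question). Hypothetical helper."""
    return {
        "Platform-User-Id": settings["platformUserId"],
        "Platform-Id": settings["platformId"],
        "Authorization": f"Bearer {settings['bearerToken']}",  # assumption: Bearer scheme
    }
```

Keeping the header construction in small pure functions like these makes it easy to unit test without touching the network.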

Authentication Flow

User sends POST /api/v1/auth/login?login=12345678 (8-digit login)
FastAPI forwards to DB API: POST /users/login
DB API returns user info (user_id, login, timestamps)
FastAPI generates JWT token (30-day expiration) with {user_id, login, exp}
Returns {access_token, user} to client
All subsequent requests use Authorization: Bearer {token}
The get_current_user dependency validates the JWT on protected endpoints
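The token step of the flow above can be sketched with the standard library alone. This is only an illustration of an HS256 JWT carrying {user_id, login, exp} with a 30-day lifetime. The project most likely uses a JWT library in app/utils/, and the function names, the choice of HS256, and the string type of user_id here are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

JWT_EXPIRE_SECONDS = 30 * 24 * 60 * 60  # 30 days


def _b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def create_token(user_id: str, login: str, secret: str, now: Optional[int] = None) -> str:
    """Build an HS256 token with the payload {user_id, login, exp}."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"user_id": user_id, "login": login, "exp": now + JWT_EXPIRE_SECONDS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str, now: Optional[int] = None) -> dict:
    """Return the payload, or raise ValueError if the token is invalid or expired."""
    now = int(time.time()) if now is None else now
    try:
        signing_input, signature = token.rsplit(".", 1)
        payload_b64 = signing_input.split(".")[1]
    except (ValueError, IndexError) as exc:
        raise ValueError("malformed token") from exc
    # Constant-time comparison of the expected and presented signatures
    if not hmac.compare_digest(_sign(signing_input, secret), _b64url_decode(signature)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload
```

In this sketch a ValueError from verify_token is what a dependency such as get_current_user would translate into a 401 response.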

Development Commands

Local Development

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt

# Run development server (with auto-reload)
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Docker

# Build and run
docker-compose up -d

# View logs
docker-compose logs -f fastapi

# Stop
docker-compose down

Environment Setup

# Copy example and edit
cp .env.example .env

# Required variables:
# - JWT_SECRET_KEY (generate a new secret!)
# - DB_API_URL (external DB API service URL)
# - IFT_RAG_HOST, PSI_RAG_HOST, PROD_RAG_HOST
# - Port and endpoint configs for each environment
# - Optional: mTLS certificate paths (certs/ directory)
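One possible way to generate a fresh JWT_SECRET_KEY is Python's secrets module, as sketched below. The helper name and the 32-byte length are assumptions. Any sufficiently long random value works.

```python
import secrets


def generate_jwt_secret(n_bytes: int = 32) -> str:
    """Return n_bytes of cryptographically strong randomness, hex-encoded."""
    return secrets.token_hex(n_bytes)


if __name__ == "__main__":
    # Paste the printed line into .env (never commit it)
    print(f"JWT_SECRET_KEY={generate_jwt_secret()}")
```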

Architecture Details

Layered Structure

app/
├── api/v1/          # API endpoints (FastAPI routers)
├── models/          # Pydantic models for validation
├── services/        # Business logic layer
├── interfaces/      # External API clients
├── middleware/      # Request/response middleware
├── utils/           # Utilities (JWT, security)
├── config.py        # Settings from .env (pydantic-settings)
├── dependencies.py  # Dependency injection
└── main.py          # FastAPI app entry point

TgBackendInterface Pattern

The TgBackendInterface class (app/interfaces/base.py) is the base for all HTTP API clients:

Wraps httpx.AsyncClient with Pydantic model serialization/deserialization
Provides get(), post(), put(), delete() methods
Handles error logging and HTTP status errors
Auto-serializes Pydantic models to JSON and deserializes JSON to Pydantic models
Initialized with api_prefix (base URL)

DBApiClient (app/interfaces/db_api_client.py) extends this for DB API integration:

login_user() - authenticate user
get_user_settings(), update_user_settings() - manage per-user settings
save_session(), get_sessions(), get_session(), delete_session() - analysis sessions

See DB_API_CONTRACT.md for full DB API specification.

RagService Pattern

The RagService class (app/services/rag_service.py) manages RAG backend communication:

Creates separate httpx.AsyncClient for each environment (IFT, PSI, PROD)
Configures mTLS certificates from .env settings
send_bench_query() - batch mode requests
send_backend_query() - sequential requests with optional session reset
Builds environment-specific headers from user settings
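The sequential behavior of backend mode can be sketched as a simple loop: one request per question, with an optional session reset after each answer. The function below is a hypothetical illustration, not the actual send_backend_query() implementation. Both send_question and reset_session are assumed callables supplied by the caller.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def run_backend_mode(
    questions: List[str],
    send_question: Callable[[str], Awaitable[Any]],  # assumption: sends one question
    reset_session: Callable[[], Awaitable[None]],    # assumption: resets the bot session
    reset_session_mode: bool,
) -> List[Any]:
    """Send questions one by one, optionally resetting the session after each."""
    answers = []
    for question in questions:
        answers.append(await send_question(question))
        if reset_session_mode:
            await reset_session()
    return answers
```

Bench mode, by contrast, would send the whole list of questions in a single request.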

Dependency Injection

Located in app/dependencies.py:

get_db_client() - Returns DBApiClient instance
get_current_user() - JWT auth dependency: validates the Bearer token and returns {user_id, login}

Use in endpoints:

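from fastapi import APIRouter, Depends

from app.dependencies import get_current_user, get_db_client
from app.interfaces.db_api_client import DBApiClient

router = APIRouter()

@router.get("/protected")
async def protected_endpoint(
    current_user: dict = Depends(get_current_user),
    db_client: DBApiClient = Depends(get_db_client),
):
    user_id = current_user["user_id"]
    # ...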

User Settings vs Server Configuration

Server Config (.env): RAG hosts, ports, endpoints, mTLS cert paths - applies to all users

User Settings (DB API): Per-environment settings for each user:

apiMode: "bench" or "backend"
bearerToken: Authorization token for RAG backend
systemPlatform, systemPlatformUser: Platform identifiers
platformUserId, platformId: Platform-specific IDs
withClassify: Enable classification in backend mode
resetSessionMode: Reset session after each question in backend mode
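As a rough picture of the per-environment settings shape, the sketch below models the fields listed above with a standard-library dataclass. The project itself uses Pydantic models in app/models/. The class name, field types, default values, and lowercase environment keys are all assumptions.

```python
from dataclasses import dataclass

API_MODES = ("bench", "backend")
ENVIRONMENTS = ("ift", "psi", "prod")  # assumption: lowercase keys


@dataclass
class EnvironmentSettings:
    """Hypothetical stand-in for the real Pydantic settings model."""

    apiMode: str = "bench"
    bearerToken: str = ""
    systemPlatform: str = ""
    systemPlatformUser: str = ""
    platformUserId: str = ""
    platformId: str = ""
    withClassify: bool = False
    resetSessionMode: bool = True

    def __post_init__(self) -> None:
        # Reject anything other than the two documented modes
        if self.apiMode not in API_MODES:
            raise ValueError(f"apiMode must be one of {API_MODES}, got {self.apiMode!r}")


def default_user_settings() -> dict:
    """One independent settings object per environment."""
    return {env: EnvironmentSettings() for env in ENVIRONMENTS}
```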

Important Implementation Notes

mTLS Certificate Handling

Certificate paths are configured per-environment in .env
Format: /app/certs/{env}/ca.crt, /app/certs/{env}/client.key, /app/certs/{env}/client.crt
RagService loads certificates from these paths when creating httpx clients
Certificates are mounted read-only in Docker: ./certs:/app/certs:ro
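The path format above can be expressed as a small helper, shown here purely to illustrate the layout. In the project the paths are read from .env rather than computed. The function name, the returned keys, and the lowercase directory names are assumptions.

```python
from pathlib import PurePosixPath

CERTS_ROOT = PurePosixPath("/app/certs")
ENVIRONMENTS = ("ift", "psi", "prod")  # assumption: lowercase directory names


def cert_paths(env: str) -> dict:
    """Return the CA, client key, and client cert paths for one environment."""
    env = env.lower()
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    base = CERTS_ROOT / env
    return {
        "ca": str(base / "ca.crt"),
        "key": str(base / "client.key"),
        "cert": str(base / "client.crt"),
    }
```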

Error Handling

All API client methods can raise httpx.HTTPStatusError for HTTP errors
Endpoints should catch these and convert to FastAPI HTTPException
Use status.HTTP_502_BAD_GATEWAY for external service errors
Use status.HTTP_500_INTERNAL_SERVER_ERROR for unexpected errors
Always log errors with context (user_id, environment, request details)
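The mapping rules above can be sketched as one helper that logs with context and picks the status code. To keep the example free of third-party imports, UpstreamHTTPError is a hypothetical stand-in for httpx.HTTPStatusError, and the returned tuple stands in for raising a FastAPI HTTPException. The helper name and log message format are assumptions too.

```python
import logging
from typing import Tuple

logger = logging.getLogger("brief_bench")  # assumption: logger name

HTTP_502_BAD_GATEWAY = 502
HTTP_500_INTERNAL_SERVER_ERROR = 500


class UpstreamHTTPError(Exception):
    """Hypothetical stand-in for httpx.HTTPStatusError."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code


def to_status_and_detail(exc: Exception, *, user_id: str, environment: str) -> Tuple[int, str]:
    """Log the error with context and choose the status code for the client."""
    if isinstance(exc, UpstreamHTTPError):
        logger.error(
            "External service error: user_id=%s environment=%s upstream_status=%s",
            user_id, environment, exc.status_code,
        )
        return HTTP_502_BAD_GATEWAY, f"External service error: {exc}"
    # Anything else is unexpected: log the traceback, hide details from the client
    logger.error(
        "Unexpected error: user_id=%s environment=%s",
        user_id, environment, exc_info=exc,
    )
    return HTTP_500_INTERNAL_SERVER_ERROR, "Internal server error"
```

In an endpoint, the returned pair would typically become the status_code and detail of an HTTPException.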

Async Context Managers

Both TgBackendInterface and RagService support async context managers:

async with RagService() as rag_service:
    response = await rag_service.send_bench_query(...)
# Client automatically closed

Or manually close:

rag_service = RagService()
try:
    response = await rag_service.send_bench_query(...)
finally:
    await rag_service.close()
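For readers unfamiliar with the protocol, the two usage styles above are equivalent because the context manager simply calls close() on exit. The class below is a hypothetical stand-in that shows the mechanism. It is not the project's RagService.

```python
import asyncio


class ClosableClient:
    """Hypothetical stand-in showing the async context manager protocol."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        # A real client would release its HTTP connections here
        self.closed = True

    async def __aenter__(self) -> "ClosableClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and on exceptions; returning None lets errors propagate
        await self.close()


async def demo() -> bool:
    async with ClosableClient() as client:
        assert not client.closed
    return client.closed
```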

Long-Running Requests

RAG requests can take up to 30 minutes (RAG_REQUEST_TIMEOUT=1800)
RagService uses 30-minute timeout for httpx clients
The DB API client uses a separate, much shorter timeout (DB_API_TIMEOUT, default 30 seconds)
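The two timeouts could be read from the environment roughly as follows. The project loads them through pydantic-settings in app/config.py, so this standard-library version only illustrates the values. The function name and the 1800-second fallback for RAG_REQUEST_TIMEOUT are assumptions.

```python
import os
from typing import Mapping, Optional


def load_timeouts(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Return the RAG and DB API timeouts in seconds."""
    environ = os.environ if environ is None else environ
    return {
        # 1800 s = 30 minutes; assumption: used as the fallback when unset
        "rag": float(environ.get("RAG_REQUEST_TIMEOUT", "1800")),
        "db_api": float(environ.get("DB_API_TIMEOUT", "30")),
    }
```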

API Endpoints

Authentication

POST /api/v1/auth/login?login={8-digit} - Login with 8-digit code, returns JWT token

Settings

GET /api/v1/settings - Get current user's settings for all environments
PUT /api/v1/settings - Update user settings

Query

POST /api/v1/query/bench - Send batch query (bench mode)
POST /api/v1/query/backend - Send sequential queries (backend mode)

Analysis Sessions

POST /api/v1/analysis/sessions - Save analysis session
GET /api/v1/analysis/sessions - List sessions (filterable by environment)
GET /api/v1/analysis/sessions/{session_id} - Get specific session
DELETE /api/v1/analysis/sessions/{session_id} - Delete session

Health

GET /health - Health check
GET / - API info

Testing Strategy

When adding tests:

Unit test services in isolation (mock httpx responses)
Integration test API endpoints (use TestClient from FastAPI)
Mock external dependencies (DB API, RAG backends)
Test error paths (network failures, invalid tokens, missing settings)
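Since the tests/ directory is still empty, here is one possible shape for a first unit test, using unittest.mock.AsyncMock in place of the DB API client. The load_api_mode helper, the single-argument get_user_settings(user_id) call, and the response shape are all hypothetical. Check DB_API_CONTRACT.md for the real ones.

```python
import asyncio
from unittest.mock import AsyncMock


async def load_api_mode(db_client, user_id: str, environment: str) -> str:
    """Hypothetical service helper: look up the API mode for one environment."""
    settings = await db_client.get_user_settings(user_id)
    return settings[environment]["apiMode"]


def test_load_api_mode_reads_environment_settings() -> None:
    db_client = AsyncMock()  # every method on it becomes an awaitable mock
    db_client.get_user_settings.return_value = {"ift": {"apiMode": "bench"}}

    result = asyncio.run(load_api_mode(db_client, "u1", "ift"))

    assert result == "bench"
    db_client.get_user_settings.assert_awaited_once_with("u1")


def test_load_api_mode_missing_settings() -> None:
    # Error path: the DB API returned no settings for the environment
    db_client = AsyncMock()
    db_client.get_user_settings.return_value = {}
    try:
        asyncio.run(load_api_mode(db_client, "u1", "ift"))
    except KeyError:
        return
    raise AssertionError("expected KeyError for missing settings")
```

Both functions follow the test_ naming convention, so a runner such as pytest would collect them once they live under tests/.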

Common Development Scenarios

Adding a New API Endpoint

Define Pydantic models in app/models/
Create endpoint in appropriate app/api/v1/ router
Add business logic to app/services/ if needed
Register router in app/main.py with app.include_router()

Modifying DB API Integration

Update contract in DB_API_CONTRACT.md
Update Pydantic models in app/models/
Add/modify methods in app/interfaces/db_api_client.py

Adding RAG Backend Endpoints

Add endpoint config to .env.example and app/config.py
Update user settings model if needed (app/models/settings.py)
Modify RagService methods to use new endpoints
Update query endpoint logic in app/api/v1/query.py

Security Considerations

JWT tokens expire after 30 days (43,200 minutes; configurable via JWT_EXPIRE_MINUTES)
mTLS certificates are stored server-side only (never sent to client)
CORS is currently set to allow_origins=["*"] - restrict it to the known frontend origins before deploying to production
Never commit .env file (use .env.example as template)
Secrets (JWT_SECRET_KEY, bearer tokens) must be kept secure
User settings may contain sensitive tokens - handle with care

Project Status

See PROJECT_STATUS.md for detailed implementation status and TODOs. Key points:

Core infrastructure is complete (auth, DB API client, RAG service)
All main API endpoints are implemented
TgBackendInterface is fully implemented (not a stub)
Frontend integration pending (static/ directory is empty)
No tests yet (tests/ directory is empty)
