Developer Guide

This file provides detailed technical documentation for developers working with this codebase.

Project Overview

Brief Bench FastAPI is a FastAPI backend for a RAG (Retrieval-Augmented Generation) testing system with multi-user support. The system allows users to test RAG backends across three environments (IFT, PSI, PROD) in two modes: Bench mode (batch testing) and Backend mode (bot simulation).

Key Concepts

Multi-Environment Architecture

  • Three environments: IFT, PSI, PROD (each with separate RAG backend hosts)
  • Per-user settings: Each user has individual settings for each environment (stored in external DB API)
  • Server configuration: RAG hosts, ports, endpoints, and mTLS cert paths are defined in .env (not user-editable)

Two Query Modes

  1. Bench Mode: Send all questions in a single batch request to RAG backend

    • Endpoint: POST /api/v1/query/bench
    • Uses headers: Request-Id, System-Id, Authorization, System-Platform
  2. Backend Mode: Send questions one-by-one, simulating bot behavior

    • Endpoint: POST /api/v1/query/backend
    • Can reset session after each question (controlled by resetSessionMode setting)
    • Uses headers: Platform-User-Id, Platform-Id, Authorization
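The header sets for the two modes can be sketched as a small helper. This is only an illustration: the function name `build_rag_headers`, the `Bearer ` prefix, and the mapping from settings keys to header names are assumptions, and because the source of `System-Id` is not described here it is passed in explicitly.

```python
from typing import Dict, Optional
from uuid import uuid4


def build_rag_headers(
    mode: str, settings: Dict[str, str], system_id: Optional[str] = None
) -> Dict[str, str]:
    """Return the headers for one RAG request in the given mode.

    `settings` is one environment's user settings (camelCase keys).
    The key-to-header mapping below is an assumption for illustration.
    """
    # Assumption: the stored token is sent with a "Bearer " prefix.
    headers = {"Authorization": f"Bearer {settings['bearerToken']}"}
    if mode == "bench":
        headers["Request-Id"] = str(uuid4())  # fresh id for every batch request
        headers["System-Id"] = system_id or ""  # origin not specified; passed in
        headers["System-Platform"] = settings["systemPlatform"]
    elif mode == "backend":
        headers["Platform-User-Id"] = settings["platformUserId"]
        headers["Platform-Id"] = settings["platformId"]
    else:
        raise ValueError(f"unknown apiMode: {mode!r}")
    return headers
```

Keeping the two header sets in one function makes it harder for a bench request to pick up backend-only headers by accident.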

Authentication Flow

  1. User sends POST /api/v1/auth/login?login=12345678 (8-digit login)
  2. FastAPI forwards to DB API: POST /users/login
  3. DB API returns user info (user_id, login, timestamps)
  4. FastAPI generates JWT token (30-day expiration) with {user_id, login, exp}
  5. Returns {access_token, user} to client
  6. All subsequent requests use Authorization: Bearer {token}
  7. The get_current_user dependency validates the JWT on protected endpoints
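Steps 4 and 7 can be illustrated with a standard-library-only HS256 sketch. The project most likely uses a JWT library in app/utils/, so the helper names `create_token` and `verify_token` are hypothetical; only the payload fields {user_id, login, exp} and the 30-day lifetime come from the flow above.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

THIRTY_DAYS = 30 * 24 * 60 * 60  # token lifetime in seconds


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def create_token(user_id: int, login: str, secret: str, now: Optional[int] = None) -> str:
    """Build an HS256 JWT carrying {user_id, login, exp}."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"user_id": user_id, "login": login, "exp": issued + THIRTY_DAYS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token: str, secret: str, now: Optional[int] = None) -> dict:
    """Check signature and expiry; return the payload or raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    signing_input = f"{parts[0]}.{parts[1]}"
    expected = _sign(signing_input, secret)
    # Constant-time comparison, done before the payload is trusted.
    if not hmac.compare_digest(expected.encode(), parts[2].encode()):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(parts[1]))
    current = int(time.time()) if now is None else now
    if payload["exp"] <= current:
        raise ValueError("token expired")
    return payload
```

A dependency such as get_current_user would strip the `Bearer ` prefix from the Authorization header, call something like `verify_token`, and turn a ValueError into a 401 response.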

Development Commands

Local Development

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt

# Run development server (with auto-reload)
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Docker

# Build and run
docker-compose up -d

# View logs
docker-compose logs -f fastapi

# Stop
docker-compose down

Environment Setup

# Copy example and edit
cp .env.example .env

# Required variables:
# - JWT_SECRET_KEY (generate a new secret!)
# - DB_API_URL (external DB API service URL)
# - IFT_RAG_HOST, PSI_RAG_HOST, PROD_RAG_HOST
# - Port and endpoint configs for each environment
# - Optional: mTLS certificate paths (certs/ directory)
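One way to generate a fresh JWT_SECRET_KEY is Python's standard `secrets` module. The helper name and the 64-byte length are illustrative choices, not project requirements.

```python
import secrets


def generate_jwt_secret(num_bytes: int = 64) -> str:
    """Return a URL-safe random string suitable for JWT_SECRET_KEY."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into .env (never commit it).
    print(f"JWT_SECRET_KEY={generate_jwt_secret()}")
```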

Architecture Details

Layered Structure

app/
├── api/v1/          # API endpoints (FastAPI routers)
├── models/          # Pydantic models for validation
├── services/        # Business logic layer
├── interfaces/      # External API clients
├── middleware/      # Request/response middleware
├── utils/           # Utilities (JWT, security)
├── config.py        # Settings from .env (pydantic-settings)
├── dependencies.py  # Dependency injection
└── main.py          # FastAPI app entry point

TgBackendInterface Pattern

The TgBackendInterface class (app/interfaces/base.py) is the base for all HTTP API clients:

  • Wraps httpx.AsyncClient with Pydantic model serialization/deserialization
  • Provides get(), post(), put(), delete() methods
  • Handles error logging and HTTP status errors
  • Auto-serializes Pydantic models to JSON and deserializes JSON to Pydantic models
  • Initialized with api_prefix (base URL)

DBApiClient (app/interfaces/db_api_client.py) extends this for DB API integration:

  • login_user() - authenticate user
  • get_user_settings(), update_user_settings() - manage per-user settings
  • save_session(), get_sessions(), get_session(), delete_session() - analysis sessions

See DB_API_CONTRACT.md for full DB API specification.

RagService Pattern

The RagService class (app/services/rag_service.py) manages RAG backend communication:

  • Creates separate httpx.AsyncClient for each environment (IFT, PSI, PROD)
  • Configures mTLS certificates from .env settings
  • send_bench_query() - batch mode requests
  • send_backend_query() - sequential requests with optional session reset
  • Builds environment-specific headers from user settings
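The sequential flow behind send_backend_query() might look roughly like the loop below. It is a sketch under assumptions: `ask` and `reset_session` are hypothetical callables standing in for the real HTTP calls, and the actual method signature in RagService may differ.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_backend_mode(
    questions: List[str],
    ask: Callable[[str], Awaitable[str]],
    reset_session: Callable[[], Awaitable[None]],
    reset_session_mode: bool,
) -> List[str]:
    """Send questions one by one, optionally resetting the session each time."""
    answers: List[str] = []
    for question in questions:
        # Each question waits for the previous answer, like a real bot dialog.
        answers.append(await ask(question))
        if reset_session_mode:
            await reset_session()
    return answers
```

Awaiting each call in order (rather than using `asyncio.gather`) is deliberate here: backend mode simulates a single conversation, so requests must not overlap.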

Dependency Injection

Located in app/dependencies.py:

  • get_db_client() - Returns DBApiClient instance
  • get_current_user() - JWT auth dependency: validates the Bearer token and returns {user_id, login}

Use in endpoints:

@router.get("/protected")
async def protected_endpoint(
    current_user: dict = Depends(get_current_user),
    db_client: DBApiClient = Depends(get_db_client)
):
    user_id = current_user["user_id"]
    # ...

User Settings vs Server Configuration

Server Config (.env): RAG hosts, ports, endpoints, mTLS cert paths - applies to all users

User Settings (DB API): Per-environment settings for each user:

  • apiMode: "bench" or "backend"
  • bearerToken: Authorization token for RAG backend
  • systemPlatform, systemPlatformUser: Platform identifiers
  • platformUserId, platformId: Platform-specific IDs
  • withClassify: Enable classification in backend mode
  • resetSessionMode: Reset session after each question in backend mode

Important Implementation Notes

mTLS Certificate Handling

  • Certificate paths are configured per-environment in .env
  • Format: /app/certs/{env}/ca.crt, /app/certs/{env}/client.key, /app/certs/{env}/client.crt
  • RagService loads certificates from these paths when creating httpx clients
  • Certificates are mounted read-only in Docker: ./certs:/app/certs:ro
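The path format above can be expressed as a small helper, with an optional step that turns the paths into an `ssl.SSLContext`. The function names are hypothetical, and lowercasing the environment name for the directory is an assumption about how certs/ is laid out.

```python
import ssl
from pathlib import PurePosixPath
from typing import Dict

CERTS_ROOT = PurePosixPath("/app/certs")  # mount point inside the container


def cert_paths(env: str) -> Dict[str, str]:
    """Return CA, client key, and client cert paths for one environment."""
    base = CERTS_ROOT / env.lower()  # assumption: directories are lowercase
    return {
        "ca": str(base / "ca.crt"),
        "key": str(base / "client.key"),
        "cert": str(base / "client.crt"),
    }


def build_ssl_context(paths: Dict[str, str]) -> ssl.SSLContext:
    """Create a client-side context for mTLS (requires the files to exist)."""
    context = ssl.create_default_context(cafile=paths["ca"])
    context.load_cert_chain(certfile=paths["cert"], keyfile=paths["key"])
    return context
```

A context built this way can be passed to an httpx client through its `verify` argument; whether RagService does this or passes the paths directly is not stated here.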

Error Handling

  • All API client methods can raise httpx.HTTPStatusError for HTTP errors
  • Endpoints should catch these and convert to FastAPI HTTPException
  • Use status.HTTP_502_BAD_GATEWAY for external service errors
  • Use status.HTTP_500_INTERNAL_SERVER_ERROR for unexpected errors
  • Always log errors with context (user_id, environment, request details)
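The mapping rules above can be sketched without third-party imports. `UpstreamError` is a stand-in for httpx.HTTPStatusError and `to_http_error` is a hypothetical helper; in a real endpoint the returned pair would feed `HTTPException(status_code=..., detail=...)`.

```python
import logging
from http import HTTPStatus
from typing import Tuple

logger = logging.getLogger("brief_bench.errors")


class UpstreamError(Exception):
    """Stand-in for httpx.HTTPStatusError raised by an API client."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(message)
        self.status_code = status_code


def to_http_error(exc: Exception, *, user_id: int, environment: str) -> Tuple[int, str]:
    """Map an exception to (status code, detail) and log it with context."""
    if isinstance(exc, UpstreamError):
        # External service failed: report 502 regardless of its own status.
        status = HTTPStatus.BAD_GATEWAY
        detail = f"Upstream service returned {exc.status_code}"
    else:
        status = HTTPStatus.INTERNAL_SERVER_ERROR
        detail = "Unexpected server error"
    logger.error(
        "request failed: user_id=%s environment=%s status=%s error=%r",
        user_id, environment, int(status), exc,
    )
    return int(status), detail
```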

Async Context Managers

Both TgBackendInterface and RagService support async context managers:

async with RagService() as rag_service:
    response = await rag_service.send_bench_query(...)
# Client automatically closed

Or manually close:

rag_service = RagService()
try:
    response = await rag_service.send_bench_query(...)
finally:
    await rag_service.close()

Long-Running Requests

  • RAG requests can take up to 30 minutes (RAG_REQUEST_TIMEOUT=1800)
  • RagService uses 30-minute timeout for httpx clients
  • DB API calls use a separate, much shorter timeout (DB_API_TIMEOUT, default 30 seconds)
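As an illustration of the two timeouts, the sketch below reads them from the environment with the values quoted above as fallbacks. The project loads settings through pydantic-settings in app/config.py, so this dataclass and `load_timeouts` are stand-ins, and treating 1800 as the default for RAG_REQUEST_TIMEOUT is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Timeouts:
    rag_seconds: float      # long: batch RAG runs may take up to 30 minutes
    db_api_seconds: float   # short: DB API calls should answer quickly


def load_timeouts(env: Mapping[str, str] = os.environ) -> Timeouts:
    """Read both timeouts, falling back to 1800 s and 30 s."""
    return Timeouts(
        rag_seconds=float(env.get("RAG_REQUEST_TIMEOUT", "1800")),
        db_api_seconds=float(env.get("DB_API_TIMEOUT", "30")),
    )
```

Keeping the two values separate means a slow RAG backend does not force DB API calls to wait equally long before failing.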

API Endpoints

Authentication

  • POST /api/v1/auth/login?login={8-digit} - Login with 8-digit code, returns JWT token

Settings

  • GET /api/v1/settings - Get current user's settings for all environments
  • PUT /api/v1/settings - Update user settings

Query

  • POST /api/v1/query/bench - Send batch query (bench mode)
  • POST /api/v1/query/backend - Send sequential queries (backend mode)

Analysis Sessions

  • POST /api/v1/analysis/sessions - Save analysis session
  • GET /api/v1/analysis/sessions - List sessions (filterable by environment)
  • GET /api/v1/analysis/sessions/{session_id} - Get specific session
  • DELETE /api/v1/analysis/sessions/{session_id} - Delete session

Health

  • GET /health - Health check
  • GET / - API info

Testing Strategy

The project has comprehensive test coverage across three levels:

Test Pyramid

  1. Unit Tests (119 tests, 99% coverage) - tests/unit/

    • Fast, isolated tests with all dependencies mocked
    • Test business logic, models, utilities in isolation
    • Run constantly during development
    • Command: .\run_unit_tests.bat or pytest tests/unit/ -m unit
  2. Integration Tests (DB API integration) - tests/integration/

    • Test FastAPI endpoints with real DB API
    • Requires DB API service running
    • Mock RAG backends (only DB integration tested)
    • Run before commits
    • Command: .\run_integration_tests.bat or pytest tests/integration/ -m integration
    • Configuration: tests/integration/.env.integration (see .env.integration.example)
  3. End-to-End Tests (Full stack) - tests/e2e/

    • Test complete workflows from auth to RAG queries
    • Requires ALL services: FastAPI + DB API + RAG backends
    • Real network calls, no mocking
    • Run before deployment
    • Command: .\run_e2e_tests.bat or pytest tests/e2e/ -m e2e
    • Configuration: tests/e2e/.env.e2e (see .env.e2e.example)

Running Tests

# All tests (unit + integration)
.\run_all_tests.bat

# Specific test level
.\run_unit_tests.bat
.\run_integration_tests.bat
.\run_e2e_tests.bat

# Using pytest markers
pytest -m unit              # Unit tests only
pytest -m integration       # Integration tests only
pytest -m e2e               # E2E tests only
pytest -m e2e_ift           # E2E for IFT environment only
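For `-m` selection to work without unknown-marker warnings, the markers need to be registered. The project may already do this in its pytest configuration file; the conftest.py sketch below shows one alternative, and the marker descriptions are illustrative wording.

```python
# conftest.py (sketch): register the custom markers used by the test suites.

MARKERS = {
    "unit": "fast, isolated tests with all dependencies mocked",
    "integration": "tests that need a running DB API",
    "e2e": "full-stack tests that need FastAPI, DB API and RAG backends",
    "e2e_ift": "end-to-end tests against the IFT environment only",
}


def pytest_configure(config):
    """pytest hook: declare each marker so `pytest -m <name>` is recognised."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```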

Test Documentation

See TESTING.md for comprehensive testing guide including:

  • Detailed setup instructions for each test level
  • Environment configuration
  • Troubleshooting common issues
  • CI/CD integration examples
  • Best practices

When Adding New Features

  1. Unit Tests: Test business logic in isolation (always required)
  2. Integration Tests: Test DB API interaction if feature uses DB
  3. E2E Tests: Add workflow test if feature is user-facing
  4. Run all tests: Verify nothing broke before committing

Common Development Scenarios

Adding a New API Endpoint

  1. Define Pydantic models in app/models/
  2. Create endpoint in appropriate app/api/v1/ router
  3. Add business logic to app/services/ if needed
  4. Register router in app/main.py with app.include_router()

Modifying DB API Integration

  1. Update contract in DB_API_CONTRACT.md
  2. Update Pydantic models in app/models/
  3. Add/modify methods in app/interfaces/db_api_client.py

Adding RAG Backend Endpoints

  1. Add endpoint config to .env.example and app/config.py
  2. Update user settings model if needed (app/models/settings.py)
  3. Modify RagService methods to use new endpoints
  4. Update query endpoint logic in app/api/v1/query.py

Security Considerations

  • JWT tokens expire after 30 days (43200 minutes; configurable via JWT_EXPIRE_MINUTES)
  • mTLS certificates are stored server-side only (never sent to client)
  • CORS is currently set to allow_origins=["*"] - configure properly for production
  • Never commit .env file (use .env.example as template)
  • Secrets (JWT_SECRET_KEY, bearer tokens) must be kept secure
  • User settings may contain sensitive tokens - handle with care

Project Status

See PROJECT_STATUS.md for detailed implementation status and TODOs. Key points:

  • Core infrastructure is complete (auth, DB API client, RAG service)
  • All main API endpoints are implemented
  • TgBackendInterface is fully implemented
  • 99% test coverage (unit + integration + E2E tests)
  • Frontend integration complete