# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Brief Bench FastAPI is a FastAPI backend for a RAG (Retrieval-Augmented Generation) testing system with multi-user support. The system allows users to test RAG backends across three environments (IFT, PSI, PROD) in two modes: Bench mode (batch testing) and Backend mode (bot simulation).
## Key Concepts
### Multi-Environment Architecture
- **Three environments**: IFT, PSI, PROD (each with separate RAG backend hosts)
- **Per-user settings**: Each user has individual settings for each environment (stored in external DB API)
- **Server configuration**: RAG hosts, ports, endpoints, and mTLS cert paths are defined in `.env` (not user-editable)
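As a rough illustration of the server-side lookup described above, the sketch below reads one environment's host and port from an env-style mapping. The `*_RAG_HOST` names come from `.env`; the `*_RAG_PORT` name, the `8000` default, and the `RagEnvConfig` / `load_rag_config` helpers are hypothetical and may not match `app/config.py`, which uses pydantic-settings.

```python
from dataclasses import dataclass
from typing import Mapping

ENVIRONMENTS = ("ift", "psi", "prod")


@dataclass(frozen=True)
class RagEnvConfig:
    """Server-side connection details for one RAG environment (hypothetical shape)."""
    name: str
    host: str
    port: int


def load_rag_config(env: Mapping[str, str], name: str) -> RagEnvConfig:
    """Read one environment's host and port from an env-style mapping.

    The *_RAG_PORT variable name and the 8000 default are assumptions.
    """
    key = name.lower()
    if key not in ENVIRONMENTS:
        raise ValueError(f"Unknown environment: {name!r}")
    prefix = key.upper()
    return RagEnvConfig(
        name=key,
        host=env[f"{prefix}_RAG_HOST"],
        port=int(env.get(f"{prefix}_RAG_PORT", "8000")),
    )
```

Passing `os.environ` as `env` would read the real process environment; a plain dict works the same way in tests.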
### Two Query Modes
1. **Bench Mode**: Send all questions in a single batch request to RAG backend
   - Endpoint: `POST /api/v1/query/bench`
   - Uses headers: `Request-Id`, `System-Id`, `Authorization`, `System-Platform`
2. **Backend Mode**: Send questions one-by-one, simulating bot behavior
   - Endpoint: `POST /api/v1/query/backend`
   - Can reset session after each question (controlled by `resetSessionMode` setting)
   - Uses headers: `Platform-User-Id`, `Platform-Id`, `Authorization`
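The Backend mode loop can be sketched roughly as follows. This is not the actual `RagService` code: `run_backend_mode`, `send_question`, and `reset_session` are hypothetical names, and the two callables stand in for the real HTTP calls so the control flow can be read in isolation.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List

SendFn = Callable[[str], Awaitable[str]]
ResetFn = Callable[[], Awaitable[None]]


async def run_backend_mode(
    questions: Iterable[str],
    send_question: SendFn,
    reset_session: ResetFn,
    reset_session_mode: bool,
) -> List[str]:
    """Send questions one at a time, optionally resetting the session after each."""
    answers = []
    for question in questions:
        # Each question is awaited before the next one is sent (bot-like behavior).
        answers.append(await send_question(question))
        if reset_session_mode:
            await reset_session()
    return answers


if __name__ == "__main__":
    async def echo(question: str) -> str:  # stand-in for a real RAG call
        return f"answer to {question}"

    async def noop_reset() -> None:  # stand-in for a real session reset
        return None

    print(asyncio.run(run_backend_mode(["q1", "q2"], echo, noop_reset, True)))
```

Bench mode differs only in that the whole question list goes out in one request, so no per-question loop or reset is involved.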
### Authentication Flow
1. User sends `POST /api/v1/auth/login?login=12345678` (8-digit login)
2. FastAPI forwards to DB API: `POST /users/login`
3. DB API returns user info (user_id, login, timestamps)
4. FastAPI generates JWT token (30-day expiration) with {user_id, login, exp}
5. Returns {access_token, user} to client
6. All subsequent requests use `Authorization: Bearer {token}`
7. The `get_current_user` dependency validates the JWT on protected endpoints
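Steps 4 and 7 can be illustrated with a standard-library sketch of HS256 signing and validation. The project most likely relies on a JWT library in `app/utils/`, so treat this only as a picture of what the token contains. `create_token`, `decode_token`, and the `str` type of `user_id` are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

TOKEN_TTL_SECONDS = 30 * 24 * 60 * 60  # 30-day expiration


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(user_id: str, login: str, secret: str, now: Optional[int] = None) -> str:
    """Build a signed token carrying {user_id, login, exp}."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"user_id": user_id, "login": login, "exp": issued + TOKEN_TTL_SECONDS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_token(token: str, secret: str, now: Optional[int] = None) -> dict:
    """Verify signature, algorithm, and expiry; return the claims or raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("Malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("Invalid signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("Unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    current = int(time.time()) if now is None else now
    if payload["exp"] <= current:
        raise ValueError("Token expired")
    return payload
```

A dependency such as `get_current_user` would typically run the decode step on the Bearer token and turn a failure into a 401 response.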
## Development Commands
### Local Development
```bash
# Create virtual environment
python -m venv venv
# Activate it (run only the line for your OS)
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# Run development server (with auto-reload)
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
### Docker
```bash
# Build and run
docker-compose up -d
# View logs
docker-compose logs -f fastapi
# Stop
docker-compose down
```
### Environment Setup
```bash
# Copy example and edit
cp .env.example .env
# Required variables:
# - JWT_SECRET_KEY (generate a new secret!)
# - DB_API_URL (external DB API service URL)
# - IFT_RAG_HOST, PSI_RAG_HOST, PROD_RAG_HOST
# - Port and endpoint configs for each environment
# - Optional: mTLS certificate paths (certs/ directory)
```
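One way to generate a fresh `JWT_SECRET_KEY` is Python's `secrets` module. The helper name `generate_jwt_secret` and the 64-byte length are illustrative choices, not project requirements.

```python
import secrets


def generate_jwt_secret(num_bytes: int = 64) -> str:
    """Return a URL-safe random string suitable for use as JWT_SECRET_KEY."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into .env (never commit it).
    print(f"JWT_SECRET_KEY={generate_jwt_secret()}")
```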
## Architecture Details
### Layered Structure
```
app/
├── api/v1/ # API endpoints (FastAPI routers)
├── models/ # Pydantic models for validation
├── services/ # Business logic layer
├── interfaces/ # External API clients
├── middleware/ # Request/response middleware
├── utils/ # Utilities (JWT, security)
├── config.py # Settings from .env (pydantic-settings)
├── dependencies.py # Dependency injection
└── main.py # FastAPI app entry point
```
### TgBackendInterface Pattern
The `TgBackendInterface` class ([app/interfaces/base.py](app/interfaces/base.py)) is the base for all HTTP API clients:
- Wraps `httpx.AsyncClient` with Pydantic model serialization/deserialization
- Provides `get()`, `post()`, `put()`, `delete()` methods
- Handles error logging and HTTP status errors
- Auto-serializes Pydantic models to JSON and deserializes JSON to Pydantic models
- Initialized with `api_prefix` (base URL)
**DBApiClient** ([app/interfaces/db_api_client.py](app/interfaces/db_api_client.py)) extends this for DB API integration:
- `login_user()` - authenticate user
- `get_user_settings()`, `update_user_settings()` - manage per-user settings
- `save_session()`, `get_sessions()`, `get_session()`, `delete_session()` - analysis sessions
See [DB_API_CONTRACT.md](DB_API_CONTRACT.md) for full DB API specification.
### RagService Pattern
The `RagService` class ([app/services/rag_service.py](app/services/rag_service.py)) manages RAG backend communication:
- Creates separate `httpx.AsyncClient` for each environment (IFT, PSI, PROD)
- Configures mTLS certificates from `.env` settings
- `send_bench_query()` - batch mode requests
- `send_backend_query()` - sequential requests with optional session reset
- Builds environment-specific headers from user settings
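The header-building step might look roughly like the sketch below. The header names come from the two query modes and the setting keys from the user settings, but the mapping between them is an assumption, as are the `Bearer` prefix, the random `Request-Id`, and the explicit `system_id` parameter. The real `RagService` may derive these values differently.

```python
import uuid
from typing import Optional


def build_headers(mode: str, settings: dict, system_id: Optional[str] = None) -> dict:
    """Build request headers for one environment from that environment's user settings."""
    authorization = f"Bearer {settings['bearerToken']}"  # assumed token format
    if mode == "bench":
        if system_id is None:
            raise ValueError("bench mode requires a system_id")
        return {
            "Request-Id": str(uuid.uuid4()),  # assumed: a fresh ID per request
            "System-Id": system_id,  # assumed: supplied by the caller
            "Authorization": authorization,
            "System-Platform": settings["systemPlatform"],
        }
    if mode == "backend":
        return {
            "Platform-User-Id": settings["platformUserId"],
            "Platform-Id": settings["platformId"],
            "Authorization": authorization,
        }
    raise ValueError(f"Unknown apiMode: {mode!r}")
```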
### Dependency Injection
Located in [app/dependencies.py](app/dependencies.py):
- `get_db_client()` - Returns DBApiClient instance
- `get_current_user()` - JWT auth dependency: validates the Bearer token and returns {user_id, login}
Use in endpoints:
```python
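from fastapi import APIRouter, Depends

from app.dependencies import get_current_user, get_db_client
from app.interfaces.db_api_client import DBApiClient

router = APIRouter()

@router.get("/protected")
async def protected_endpoint(
    current_user: dict = Depends(get_current_user),
    db_client: DBApiClient = Depends(get_db_client),
):
    user_id = current_user["user_id"]
    # ...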
```
### User Settings vs Server Configuration
**Server Config (`.env`)**: RAG hosts, ports, endpoints, mTLS cert paths - applies to all users
**User Settings (DB API)**: Per-environment settings for each user:
- `apiMode`: "bench" or "backend"
- `bearerToken`: Authorization token for RAG backend
- `systemPlatform`, `systemPlatformUser`: Platform identifiers
- `platformUserId`, `platformId`: Platform-specific IDs
- `withClassify`: Enable classification in backend mode
- `resetSessionMode`: Reset session after each question in backend mode
## Important Implementation Notes
### mTLS Certificate Handling
- Certificate paths are configured per-environment in `.env`
- Format: `/app/certs/{env}/ca.crt`, `/app/certs/{env}/client.key`, `/app/certs/{env}/client.crt`
- RagService loads certificates from these paths when creating httpx clients
- Certificates are mounted read-only in Docker: `./certs:/app/certs:ro`
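A minimal sketch of how the per-environment paths could be turned into a TLS context with the standard `ssl` module. The lowercase `{env}` directory name and the helpers `cert_paths` / `build_ssl_context` are assumptions. httpx accepts an `ssl.SSLContext` through its `verify` argument, so a context like this could be passed when each client is created.

```python
import ssl
from pathlib import PurePosixPath

CERTS_ROOT = PurePosixPath("/app/certs")  # container path from the Docker mount


def cert_paths(environment: str) -> dict:
    """Return the CA, client certificate, and client key paths for one environment."""
    base = CERTS_ROOT / environment.lower()  # assumed: directories are lowercase
    return {
        "ca": str(base / "ca.crt"),
        "cert": str(base / "client.crt"),
        "key": str(base / "client.key"),
    }


def build_ssl_context(environment: str) -> ssl.SSLContext:
    """Create a client-side context that trusts the env CA and presents the client cert."""
    paths = cert_paths(environment)
    context = ssl.create_default_context(cafile=paths["ca"])
    context.load_cert_chain(certfile=paths["cert"], keyfile=paths["key"])
    return context
```

`build_ssl_context` raises if the files are missing, which surfaces a bad mount at startup rather than on the first request.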
### Error Handling
- All API client methods can raise `httpx.HTTPStatusError` for HTTP errors
- Endpoints should catch these and convert to FastAPI `HTTPException`
- Use `status.HTTP_502_BAD_GATEWAY` for external service errors
- Use `status.HTTP_500_INTERNAL_SERVER_ERROR` for unexpected errors
- Always log errors with context (user_id, environment, request details)
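The mapping above can be sketched without framework imports. The helper `to_http_error` is hypothetical. It treats any exception carrying a `response` with a `status_code` (as `httpx.HTTPStatusError` does) as an external service failure, and everything else as unexpected.

```python
import logging
from http import HTTPStatus
from typing import Tuple

logger = logging.getLogger("brief_bench.errors")  # hypothetical logger name


def to_http_error(exc: Exception, *, user_id: str, environment: str) -> Tuple[int, str]:
    """Map an exception from an external call to (status_code, detail), logging context."""
    response = getattr(exc, "response", None)
    upstream_status = getattr(response, "status_code", None)
    if upstream_status is not None:
        logger.error(
            "External service error: user_id=%s environment=%s upstream_status=%s",
            user_id, environment, upstream_status,
        )
        return HTTPStatus.BAD_GATEWAY.value, f"External service returned {upstream_status}"
    logger.error(
        "Unexpected error: user_id=%s environment=%s",
        user_id, environment, exc_info=exc,
    )
    return HTTPStatus.INTERNAL_SERVER_ERROR.value, "Internal server error"
```

In an endpoint, the returned pair would feed `HTTPException(status_code=code, detail=detail)` inside the `except` block.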
### Async Context Managers
Both TgBackendInterface and RagService support async context managers:
```python
async with RagService() as rag_service:
    response = await rag_service.send_bench_query(...)
# Client automatically closed
```
Or manually close:
```python
rag_service = RagService()
try:
    response = await rag_service.send_bench_query(...)
finally:
    await rag_service.close()
```
### Long-Running Requests
- RAG requests can take up to 30 minutes, so `RAG_REQUEST_TIMEOUT` is set to 1800 seconds
- RagService applies this 30-minute timeout to its httpx clients
- DB API calls use a separate, much shorter timeout (`DB_API_TIMEOUT`, default 30 seconds)
## API Endpoints
### Authentication
- `POST /api/v1/auth/login?login={8-digit}` - Login with 8-digit code, returns JWT token
### Settings
- `GET /api/v1/settings` - Get current user's settings for all environments
- `PUT /api/v1/settings` - Update user settings
### Query
- `POST /api/v1/query/bench` - Send batch query (bench mode)
- `POST /api/v1/query/backend` - Send sequential queries (backend mode)
### Analysis Sessions
- `POST /api/v1/analysis/sessions` - Save analysis session
- `GET /api/v1/analysis/sessions` - List sessions (filterable by environment)
- `GET /api/v1/analysis/sessions/{session_id}` - Get specific session
- `DELETE /api/v1/analysis/sessions/{session_id}` - Delete session
### Health
- `GET /health` - Health check
- `GET /` - API info
## Testing Strategy
When adding tests:
1. Unit test services in isolation (mock httpx responses)
2. Integration test API endpoints (use TestClient from FastAPI)
3. Mock external dependencies (DB API, RAG backends)
4. Test error paths (network failures, invalid tokens, missing settings)
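Since the `tests/` directory is still empty, here is one possible shape for a unit test with a mocked DB API client, using only the standard library. `load_environment_settings` is a hypothetical service helper, and the dict returned by `get_user_settings()` is an assumed shape, not the documented contract.

```python
import asyncio
from unittest.mock import AsyncMock


async def load_environment_settings(db_client, user_id: str, environment: str) -> dict:
    """Hypothetical helper: fetch one environment's settings for a user."""
    all_settings = await db_client.get_user_settings(user_id)
    if environment not in all_settings:
        raise ValueError(f"No settings for environment {environment!r}")
    return all_settings[environment]


def test_returns_settings_for_environment():
    db_client = AsyncMock()  # stands in for DBApiClient
    db_client.get_user_settings.return_value = {"ift": {"apiMode": "bench"}}
    result = asyncio.run(load_environment_settings(db_client, "u1", "ift"))
    assert result == {"apiMode": "bench"}
    db_client.get_user_settings.assert_awaited_once_with("u1")


def test_missing_environment_raises():
    db_client = AsyncMock()
    db_client.get_user_settings.return_value = {}
    try:
        asyncio.run(load_environment_settings(db_client, "u1", "prod"))
    except ValueError:
        return
    raise AssertionError("expected ValueError for missing settings")
```

Both functions follow the `test_*` naming convention, so pytest would collect them as-is. Endpoint-level tests would follow the same idea with FastAPI's `TestClient` and dependency overrides.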
## Common Development Scenarios
### Adding a New API Endpoint
1. Define Pydantic models in `app/models/`
2. Create endpoint in appropriate `app/api/v1/` router
3. Add business logic to `app/services/` if needed
4. Register router in `app/main.py` with `app.include_router()`
### Modifying DB API Integration
1. Update contract in [DB_API_CONTRACT.md](DB_API_CONTRACT.md)
2. Update Pydantic models in `app/models/`
3. Add/modify methods in [app/interfaces/db_api_client.py](app/interfaces/db_api_client.py)
### Adding RAG Backend Endpoints
1. Add endpoint config to `.env.example` and `app/config.py`
2. Update user settings model if needed (`app/models/settings.py`)
3. Modify `RagService` methods to use new endpoints
4. Update query endpoint logic in `app/api/v1/query.py`
## Security Considerations
- JWTs expire after 30 days (43,200 minutes); the lifetime is configurable via `JWT_EXPIRE_MINUTES`
- mTLS certificates are stored server-side only (never sent to client)
- CORS is currently set to `allow_origins=["*"]` - configure properly for production
- Never commit `.env` file (use `.env.example` as template)
- Secrets (JWT_SECRET_KEY, bearer tokens) must be kept secure
- User settings may contain sensitive tokens - handle with care
## Project Status
See [PROJECT_STATUS.md](PROJECT_STATUS.md) for detailed implementation status and TODOs. Key points:
- Core infrastructure is complete (auth, DB API client, RAG service)
- All main API endpoints are implemented
- TgBackendInterface is fully implemented (not a stub)
- Frontend integration pending (static/ directory is empty)
- No tests yet (tests/ directory is empty)