# Developer Guide
This file provides detailed technical documentation for developers working with this codebase.
## Project Overview
Brief Bench FastAPI is a FastAPI backend for a RAG (Retrieval-Augmented Generation) testing system with multi-user support. The system allows users to test RAG backends across three environments (IFT, PSI, PROD) in two modes: Bench mode (batch testing) and Backend mode (bot simulation).
## Key Concepts
### Multi-Environment Architecture
- **Three environments**: IFT, PSI, PROD (each with separate RAG backend hosts)
- **Per-user settings**: Each user has individual settings for each environment (stored in external DB API)
- **Server configuration**: RAG hosts, ports, endpoints, and mTLS cert paths are defined in `.env` (not user-editable)
### Two Query Modes
1. **Bench Mode**: Send all questions in a single batch request to RAG backend
   - Endpoint: `POST /api/v1/query/bench`
   - Uses headers: `Request-Id`, `System-Id`, `Authorization`, `System-Platform`
2. **Backend Mode**: Send questions one-by-one, simulating bot behavior
   - Endpoint: `POST /api/v1/query/backend`
   - Can reset session after each question (controlled by `resetSessionMode` setting)
   - Uses headers: `Platform-User-Id`, `Platform-Id`, `Authorization`
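
The two header sets above can be sketched as a single helper. This is an illustration only: the function name `build_rag_headers` is hypothetical, and the mapping from user settings to header values (the `Bearer` prefix, a fresh UUID for `Request-Id`, and `systemPlatformUser` feeding `System-Id`) is an assumption, not confirmed by the codebase.

```python
import uuid


def build_rag_headers(mode: str, settings: dict) -> dict:
    """Return the headers for one RAG request in the given mode.

    `settings` is a user's per-environment settings dict. Which setting
    feeds which header is assumed here for illustration.
    """
    auth = f"Bearer {settings['bearerToken']}"  # assumption: Bearer scheme
    if mode == "bench":
        return {
            "Request-Id": str(uuid.uuid4()),  # assumption: new UUID per request
            "System-Id": settings["systemPlatformUser"],  # assumption
            "Authorization": auth,
            "System-Platform": settings["systemPlatform"],
        }
    if mode == "backend":
        return {
            "Platform-User-Id": settings["platformUserId"],
            "Platform-Id": settings["platformId"],
            "Authorization": auth,
        }
    raise ValueError(f"unknown mode: {mode!r}")
```

Rejecting unknown modes with `ValueError` keeps a typo in `apiMode` from silently producing a request with no mode-specific headers.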
### Authentication Flow
1. User sends `POST /api/v1/auth/login?login=12345678` (8-digit login)
2. FastAPI forwards to DB API: `POST /users/login`
3. DB API returns user info (user_id, login, timestamps)
4. FastAPI generates JWT token (30-day expiration) with {user_id, login, exp}
5. Returns {access_token, user} to client
6. All subsequent requests use `Authorization: Bearer {token}`
7. The `get_current_user` dependency validates the JWT on protected endpoints
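
Steps 4 and 7 can be sketched with the standard library alone. The project most likely uses a JWT library in `app/utils/`; this sketch only shows the shape of a token carrying `{user_id, login, exp}` with a 30-day lifetime. The HS256 algorithm and the names `create_token` and `verify_token` are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

THIRTY_DAYS = 30 * 24 * 60 * 60  # token lifetime in seconds


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def create_token(user_id: str, login: str, secret: str, now: Optional[int] = None) -> str:
    """Build an HS256-signed token with the claims listed in step 4."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"user_id": user_id, "login": login, "exp": issued + THIRTY_DAYS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token: str, secret: str, now: Optional[int] = None) -> dict:
    """Check signature and expiry; return the claims or raise ValueError."""
    signing_input, _, signature = token.rpartition(".")
    expected = _sign(signing_input, secret)
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    current = int(time.time()) if now is None else now
    if payload["exp"] < current:
        raise ValueError("token expired")
    return payload
```

The signature is checked before the payload is parsed, so untrusted JSON is never decoded from a token that fails verification.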
## Development Commands
### Local Development
```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# venv\Scripts\activate # Windows (run this instead of the line above)
# Install dependencies
pip install -r requirements.txt
# Run development server (with auto-reload)
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
### Docker
```bash
# Build and run
docker-compose up -d
# View logs
docker-compose logs -f fastapi
# Stop
docker-compose down
```
### Environment Setup
```bash
# Copy example and edit
cp .env.example .env
# Required variables:
# - JWT_SECRET_KEY (generate a new secret!)
# - DB_API_URL (external DB API service URL)
# - IFT_RAG_HOST, PSI_RAG_HOST, PROD_RAG_HOST
# - Port and endpoint configs for each environment
# - Optional: mTLS certificate paths (certs/ directory)
```
## Architecture Details
### Layered Structure
```
app/
├── api/v1/ # API endpoints (FastAPI routers)
├── models/ # Pydantic models for validation
├── services/ # Business logic layer
├── interfaces/ # External API clients
├── middleware/ # Request/response middleware
├── utils/ # Utilities (JWT, security)
├── config.py # Settings from .env (pydantic-settings)
├── dependencies.py # Dependency injection
└── main.py # FastAPI app entry point
```
### TgBackendInterface Pattern
The `TgBackendInterface` class ([app/interfaces/base.py](app/interfaces/base.py)) is the base for all HTTP API clients:
- Wraps `httpx.AsyncClient` with Pydantic model serialization/deserialization
- Provides `get()`, `post()`, `put()`, `delete()` methods
- Handles error logging and HTTP status errors
- Auto-serializes Pydantic models to JSON and deserializes JSON to Pydantic models
- Initialized with `api_prefix` (base URL)
**DBApiClient** ([app/interfaces/db_api_client.py](app/interfaces/db_api_client.py)) extends this for DB API integration:
- `login_user()` - authenticate user
- `get_user_settings()`, `update_user_settings()` - manage per-user settings
- `save_session()`, `get_sessions()`, `get_session()`, `delete_session()` - analysis sessions
See [DB_API_CONTRACT.md](DB_API_CONTRACT.md) for full DB API specification.
### RagService Pattern
The `RagService` class ([app/services/rag_service.py](app/services/rag_service.py)) manages RAG backend communication:
- Creates separate `httpx.AsyncClient` for each environment (IFT, PSI, PROD)
- Configures mTLS certificates from `.env` settings
- `send_bench_query()` - batch mode requests
- `send_backend_query()` - sequential requests with optional session reset
- Builds environment-specific headers from user settings
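
The sequential flow behind `send_backend_query()` can be sketched as a loop that sends one question at a time and optionally resets the session after each one. The callables `ask` and `reset_session` are hypothetical stand-ins for the real HTTP calls; the actual method signature in `RagService` may differ.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_backend_mode(
    questions: List[str],
    ask: Callable[[str], Awaitable[str]],
    reset_session: Callable[[], Awaitable[None]],
    reset_session_mode: bool,
) -> List[str]:
    """Send questions one by one, resetting the session after each if enabled."""
    answers: List[str] = []
    for question in questions:
        # Each question is awaited before the next is sent, as a bot would do.
        answers.append(await ask(question))
        if reset_session_mode:
            await reset_session()
    return answers
```

Passing the network calls in as parameters keeps the loop testable with fakes, which matches the unit-test approach of mocking all dependencies.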
### Dependency Injection
Located in [app/dependencies.py](app/dependencies.py):
- `get_db_client()` - Returns DBApiClient instance
- `get_current_user()` - JWT auth dependency: validates the Bearer token and returns {user_id, login}
Use in endpoints:
```python
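from fastapi import APIRouter, Depends
from app.dependencies import get_current_user, get_db_client
from app.interfaces.db_api_client import DBApiClient

router = APIRouter()

@router.get("/protected")
async def protected_endpoint(
    current_user: dict = Depends(get_current_user),
    db_client: DBApiClient = Depends(get_db_client),
):
    user_id = current_user["user_id"]
    # ...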
```
### User Settings vs Server Configuration
**Server Config (`.env`)**: RAG hosts, ports, endpoints, mTLS cert paths - applies to all users
**User Settings (DB API)**: Per-environment settings for each user:
- `apiMode`: "bench" or "backend"
- `bearerToken`: Authorization token for RAG backend
- `systemPlatform`, `systemPlatformUser`: Platform identifiers
- `platformUserId`, `platformId`: Platform-specific IDs
- `withClassify`: Enable classification in backend mode
- `resetSessionMode`: Reset session after each question in backend mode
## Important Implementation Notes
### mTLS Certificate Handling
- Certificate paths are configured per-environment in `.env`
- Format: `/app/certs/{env}/ca.crt`, `/app/certs/{env}/client.key`, `/app/certs/{env}/client.crt`
- RagService loads certificates from these paths when creating httpx clients
- Certificates are mounted read-only in Docker: `./certs:/app/certs:ro`
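
A minimal sketch of how the three paths could be derived and loaded, assuming the directory name under `/app/certs` is passed in as-is (whether it is `ift` or `IFT` is not stated here). The names `CertPaths`, `cert_paths`, and `build_ssl_context` are hypothetical; the real `RagService` reads the paths from `.env` and may configure its clients differently.

```python
import ssl
from dataclasses import dataclass


@dataclass(frozen=True)
class CertPaths:
    ca: str
    key: str
    cert: str


def cert_paths(env_dir: str, root: str = "/app/certs") -> CertPaths:
    """Build the CA, client key, and client cert paths for one environment."""
    base = f"{root}/{env_dir}"
    return CertPaths(
        ca=f"{base}/ca.crt",
        key=f"{base}/client.key",
        cert=f"{base}/client.crt",
    )


def build_ssl_context(paths: CertPaths) -> ssl.SSLContext:
    """Create a TLS context that trusts the CA and presents the client cert."""
    context = ssl.create_default_context(cafile=paths.ca)
    context.load_cert_chain(certfile=paths.cert, keyfile=paths.key)
    return context
```

The resulting context can be passed as `verify=` when constructing an `httpx.AsyncClient`. `build_ssl_context` needs the certificate files to exist, so it only runs where the `certs/` directory is mounted.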
### Error Handling
- All API client methods can raise `httpx.HTTPStatusError` for HTTP errors
- Endpoints should catch these and convert to FastAPI `HTTPException`
- Use `status.HTTP_502_BAD_GATEWAY` for external service errors
- Use `status.HTTP_500_INTERNAL_SERVER_ERROR` for unexpected errors
- Always log errors with context (user_id, environment, request details)
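
The mapping described above can be sketched without FastAPI or httpx installed. `UpstreamHTTPError` is a stand-in for `httpx.HTTPStatusError`, and `to_error_response`, the logger name, and the detail strings are hypothetical; in a real endpoint the returned pair would go into `HTTPException(status_code=..., detail=...)`.

```python
import logging
from http import HTTPStatus
from typing import Tuple

logger = logging.getLogger("brief_bench.errors")  # hypothetical logger name


class UpstreamHTTPError(Exception):
    """Stand-in for httpx.HTTPStatusError so the sketch has no dependencies."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"upstream returned {status_code}")
        self.status_code = status_code


def to_error_response(exc: Exception, *, user_id: str, environment: str) -> Tuple[int, str]:
    """Log the error with context and pick the status code for the client."""
    if isinstance(exc, UpstreamHTTPError):
        logger.error(
            "external service error: user_id=%s environment=%s upstream_status=%s",
            user_id, environment, exc.status_code,
        )
        return HTTPStatus.BAD_GATEWAY.value, "External service error"
    logger.error(
        "unexpected error: user_id=%s environment=%s",
        user_id, environment, exc_info=exc,
    )
    return HTTPStatus.INTERNAL_SERVER_ERROR.value, "Internal server error"
```

The upstream status code is logged but not forwarded, so a 404 from a RAG backend does not look like a missing FastAPI route to the client.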
### Async Context Managers
Both `TgBackendInterface` and `RagService` can be used as async context managers:
```python
async with RagService() as rag_service:
    response = await rag_service.send_bench_query(...)
# Client automatically closed
```
Or manually close:
```python
rag_service = RagService()
try:
    response = await rag_service.send_bench_query(...)
finally:
    await rag_service.close()
```
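
For reference, the protocol behind both usage styles can be sketched with a hypothetical `ClosableService` class. It is not the project's implementation; it only shows how `__aenter__` and `__aexit__` delegate to `close()` so that the two styles behave the same.

```python
import asyncio


class ClosableService:
    """Hypothetical stand-in for a service that owns a network client."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        # A real service would await its HTTP client's aclose() here.
        self.closed = True

    async def __aenter__(self) -> "ClosableService":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and when the body raises; returning None
        # lets any exception propagate to the caller.
        await self.close()


async def demo() -> bool:
    async with ClosableService() as service:
        assert not service.closed
    return service.closed
```

Running `asyncio.run(demo())` shows the service is closed once the `async with` block exits.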
### Long-Running Requests
- RAG requests can take up to 30 minutes (`RAG_REQUEST_TIMEOUT=1800`)
- `RagService` applies this 30-minute timeout to its httpx clients
- The DB API client has its own, much shorter timeout (`DB_API_TIMEOUT`, default 30 seconds)
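
A minimal sketch of reading the two timeouts, using only the standard library rather than the pydantic-settings class in `app/config.py`. The names `Timeouts` and `load_timeouts` are hypothetical, and treating 1800 as the fallback for `RAG_REQUEST_TIMEOUT` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Timeouts:
    rag_seconds: float
    db_api_seconds: float


def load_timeouts(env: Mapping[str, str] = os.environ) -> Timeouts:
    """Read both timeouts in seconds, falling back to the documented values."""
    return Timeouts(
        rag_seconds=float(env.get("RAG_REQUEST_TIMEOUT", "1800")),  # 30 minutes
        db_api_seconds=float(env.get("DB_API_TIMEOUT", "30")),
    )
```

Keeping the two values separate means a slow RAG batch cannot be confused with a hung DB API call, which should fail fast.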
## API Endpoints
### Authentication
- `POST /api/v1/auth/login?login={8-digit}` - Login with 8-digit code, returns JWT token
### Settings
- `GET /api/v1/settings` - Get current user's settings for all environments
- `PUT /api/v1/settings` - Update user settings
### Query
- `POST /api/v1/query/bench` - Send batch query (bench mode)
- `POST /api/v1/query/backend` - Send sequential queries (backend mode)
### Analysis Sessions
- `POST /api/v1/analysis/sessions` - Save analysis session
- `GET /api/v1/analysis/sessions` - List sessions (filterable by environment)
- `GET /api/v1/analysis/sessions/{session_id}` - Get specific session
- `DELETE /api/v1/analysis/sessions/{session_id}` - Delete session
### Health
- `GET /health` - Health check
- `GET /` - API info
## Testing Strategy
The project has comprehensive test coverage across three levels:
### Test Pyramid
1. **Unit Tests** (119 tests, 99% coverage) - `tests/unit/`
   - Fast, isolated tests with all dependencies mocked
   - Test business logic, models, utilities in isolation
   - Run constantly during development
   - Command: `.\run_unit_tests.bat` or `pytest tests/unit/ -m unit`
2. **Integration Tests** (DB API integration) - `tests/integration/`
   - Test FastAPI endpoints with real DB API
   - Requires DB API service running
   - Mock RAG backends (only DB integration tested)
   - Run before commits
   - Command: `.\run_integration_tests.bat` or `pytest tests/integration/ -m integration`
   - Configuration: `tests/integration/.env.integration` (see `.env.integration.example`)
3. **End-to-End Tests** (Full stack) - `tests/e2e/`
   - Test complete workflows from auth to RAG queries
   - Requires ALL services: FastAPI + DB API + RAG backends
   - Real network calls, no mocking
   - Run before deployment
   - Command: `.\run_e2e_tests.bat` or `pytest tests/e2e/ -m e2e`
   - Configuration: `tests/e2e/.env.e2e` (see `.env.e2e.example`)
### Running Tests
```bash
# All tests (unit + integration)
.\run_all_tests.bat
# Specific test level
.\run_unit_tests.bat
.\run_integration_tests.bat
.\run_e2e_tests.bat
# Using pytest markers
pytest -m unit # Unit tests only
pytest -m integration # Integration tests only
pytest -m e2e # E2E tests only
pytest -m e2e_ift # E2E for IFT environment only
```
### Test Documentation
See [TESTING.md](TESTING.md) for a comprehensive testing guide, including:
- Detailed setup instructions for each test level
- Environment configuration
- Troubleshooting common issues
- CI/CD integration examples
- Best practices
### When Adding New Features
1. **Unit Tests**: Test business logic in isolation (always required)
2. **Integration Tests**: Test DB API interaction if feature uses DB
3. **E2E Tests**: Add workflow test if feature is user-facing
4. **Run all tests**: Verify nothing broke before committing
## Common Development Scenarios
### Adding a New API Endpoint
1. Define Pydantic models in `app/models/`
2. Create endpoint in appropriate `app/api/v1/` router
3. Add business logic to `app/services/` if needed
4. Register router in `app/main.py` with `app.include_router()`
### Modifying DB API Integration
1. Update contract in [DB_API_CONTRACT.md](DB_API_CONTRACT.md)
2. Update Pydantic models in `app/models/`
3. Add/modify methods in [app/interfaces/db_api_client.py](app/interfaces/db_api_client.py)
### Adding RAG Backend Endpoints
1. Add endpoint config to `.env.example` and `app/config.py`
2. Update user settings model if needed (`app/models/settings.py`)
3. Modify `RagService` methods to use new endpoints
4. Update query endpoint logic in `app/api/v1/query.py`
## Security Considerations
- JWT tokens have 30-day expiration (configurable via `JWT_EXPIRE_MINUTES`)
- mTLS certificates are stored server-side only (never sent to client)
- CORS is currently set to `allow_origins=["*"]` - configure properly for production
- Never commit `.env` file (use `.env.example` as template)
- Secrets (JWT_SECRET_KEY, bearer tokens) must be kept secure
- User settings may contain sensitive tokens - handle with care
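
One way to handle such tokens with care is to redact them before settings reach a log line. The helper below is a hypothetical sketch; treating `bearerToken` as the only sensitive key is an assumption and should be extended to match the real settings model.

```python
SENSITIVE_KEYS = {"bearerToken"}  # assumption: the only secret-bearing setting


def redact_settings(settings: dict, keep: int = 4) -> dict:
    """Return a copy of `settings` that is safe to log."""
    redacted = dict(settings)
    for key in SENSITIVE_KEYS:
        if key in redacted:
            value = str(redacted[key])
            # Keep a short prefix for debugging; hide short values entirely.
            redacted[key] = (value[:keep] + "***") if len(value) > keep else "***"
    return redacted
```

The function copies the dict instead of mutating it, so the original settings can still be used to build request headers after logging.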
## Project Status
See [PROJECT_STATUS.md](PROJECT_STATUS.md) for detailed implementation status and TODOs. Key points:
- Core infrastructure is complete (auth, DB API client, RAG service)
- All main API endpoints are implemented
- TgBackendInterface is fully implemented
- Unit, integration, and E2E test suites are in place (99% coverage from unit tests)
- Frontend integration complete