# End-to-End (E2E) Tests
End-to-end tests for the complete Brief Bench FastAPI system, testing the entire stack from authentication through RAG queries to data persistence.
## Overview
E2E tests validate:
- Complete user workflows from login to query results
- Integration between FastAPI backend, DB API, and RAG backends
- Real API calls to all external services (no mocking)
- Data persistence and retrieval
- Cross-environment functionality (IFT, PSI, PROD)
- Error handling and edge cases
## Prerequisites
**CRITICAL**: All external services must be running before E2E tests can execute.
### Required Services
1. **DB API** (database service)
- Must be running and accessible
- Default: `http://localhost:8081`
- Health check: `GET http://localhost:8081/health`
2. **RAG Backends** (one or more environments)
- **IFT RAG**: Development/test environment
- **PSI RAG**: Backend mode testing
- **PROD RAG**: Production-like testing
- Each environment needs its own RAG backend server running
3. **Test User Account**
- A valid 8-digit test login must exist in DB API
- Recommended: Use a dedicated test account (e.g., `99999999`)
- This account will be used for all E2E test operations
### Service Availability Check
E2E tests automatically check prerequisites before running:
- DB API health endpoint
- RAG backend host configurations
- Test credentials presence
If any prerequisite is not met, tests will be **skipped** with a detailed error message.
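This check lives in the `check_prerequisites` fixture in `conftest.py`. A minimal sketch of what such a check can look like, assuming `httpx` is available; the fixture body and the health-URL derivation here are assumptions, not the actual implementation:
```python
# conftest.py (sketch) -- the real fixture may check more services.
import os

import httpx
import pytest


@pytest.fixture(scope="session", autouse=True)
def check_prerequisites():
    """Skip the whole E2E run if required services or credentials are missing."""
    db_api_url = os.getenv("E2E_DB_API_URL", "http://localhost:8081/api/v1")
    # Assumption: the health endpoint lives next to the API root.
    health_url = db_api_url.rsplit("/api", 1)[0] + "/health"
    try:
        httpx.get(health_url, timeout=5).raise_for_status()
    except httpx.HTTPError as exc:
        pytest.skip(f"DB API is not available at {health_url}: {exc}")
    if not os.getenv("E2E_TEST_LOGIN"):
        pytest.skip("E2E_TEST_LOGIN is not set; see tests/e2e/.env.e2e.example")
```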
## Environment Configuration
### Create `.env.e2e` File
Copy the example file and configure for your environment:
```bash
cp tests/e2e/.env.e2e.example tests/e2e/.env.e2e
```
### Required Environment Variables
Edit `tests/e2e/.env.e2e` with your configuration:
```bash
# DB API Configuration
E2E_DB_API_URL=http://localhost:8081/api/v1
# Test User Credentials
E2E_TEST_LOGIN=99999999 # 8-digit test user login
# IFT Environment Settings (Bench Mode)
E2E_IFT_RAG_HOST=ift-rag.example.com
E2E_IFT_BEARER_TOKEN=your_ift_bearer_token_here
E2E_IFT_SYSTEM_PLATFORM=telegram
E2E_IFT_SYSTEM_PLATFORM_USER=test_user_ift
# PSI Environment Settings (Backend Mode)
E2E_PSI_RAG_HOST=psi-rag.example.com
E2E_PSI_BEARER_TOKEN=your_psi_bearer_token_here
E2E_PSI_PLATFORM_USER_ID=test_user_psi
E2E_PSI_PLATFORM_ID=telegram
# PROD Environment Settings (Bench Mode)
E2E_PROD_RAG_HOST=prod-rag.example.com
E2E_PROD_BEARER_TOKEN=your_prod_bearer_token_here
E2E_PROD_SYSTEM_PLATFORM=telegram
E2E_PROD_SYSTEM_PLATFORM_USER=test_user_prod
```
### Security Note
⚠️ **NEVER commit `.env.e2e` to version control!**
The `.env.e2e` file contains:
- Real bearer tokens for RAG backends
- Production-like credentials
- Sensitive configuration
Always use `.env.e2e.example` as a template.
## Running E2E Tests
### Prerequisites Check
First, ensure all services are running:
```bash
# Check DB API
curl http://localhost:8081/health
# Check that your .env.e2e is configured
cat tests/e2e/.env.e2e
```
### Run All E2E Tests
```bash
# Activate virtual environment
.venv\Scripts\activate # Windows
source .venv/bin/activate # Linux/Mac
# Run all E2E tests
pytest tests/e2e/ -v -m e2e
# Run with detailed output
pytest tests/e2e/ -v -m e2e -s
```
### Run Specific Test Categories
```bash
# Run only IFT environment tests
pytest tests/e2e/ -v -m e2e_ift
# Run only PSI environment tests
pytest tests/e2e/ -v -m e2e_psi
# Run only PROD environment tests
pytest tests/e2e/ -v -m e2e_prod
# Run only workflow tests
pytest tests/e2e/test_full_flow_e2e.py -v
# Run only error scenario tests
pytest tests/e2e/test_error_scenarios_e2e.py -v
# Run only RAG backend tests
pytest tests/e2e/test_rag_backends_e2e.py -v
```
### Run Individual Test
```bash
# Run a specific test function
pytest tests/e2e/test_full_flow_e2e.py::TestCompleteUserFlow::test_full_workflow_bench_mode -v
```
### Useful pytest Options
```bash
# Show print statements
pytest tests/e2e/ -v -s
# Stop on first failure
pytest tests/e2e/ -v -x
# Show local variables on failure
pytest tests/e2e/ -v -l
# Run with coverage (not typical for E2E)
pytest tests/e2e/ -v --cov=app
# Increase timeout for slow RAG backends
pytest tests/e2e/ -v --timeout=300
```
## Test Structure
### Test Files
```
tests/e2e/
├── conftest.py # E2E fixtures and configuration
├── .env.e2e.example # Example environment variables
├── .env.e2e # Your actual config (not in git)
├── README.md # This file
├── test_full_flow_e2e.py # Complete user workflow tests
├── test_rag_backends_e2e.py # RAG backend integration tests
└── test_error_scenarios_e2e.py # Error handling and edge cases
```
### Test Markers
E2E tests use pytest markers for categorization:
- `@pytest.mark.e2e` - All E2E tests
- `@pytest.mark.e2e_ift` - IFT environment specific
- `@pytest.mark.e2e_psi` - PSI environment specific
- `@pytest.mark.e2e_prod` - PROD environment specific
### Fixtures
Key fixtures from `conftest.py`:
- `check_prerequisites` - Verifies all services are available
- `e2e_client` - FastAPI TestClient instance
- `e2e_auth_headers` - Authenticated headers with JWT token
- `setup_test_settings` - Configures user settings for all environments
- `cleanup_test_sessions` - Removes test data after each test
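Taken together, a typical E2E test combines a marker with these fixtures. A hedged sketch (the endpoint path is taken from the Manual Cleanup section below; the response body is not asserted here):
```python
import pytest


@pytest.mark.e2e
@pytest.mark.e2e_ift
def test_list_sessions_for_test_user(e2e_client, e2e_auth_headers):
    """Authenticated session listing works end to end."""
    # e2e_auth_headers carries the JWT for the dedicated test login.
    response = e2e_client.get("/api/v1/analysis/sessions", headers=e2e_auth_headers)
    assert response.status_code == 200
```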
## Test Coverage
### Complete User Workflows (`test_full_flow_e2e.py`)
1. **Full Workflow - Bench Mode**
- Authenticate → Get settings → Send bench query → Save session → Retrieve → Delete
2. **Full Workflow - Backend Mode**
- Authenticate → Verify PSI settings → Send backend query → Save session
3. **Settings Change Affects Queries**
- Change settings → Verify mode compatibility → Restore settings
4. **Multiple Sessions Management**
- Create sessions for all environments → List → Filter → Delete all
5. **User Data Isolation**
- Verify authentication requirements → Test access controls
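As an illustration of workflow 4, a simplified sketch of the list-then-delete portion. Assumptions: the list endpoint returns a JSON array and each session carries an `id` field; adjust to the real response shape:
```python
import pytest


@pytest.mark.e2e
def test_sessions_list_and_delete(e2e_client, e2e_auth_headers):
    """List the test user's sessions, then delete each one."""
    response = e2e_client.get(
        "/api/v1/analysis/sessions",
        params={"limit": 1000},
        headers=e2e_auth_headers,
    )
    assert response.status_code == 200
    sessions = response.json()  # assumption: a JSON array of session objects

    for session in sessions:
        deleted = e2e_client.delete(
            f"/api/v1/analysis/sessions/{session['id']}",
            headers=e2e_auth_headers,
        )
        assert deleted.status_code in (200, 204)
```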
### RAG Backend Tests (`test_rag_backends_e2e.py`)
1. **Environment-Specific Queries**
- IFT bench mode queries
- PSI backend mode queries
- PROD bench mode queries
2. **Backend Mode Features**
- Session management
- Session reset functionality
- Sequential question processing
3. **Query Parameters**
- `with_docs` parameter handling
- Multiple questions in one request
- Cross-environment queries
### Error Scenarios (`test_error_scenarios_e2e.py`)
1. **Authentication Errors**
- Missing auth token
- Invalid JWT token
- Malformed authorization header
2. **Validation Errors**
- Invalid environment names
- Empty questions list
- Missing required fields
- Invalid data structures
3. **Mode Compatibility**
- Bench query with backend mode settings
- Backend query with bench mode settings
4. **Resource Not Found**
- Nonexistent session IDs
- Invalid UUID formats
5. **Settings Errors**
- Invalid API modes
- Invalid environments
6. **Edge Cases**
- Very long questions
- Special characters
- Large number of questions
- Pagination edge cases
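For example, the authentication-error cases boil down to asserting that protected endpoints reject bad credentials with 401 (as described under Troubleshooting below). A sketch, reusing the session-listing endpoint:
```python
import pytest


@pytest.mark.e2e
def test_missing_and_invalid_tokens_are_rejected(e2e_client):
    """Requests without a valid JWT must not reach protected endpoints."""
    # No Authorization header at all.
    response = e2e_client.get("/api/v1/analysis/sessions")
    assert response.status_code == 401

    # Header present, but the token is garbage.
    response = e2e_client.get(
        "/api/v1/analysis/sessions",
        headers={"Authorization": "Bearer not-a-real-jwt"},
    )
    assert response.status_code == 401
```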
## Timeouts
E2E tests use generous timeouts due to real RAG backend processing:
- **Default query timeout**: 120 seconds (2 minutes)
- **Large batch queries**: 180 seconds (3 minutes)
- **DB API operations**: 30 seconds
If tests timeout frequently, check:
1. RAG backend performance
2. Network connectivity
3. Server load
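The `--timeout` option shown earlier comes from the pytest-timeout plugin; if you use it, a per-test value can also be set with a marker, for example:
```python
import pytest


@pytest.mark.e2e
@pytest.mark.timeout(180)  # pytest-timeout: allow up to 3 minutes for a large batch
def test_large_batch_query(e2e_client, e2e_auth_headers):
    ...
```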
## Cleanup
Tests automatically clean up after themselves:
- `cleanup_test_sessions` fixture removes all sessions created during tests
- Each test is isolated and doesn't affect other tests
- Failed tests may leave orphaned sessions (check manually if needed)
### Manual Cleanup
If needed, clean up test data manually:
```http
# Get all sessions for test user
GET /api/v1/analysis/sessions?limit=1000
# Delete specific session
DELETE /api/v1/analysis/sessions/{session_id}
```
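Or as a small script, assuming `httpx`, a valid JWT for the test user, and that the FastAPI app is reachable at the URL below (both the base URL and the `id` field are assumptions; adjust to your setup):
```python
# manual_cleanup.py (sketch) -- adjust BASE_URL and token handling to your setup.
import httpx

BASE_URL = "http://localhost:8000/api/v1"  # assumption: local FastAPI instance
TOKEN = "..."  # paste a valid JWT for the test user

headers = {"Authorization": f"Bearer {TOKEN}"}

# Fetch every session belonging to the test user.
sessions = httpx.get(
    f"{BASE_URL}/analysis/sessions", params={"limit": 1000}, headers=headers
).json()

# Delete them one by one (assumption: each item exposes an `id` field).
for session in sessions:
    httpx.delete(f"{BASE_URL}/analysis/sessions/{session['id']}", headers=headers)
    print(f"deleted {session['id']}")
```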
## Troubleshooting
### Tests Are Skipped
**Symptom**: All tests show as "SKIPPED"
**Causes**:
1. DB API not running
2. RAG backends not configured
3. Missing `.env.e2e` file
4. Test user doesn't exist
**Solution**: Check prerequisite error messages for details.
### Authentication Failures
**Symptom**: Tests fail with 401 Unauthorized
**Causes**:
1. Test user doesn't exist in DB API
2. Invalid test login in `.env.e2e`
3. JWT secret mismatch between environments
**Solution**: Verify test user exists and credentials are correct.
### Timeout Errors
**Symptom**: Tests timeout during RAG queries
**Causes**:
1. RAG backend is slow or overloaded
2. Network issues
3. Invalid bearer tokens
4. mTLS certificate problems
**Solution**:
- Check RAG backend health
- Verify bearer tokens are valid
- Increase timeout values if needed
### Connection Refused
**Symptom**: Connection errors to services
**Causes**:
1. Service not running
2. Wrong host/port in configuration
3. Firewall blocking connections
**Solution**: Verify all services are accessible and configuration is correct.
### Validation Errors (422)
**Symptom**: Tests fail with 422 Unprocessable Entity
**Causes**:
1. Incorrect data format in `.env.e2e`
2. Missing required settings
3. Invalid enum values
**Solution**: Check `.env.e2e.example` for correct format.
## Best Practices
### When to Run E2E Tests
- **Before deploying**: Always run E2E tests before production deployment
- **After major changes**: Run when modifying API endpoints or services
- **Regularly in CI/CD**: Set up automated E2E testing in your pipeline
- **Not during development**: Use unit/integration tests for rapid feedback
### Test Data Management
- Use dedicated test user account (not production users)
- Tests create and delete their own data
- Don't rely on existing data in DB
- Clean up manually if tests fail catastrophically
### Performance Considerations
- E2E tests are slow (real network calls)
- Run unit/integration tests first
- Consider running E2E tests in parallel (e.g., with pytest-xdist), but with caution: all tests share one test account and may interfere with each other
- Use environment-specific markers to run subset of tests
### CI/CD Integration
Example GitHub Actions workflow:
```yaml
e2e-tests:
  runs-on: ubuntu-latest
  services:
    db-api:
      image: your-db-api:latest
      ports:
        - 8081:8081
  steps:
    - uses: actions/checkout@v3
    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.11'
    - name: Install dependencies
      run: pip install -r requirements.txt
    - name: Create .env.e2e
      run: |
        echo "E2E_DB_API_URL=${{ secrets.E2E_DB_API_URL }}" > tests/e2e/.env.e2e
        echo "E2E_TEST_LOGIN=${{ secrets.E2E_TEST_LOGIN }}" >> tests/e2e/.env.e2e
        # ... other env vars
    - name: Run E2E tests
      run: pytest tests/e2e/ -v -m e2e
```
## Contributing
When adding new E2E tests:
1. Add test to appropriate file (workflow/backend/error)
2. Use existing fixtures from `conftest.py`
3. Add cleanup logic if creating persistent data
4. Document any new environment variables
5. Use appropriate pytest markers
6. Add realistic timeout values
7. Test both success and failure paths
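A skeleton that follows these conventions, using the fixture and marker names above (the endpoint and timeout value are placeholders):
```python
import pytest


@pytest.mark.e2e
@pytest.mark.e2e_ift  # pick the marker(s) for the environments the test touches
@pytest.mark.timeout(120)  # realistic value for real RAG calls (pytest-timeout)
def test_my_new_scenario(e2e_client, e2e_auth_headers, cleanup_test_sessions):
    """Document the scenario; add a sibling test for the failure path."""
    response = e2e_client.get("/api/v1/analysis/sessions", headers=e2e_auth_headers)
    assert response.status_code == 200
    # Sessions created here are removed afterwards by cleanup_test_sessions.
```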
## Related Documentation
- [Integration Tests](../integration/README.md) - Tests for DB API integration only
- [Unit Tests](../unit/) - Fast isolated tests
- [DB API Contract](../../DB_API_CONTRACT.md) - External DB API specification