
Configuration Guide

This guide covers all configuration options available in the API Reliability Suite. The application uses Pydantic Settings for robust environment variable management and validation.


📄 Environment Variables

All settings can be configured via environment variables or a .env file in the project root. When ENVIRONMENT is set to staging or production, the app enforces a non-default SECRET_KEY and a shared RATE_LIMIT_STORAGE_URI (for example Redis rather than the default memory://).
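The staging/production rule above can be illustrated with a minimal standard-library sketch. The helper name production_guard_errors is hypothetical, and treating any memory:// URI as "not shared" is an assumption; the actual check lives in src/core/config.py and may differ.

```python
from typing import Mapping

DEFAULT_SECRET_KEY = "change-me-in-production"  # default listed in this guide
STRICT_ENVIRONMENTS = {"staging", "production"}


def production_guard_errors(env: Mapping[str, str]) -> list[str]:
    """Return the reasons a configuration would be rejected (hypothetical helper)."""
    errors: list[str] = []
    environment = env.get("ENVIRONMENT", "development").lower()
    if environment not in STRICT_ENVIRONMENTS:
        return errors  # development and test keep the relaxed defaults

    if env.get("SECRET_KEY", DEFAULT_SECRET_KEY) == DEFAULT_SECRET_KEY:
        errors.append("SECRET_KEY must be changed from its default")

    # Assumption: "shared" means anything other than per-process memory storage.
    if env.get("RATE_LIMIT_STORAGE_URI", "memory://").startswith("memory://"):
        errors.append("RATE_LIMIT_STORAGE_URI must point at shared storage")
    return errors
```

Calling production_guard_errors(os.environ) before deployment would list any violations without starting the app.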

Core Application Settings

The following settings are defined in src/core/config.py:

| Variable | Default | Description |
| --- | --- | --- |
| PROJECT_NAME | "API Reliability Suite" | Application name, used in logs and as the OpenTelemetry Service Name. |
| ENVIRONMENT | "development" | Deployment environment (development, test, staging, production). |
| DEBUG | False | Enable debug mode. |
| LOG_LEVEL | "info" | Logging level (debug, info, warning, error, critical). |
| LOG_FILE_PATH | "app.json" | Path to the structured log file used by the AI summarizer and file logging handler. |
| DATABASE_URL | "sqlite+aiosqlite:///./data/reliability_suite.db" | SQLAlchemy database URL. Use Postgres for shared or production-style environments. |
| DATABASE_ECHO | False | Enables SQLAlchemy SQL logging. |
| SEED_DEMO_USER | True | Seeds the demo admin account on startup for local runs. |
| SECRET_KEY | "change-me-in-production" | Secret key used for JWT signing. Must be changed for production! |
| ACCESS_TOKEN_EXPIRE_MINUTES | 30 | JWT token expiration time in minutes. |
| REFRESH_TOKEN_EXPIRE_DAYS | 14 | Refresh-token lifetime used for session rotation. |
| RATE_LIMIT_STORAGE_URI | "memory://" | Rate limit storage backend (use Redis in shared environments). |
| RATE_LIMIT_HEADERS_ENABLED | False | Adds standard rate limit headers to responses. |
| RATE_LIMIT_IN_MEMORY_FALLBACK_ENABLED | False | Allow in-memory fallback if storage is unavailable. |
| RATE_LIMIT_KEY_PREFIX | "api-reliability-suite" | Prefix for rate limit keys in shared storage. |
| TRUSTED_HOSTS | "*" | Comma-separated public hostnames allowed by TrustedHostMiddleware. |
| CORS_ALLOW_ORIGINS | "" | Comma-separated origins allowed by CORS middleware. |
| HTTPS_REDIRECT_ENABLED | False | Redirect incoming http traffic to https. |
| SETTINGS_SECRETS_DIR | None | Optional secrets directory path (defaults to /run/secrets when present). |

Observability Configuration

| Variable | Default | Description |
| --- | --- | --- |
| OTLP_ENDPOINT | None | The OTLP collector endpoint (e.g., http://jaeger:4317). If not set, traces are exported to the console. |
| PROMETHEUS_BASE_URL | None | Prometheus API base URL used by /slo/report to retrieve recording-rule values. |
| CIRCUIT_BREAKER_CACHE_URL | None | Redis URL for cache-backed circuit-breaker fallback payloads. |
| CIRCUIT_BREAKER_CACHE_TTL_SECONDS | 300 | TTL for the cached upstream payload returned during degraded fallback. |
| SLO_TARGET_SUCCESS_RATIO | 0.99 | Availability target used in SLO/error-budget reporting. |
| SLO_TARGET_P99_LATENCY_SECONDS | 1.0 | Latency objective used in SLO/error-budget reporting. |
| HTTP_CLIENT_TIMEOUT_SECONDS | 10.0 | Default timeout for outbound HTTP requests. |
| HTTP_CLIENT_MAX_CONNECTIONS | 20 | Global connection cap for the shared outbound HTTP client. |
| HTTP_CLIENT_MAX_KEEPALIVE_CONNECTIONS | 10 | Keep-alive pool size for the shared outbound HTTP client. |
| LLM_REQUEST_TIMEOUT_SECONDS | 20.0 | Timeout for AI summarization requests. |
| LLM_HEALTHCHECK_TIMEOUT_SECONDS | 5.0 | Timeout for configured LLM provider readiness checks. |
| LLM_MAX_RETRIES | 2 | Retry count for provider SDK calls that support retries. |
| LLM_MAX_CONCURRENCY | 4 | Bulkhead limit for concurrent LLM summarization requests. |
| ENABLE_LLM_READINESS_CHECKS | True | Include configured LLM provider health in /ready. |
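To make the two SLO targets concrete, here is the conventional error-budget arithmetic as a small sketch. How /slo/report actually computes its figures is not shown in this guide, so the function names and the exact formula are assumptions, with the defaults taken from the settings listed here.

```python
def error_budget_remaining(observed_success_ratio: float, target: float = 0.99) -> float:
    """Fraction of the error budget still unspent (1.0 = untouched, below 0 = overspent)."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    allowed_failure = 1.0 - target  # the error budget, e.g. 1% of requests
    actual_failure = 1.0 - observed_success_ratio
    return 1.0 - actual_failure / allowed_failure


def latency_slo_met(observed_p99_seconds: float, objective_seconds: float = 1.0) -> bool:
    """True when the observed p99 latency is within the objective."""
    return observed_p99_seconds <= objective_seconds
```

With the default target of 0.99, an observed success ratio of 0.995 leaves about half of the budget, and 0.98 overspends it by roughly one full budget.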

AI/LLM Provider Keys

To use the AI-powered CLI Debugger or the /debug/summarize-errors endpoint, you must provide at least one of the following keys:

| Variable | Default | Description |
| --- | --- | --- |
| OPENAI_API_KEY | None | OpenAI API Key. |
| GROQ_API_KEY | None | Groq API Key. |
| GOOGLE_API_KEY | None | Google AI (Gemini) API Key. |

Note: The application automatically selects the first available provider in this order: GROQ_API_KEY, OPENAI_API_KEY, then GOOGLE_API_KEY.
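The selection order described in the note can be sketched as a first-match lookup. The function name select_provider_key is hypothetical, and treating a blank value as "not set" is an assumption about the app's behavior.

```python
import os
from typing import Mapping, Optional

# Priority order stated in this guide.
PROVIDER_KEY_ORDER = ("GROQ_API_KEY", "OPENAI_API_KEY", "GOOGLE_API_KEY")


def select_provider_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the first provider key that is set, or None."""
    env = os.environ if env is None else env
    for name in PROVIDER_KEY_ORDER:
        # Assumption: an empty or whitespace-only value counts as unset.
        if env.get(name, "").strip():
            return name
    return None
```

If both OPENAI_API_KEY and GOOGLE_API_KEY are set but GROQ_API_KEY is not, this returns "OPENAI_API_KEY".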


🔧 Configuration Files

.env File

Copy the provided example (if available) or create a .env file in the root directory:

PROJECT_NAME="My Reliability Template"
ENVIRONMENT="development"
LOG_LEVEL=debug
LOG_FILE_PATH=app.json
DATABASE_URL="postgresql+asyncpg://app:app@localhost:5432/reliability_suite"
SECRET_KEY=y0ur-5ecur3-k3y-h3r3
RATE_LIMIT_STORAGE_URI="redis://localhost:6379/0"
CIRCUIT_BREAKER_CACHE_URL="redis://localhost:6379/1"
PROMETHEUS_BASE_URL="http://localhost:9099"
GROQ_API_KEY=gsk_...

Logging Configuration (src/core/logging.py)

Logging is pre-configured with the following defaults:

  • Format: Structured JSON for files, human-readable console output.
  • Log File: Uses LOG_FILE_PATH (defaults to app.json) and rotates at 10MB, keeping 5 backups.
  • Enrichment: Automatically includes trace_id and span_id for every log entry if a trace is active.
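A rough equivalent of the file-logging defaults above, using only the standard library, might look like the following. This is a sketch and not the contents of src/core/logging.py: JsonFormatter and build_file_handler are hypothetical names, and 10MB is assumed to mean 10 * 1024 * 1024 bytes.

```python
import json
import logging
from logging.handlers import RotatingFileHandler


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # trace_id / span_id appear only if something attached them to the record.
        for field in ("trace_id", "span_id"):
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)


def build_file_handler(path: str = "app.json") -> RotatingFileHandler:
    """Rotate at roughly 10MB and keep 5 backups, matching the defaults above."""
    handler = RotatingFileHandler(
        path, maxBytes=10 * 1024 * 1024, backupCount=5, encoding="utf-8"
    )
    handler.setFormatter(JsonFormatter())
    return handler
```

In this sketch the trace fields are passed through the standard extra argument, for example logger.info("msg", extra={"trace_id": "..."}); the suite itself is described as adding them automatically when a trace is active.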

Tracing Configuration (src/core/tracing.py)

Tracing is handled via OpenTelemetry:

  • Exporter: OTLP (gRPC) if OTLP_ENDPOINT is set; otherwise, ConsoleSpanExporter.
  • Instrumentation: Automatically instruments the FastAPI app.


🧪 Testing Configuration

The test suite uses its own configuration, typically by overriding settings in tests/conftest.py or by setting environment variables for the test run.
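One way to scope an environment override to a single test, shown with the standard library only, is unittest.mock.patch.dict. The function current_environment is a hypothetical stand-in for a settings lookup and is not how tests/conftest.py is necessarily written.

```python
import os
from unittest import mock


def current_environment() -> str:
    """Stand-in for a settings lookup; uses the documented ENVIRONMENT default."""
    return os.environ.get("ENVIRONMENT", "development")


def test_environment_override_is_scoped() -> None:
    # The override applies only inside the with-block and is undone afterwards.
    with mock.patch.dict(os.environ, {"ENVIRONMENT": "test"}):
        assert current_environment() == "test"
```

Pydantic Settings reads environment variables when the settings object is instantiated, so an override like this only takes effect if the settings are created (or re-created) inside the patched block.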

To run tests with code coverage:

make test


🚀 Quick Config Commands

# Verify current settings (dump)
python -c "from src.core.config import settings; print(settings.model_dump())"
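Because the dump above prints every setting, including SECRET_KEY and any provider keys, it can be worth masking them before sharing the output. The helper below is a hypothetical sketch that masks any field whose name ends in _KEY; it does not cover credentials embedded in URLs such as DATABASE_URL.

```python
from typing import Any, Mapping


def redact(settings_dump: Mapping[str, Any]) -> dict[str, Any]:
    """Return a copy of the dump with non-empty *_KEY values masked."""
    redacted: dict[str, Any] = {}
    for name, value in settings_dump.items():
        is_secret = name.upper().endswith("_KEY")
        redacted[name] = "***" if is_secret and value else value
    return redacted
```

Assuming the settings object from the command above, usage would be print(redact(settings.model_dump())).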

🔐 Secrets Files

For Docker and Kubernetes deployments, you can provide secrets as files by mounting them under /run/secrets or setting SETTINGS_SECRETS_DIR to a custom path. Each secret file should be named after its setting, for example:

/run/secrets/SECRET_KEY
/run/secrets/RATE_LIMIT_STORAGE_URI
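The file-per-setting layout can be illustrated with a short sketch that reads such a directory into a dictionary. The app's own loading is handled by its settings class; read_secrets is a hypothetical helper, and stripping trailing whitespace from file contents is an assumption.

```python
from pathlib import Path


def read_secrets(secrets_dir: str = "/run/secrets") -> dict[str, str]:
    """Map each regular file in the directory to {file name: file contents}."""
    directory = Path(secrets_dir)
    if not directory.is_dir():
        return {}  # no secrets mounted
    return {
        entry.name: entry.read_text(encoding="utf-8").strip()
        for entry in sorted(directory.iterdir())
        if entry.is_file()
    }
```

This is mainly useful for checking that a mounted volume contains the expected file names before starting the container.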

🌐 Reverse Proxy Settings

When the API sits behind ingress or a TLS-terminating proxy, configure the middleware settings together:

  • TRUSTED_HOSTS=api.example.com
  • CORS_ALLOW_ORIGINS=https://frontend.example.com
  • HTTPS_REDIRECT_ENABLED=true

Each middleware is enabled only when its corresponding setting is configured.
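TRUSTED_HOSTS and CORS_ALLOW_ORIGINS are comma-separated strings, so they need to be split into lists before being handed to middleware. A minimal sketch of that parsing follows; split_csv is a hypothetical name, and ignoring blank entries is an assumption.

```python
def split_csv(raw: str) -> list[str]:
    """Split a comma-separated setting into trimmed, non-empty items."""
    return [item.strip() for item in raw.split(",") if item.strip()]
```

For example, split_csv("api.example.com, admin.example.com") yields two hostnames, while the empty CORS_ALLOW_ORIGINS default yields an empty list.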


📋 Production Readiness Standard

Operational Configuration Audit

  • [ ] Secret Management: Verify SECRET_KEY is not using the default value.
  • [ ] Persistent Identity Store: Point DATABASE_URL at Postgres or another server-grade relational database for shared environments.
  • [ ] Shared Rate Limiting: Use RATE_LIMIT_STORAGE_URI with Redis for distributed deployments.
  • [ ] Fallback Cache: Configure CIRCUIT_BREAKER_CACHE_URL with Redis for cache-backed degraded responses.
  • [ ] Log Level Alignment: Confirm LOG_LEVEL is set to info or warning for production stability.
  • [ ] LLM Connectivity: Ensure at least one valid API key for an LLM provider is present in the environment, the .env file, or a secrets file.
  • [ ] Tracing Setup: Verify OTLP_ENDPOINT points to a valid collector if distributed tracing is required.
  • [ ] Trusted Hosts: Replace TRUSTED_HOSTS=* with the real public hostnames for the deployment.
  • [ ] CORS Policy: Restrict CORS_ALLOW_ORIGINS to the frontends that actually call the API.
  • [ ] Dependency Checks: Keep ENABLE_LLM_READINESS_CHECKS=true when AI summarization is a required dependency.
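Several items on this checklist can be checked mechanically from environment values. The sketch below is a hypothetical helper, not part of the suite. It uses the defaults listed in this guide and leaves out SECRET_KEY and RATE_LIMIT_STORAGE_URI, which the app already enforces in staging and production.

```python
from typing import Mapping

# Defaults copied from the settings listed in this guide.
DEFAULTS = {
    "DATABASE_URL": "sqlite+aiosqlite:///./data/reliability_suite.db",
    "LOG_LEVEL": "info",
    "TRUSTED_HOSTS": "*",
}
LLM_KEYS = ("GROQ_API_KEY", "OPENAI_API_KEY", "GOOGLE_API_KEY")


def audit_findings(env: Mapping[str, str]) -> list[str]:
    """Return checklist items that look unfinished (hypothetical helper)."""

    def value(name: str) -> str:
        return env.get(name, DEFAULTS.get(name, "")).strip()

    findings: list[str] = []
    if value("DATABASE_URL").startswith("sqlite"):
        findings.append("DATABASE_URL still points at SQLite")
    if value("LOG_LEVEL").lower() not in {"info", "warning"}:
        findings.append("LOG_LEVEL is not info or warning")
    if value("TRUSTED_HOSTS") == "*":
        findings.append("TRUSTED_HOSTS is still the wildcard")
    if "*" in [origin.strip() for origin in value("CORS_ALLOW_ORIGINS").split(",")]:
        findings.append("CORS_ALLOW_ORIGINS contains a wildcard")
    if not value("CIRCUIT_BREAKER_CACHE_URL"):
        findings.append("CIRCUIT_BREAKER_CACHE_URL is not set")
    if not any(value(key) for key in LLM_KEYS):
        findings.append("no LLM provider key is set")
    return findings
```

Running audit_findings(os.environ) in a deployment pipeline gives a quick list of open items. Whether a given finding matters (for example, the fallback cache or tracing) still depends on the deployment.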