# Deployment Guide
The API Reliability Suite is designed to be cloud-native and highly portable, utilizing Docker for containerization and Kubernetes for orchestration.
## 🐳 Containerization

### Dockerfile
The project uses a multi-stage Docker build to minimize the final image size and reduce the attack surface.
- Stage 1 (Builder): Installs Poetry and downloads all runtime dependencies.
- Stage 2 (Runtime): Copies the installed packages, the `src/` application code, and the runtime helper scripts into a pinned Python 3.12 slim image.
- Runtime User: The container drops root and runs as the dedicated `appuser` (UID/GID 10001).
### Building the Image
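A typical build invocation from the repository root might look like this; the image name and tag below are illustrative placeholders, not values mandated by the project:

```shell
# Build the multi-stage image; the final stage is the slim runtime image.
# Tag naming is up to you -- "local" here is just an example.
docker build -t api-reliability-suite:local .
```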
> [!NOTE]
> The local `docker-compose.yml` stack includes Postgres for user persistence, Redis for rate limiting and breaker fallback caching, Prometheus, Alertmanager, Jaeger, and Grafana. Production deployments should configure equivalent shared services through environment variables and secrets.
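Assuming a standard Compose setup, the full local stack can be started in one step:

```shell
# Build the API image and start all services (API, Postgres, Redis,
# Prometheus, Alertmanager, Jaeger, Grafana) in the background.
docker compose up -d --build
```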
## ☸️ Kubernetes (Infrastructure as Code)
Example manifests are located in infra/k8s/ to show a baseline deployment layout and scaling strategy.
### 1. Deployment (`deployment.yaml`)
The deployment ensures at least 2 replicas are always running.
- Probes: `livenessProbe` uses `/health`, while `readinessProbe` uses `/ready`, so traffic is only sent to pods whose database and shared dependencies are reachable.
- Resources: Sets explicit CPU and memory requests/limits to prevent resource contention.
- Metrics: Annotated for automatic Prometheus scraping.
- Security Context: Runs as a non-root user, drops Linux capabilities, and uses the runtime-default seccomp profile.
- Rollout Strategy: Uses a rolling update strategy with a `revisionHistoryLimit` so rollbacks stay available.
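Putting those points together, the relevant parts of the Deployment might look roughly like the sketch below. The names, port, and resource numbers are illustrative assumptions, not copied from `infra/k8s/deployment.yaml`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reliability-api
spec:
  replicas: 2
  revisionHistoryLimit: 5          # keeps old ReplicaSets for rollback
  strategy:
    type: RollingUpdate
  selector:
    matchLabels: { app: reliability-api }
  template:
    metadata:
      labels: { app: reliability-api }
      annotations:                 # enables automatic Prometheus scraping
        prometheus.io/scrape: "true"
        prometheus.io/path: /metrics
    spec:
      containers:
        - name: api
          image: ghcr.io/daretechie/api-reliability-suite:1.2.0
          livenessProbe:
            httpGet: { path: /health, port: 8000 }   # port is an assumption
          readinessProbe:
            httpGet: { path: /ready, port: 8000 }
          resources:
            requests: { cpu: 250m, memory: 256Mi }   # example values
            limits: { cpu: 500m, memory: 512Mi }
          securityContext:
            runAsNonRoot: true
            runAsUser: 10001
            capabilities: { drop: ["ALL"] }
            seccompProfile: { type: RuntimeDefault }
```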
### 2. Auto-scaling (`hpa.yaml`)
A HorizontalPodAutoscaler is configured to scale the API based on CPU utilization.
- Min/Max: Scales between 2 and 10 pods.
- Target: Triggers scaling when average CPU utilization hits 70%.
- Stability: Includes a `stabilizationWindowSeconds` of 300 to prevent "flapping" (rapid scaling up and down during minor load fluctuations).
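The settings above correspond to an `autoscaling/v2` manifest along these lines; the target name is assumed to match the Deployment:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: reliability-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: reliability-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU hits 70%
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 min before scaling down
```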
### Applying to Cluster
> [!NOTE]
> The templates expect secrets to be injected through Kubernetes Secrets (see `infra/k8s/deployment.yaml`), and shared environments should provide `DATABASE_URL`, `RATE_LIMIT_STORAGE_URI`, and `CIRCUIT_BREAKER_CACHE_URL`.
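A minimal apply sequence, assuming the secret keys listed above; the secret name and connection strings are placeholders:

```shell
# Create the runtime secret the Deployment expects (values are placeholders).
kubectl create secret generic reliability-api-secrets \
  --from-literal=DATABASE_URL='postgresql://user:pass@host:5432/db' \
  --from-literal=RATE_LIMIT_STORAGE_URI='redis://redis:6379/0' \
  --from-literal=CIRCUIT_BREAKER_CACHE_URL='redis://redis:6379/1'

# Apply the baseline manifests.
kubectl apply -f infra/k8s/deployment.yaml
kubectl apply -f infra/k8s/hpa.yaml
```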
## Release and Rollback Discipline
For shared environments, avoid mutable image tags such as `latest`.
Recommended release flow:
- Build and push a versioned image tag such as `ghcr.io/daretechie/api-reliability-suite:1.2.0`.
- Update `infra/k8s/deployment.yaml` to that immutable release tag.
- Apply the manifest and watch rollout health with `kubectl rollout status deployment/reliability-api`.
- If the rollout is unhealthy, roll back with `kubectl rollout undo deployment/reliability-api`.
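That flow condenses to a few commands; the registry path follows the example tag, and everything else is standard `kubectl` usage:

```shell
# Build and push an immutable release tag.
docker build -t ghcr.io/daretechie/api-reliability-suite:1.2.0 .
docker push ghcr.io/daretechie/api-reliability-suite:1.2.0

# After updating infra/k8s/deployment.yaml to the new tag:
kubectl apply -f infra/k8s/deployment.yaml
kubectl rollout status deployment/reliability-api

# Roll back if the rollout is unhealthy.
kubectl rollout undo deployment/reliability-api
```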
## Backups and Restore Notes
Once Postgres is your primary identity store, backup and restore need to be part of the deploy plan.
- Capture logical backups with `pg_dump -Fc`.
- Test restores with `pg_restore` into a fresh database before calling a backup policy complete.
- Run `poetry run alembic upgrade head` during deployment so schema state stays aligned with application code.
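A sketch of that backup-and-restore drill, assuming `$DATABASE_URL` points at the primary and a throwaway database is available for the restore test:

```shell
# Logical backup in custom format (compressed; supports selective restore).
pg_dump -Fc "$DATABASE_URL" -f backup.dump

# Restore into a fresh database to prove the backup is actually usable.
createdb restore_test
pg_restore -d restore_test backup.dump

# During deploys, bring the schema in line with the application code.
poetry run alembic upgrade head
```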
## 📊 Observability Stack
When deploying to a cluster, ensure the following services are available (or use the provided `docker-compose.yml` for local testing):
- Prometheus: Scrapes metrics from the `/metrics` endpoint.
- Alertmanager: Receives Prometheus alerts and keeps alert routing close to the metrics stack.
- Jaeger: Receives distributed traces via the OTLP exporter (if `OTLP_ENDPOINT` is configured).
- Grafana: Visualizes the "Golden Signals" (Latency, Errors, Traffic, Saturation).
## 🚀 CI/CD Integration
The provided Docker and K8s assets are designed to integrate seamlessly with CI/CD runners (GitHub Actions, GitLab CI).
- Build: Create the image using the multi-stage Dockerfile.
- Push: Push the image to your private registry (ECR, GCR, ACR).
- Deploy: Update the `image` field in `infra/k8s/deployment.yaml` and apply.
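As one hypothetical wiring of those three steps, a GitHub Actions job could look like the sketch below. The registry login step is omitted, the container name `api` and action versions are assumptions, and `kubectl set image` is used here as a shortcut instead of editing the manifest:

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t ghcr.io/daretechie/api-reliability-suite:${{ github.sha }} .
      - name: Push image
        run: docker push ghcr.io/daretechie/api-reliability-suite:${{ github.sha }}
      - name: Deploy
        run: |
          # SHA tags are immutable, matching the release discipline above.
          kubectl set image deployment/reliability-api \
            api=ghcr.io/daretechie/api-reliability-suite:${{ github.sha }}
          kubectl rollout status deployment/reliability-api
```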