
6. Non-Functional Requirements

Document status: Draft v0.1
Depends on: sec2_architecture.md, sec3_api_unification.md, sec4_auth.md, sec5_sync.md
Feeds into: Implementation, deployment docs, performance benchmarking plan


6.1 Performance

6.1.1 Target Workloads

Bridge is designed for small-to-medium multi-user research deployments. Performance targets assume Bridge is the single gateway for all component traffic.

| Workload profile | Concurrent users | Requests/sec | Typical operation |
|---|---|---|---|
| Small team | 2–10 | < 20 req/s | Interactive Aperture queries, occasional pipeline runs |
| Medium lab | 10–50 | < 100 req/s | Batch ingestion + active Aperture users |
| Enterprise | 50+ | 100+ req/s | Kubernetes deployment; horizontal scaling |

6.1.2 Latency Targets

Bridge adds overhead on top of the component's own response time. The target is that Bridge's routing and auth enforcement cost no more than 15ms p99 on the hot path.

| Operation | Bridge overhead target | Notes |
|---|---|---|
| JWT validation (cached key) | < 2 ms p99 | PyJWT signature check; public key cached in memory |
| API key lookup | < 5 ms p99 | SQLite indexed lookup by hashed key |
| RBAC check | < 1 ms p99 | In-memory role/permission table |
| Request routing + header injection | < 5 ms p99 | Path parsing + header write |
| Total Bridge overhead (hot path) | < 15 ms p99 | Cumulative; excludes component response time |

Upstream latency budget: Bridge adds at most 15 ms of overhead. If an end-to-end response-latency target must be met, the component owns the remainder of the budget.
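The API key lookup in the table above (a SQLite lookup indexed by hashed key) can be sketched as follows. The SHA-256 hash and the `api_keys` schema are assumptions for illustration, not the actual Bridge schema:

```python
import hashlib
import sqlite3

def hash_key(plaintext: str) -> str:
    # Bridge stores only the hash; the plaintext is shown once at creation.
    return hashlib.sha256(plaintext.encode()).hexdigest()

def lookup_api_key(conn: sqlite3.Connection, presented: str):
    # Indexed lookup by key_hash (the primary key) keeps this well
    # inside the < 5 ms p99 budget for small key tables.
    return conn.execute(
        "SELECT actor, role FROM api_keys WHERE key_hash = ?",
        (hash_key(presented),),
    ).fetchone()  # None => invalid key
```

Hashing the presented key before the lookup means a leaked database never reveals usable credentials, matching the "API key plaintext: never stored" rule in §6.3.1.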

6.1.3 Health Check Latency

GET /api/v1/bridge/health should respond within 200ms. Results are cached with a 5-second TTL. A cold health check (cache miss) contacts all components in parallel; the response time is the max of component probe latencies plus Bridge overhead.
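The caching behavior described above, a 5-second TTL with parallel cold-path probes, can be sketched as follows; the probe signature and cache shape are illustrative assumptions, not Bridge's implementation:

```python
import asyncio
import time

class HealthCache:
    def __init__(self, probes, ttl: float = 5.0):
        self.probes = probes        # component name -> async probe callable
        self.ttl = ttl
        self._cached = None
        self._stamp = 0.0

    async def status(self) -> dict:
        now = time.monotonic()
        if self._cached is not None and now - self._stamp < self.ttl:
            return self._cached     # warm path: no component traffic
        # Cold path: probe all components in parallel, so the response
        # time is the max of the probe latencies, not their sum.
        results = await asyncio.gather(*(p() for p in self.probes.values()))
        self._cached = dict(zip(self.probes, results))
        self._stamp = now
        return self._cached
```

A production version would also bound each probe with a per-component timeout so one slow component cannot push the endpoint past the 200 ms target.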

6.1.4 Sync Engine Throughput

Post-run consistency checks run asynchronously and must not block the request path. Full periodic consistency scans run at low priority (configurable: scan_priority: low) and are rate-limited to avoid overwhelming component REST APIs.


6.2 Availability and Reliability

6.2.1 Availability Target

Bridge itself targets 99.5% monthly uptime for team-server deployments. This allows for ~3.6 hours of downtime per month for maintenance.

Bridge is a single-process gateway in v0.1. High availability (multiple Bridge instances behind a load balancer) is supported via Kubernetes deployment but not required for the target workloads.

6.2.2 Component Unavailability Handling

When a component is unavailable:

  • Bridge returns 503 component_unavailable immediately (no hang).
  • Requests that do not require the unavailable component continue to be served normally.
  • Bridge health endpoint reflects degraded or unhealthy status for the affected component.
  • Sync events involving the unavailable component are retried with exponential backoff (base: 5s, max: 5min, max attempts: 10 before marking stale).
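The retry schedule above (base 5 s, cap 5 min, 10 attempts before marking stale) corresponds to a standard exponential backoff, sketched here as a pure function:

```python
def backoff_delay(attempt: int, base: float = 5.0, cap: float = 300.0) -> float:
    """Delay before the given retry attempt (1-based).

    attempt 1 -> 5s, 2 -> 10s, 3 -> 20s, ... capped at 300s (5 min).
    After 10 attempts the sync event is marked stale rather than retried.
    """
    return min(base * 2 ** (attempt - 1), cap)
```

Uncapped, attempt 7 would already be 320 s, so in practice attempts 7 through 10 all wait the full 5 minutes. A real scheduler would typically add jitter to avoid synchronized retry storms; the spec does not mandate it, so it is omitted here.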

6.2.3 Graceful Shutdown

Bridge supports graceful shutdown: in-flight requests are completed before the process exits. Drain timeout is configurable (default: 30 seconds). Outstanding sync checks are checkpointed to the sync event log before shutdown.

6.2.4 Token Store Durability

The token store (refresh tokens, revocation records) uses SQLite in WAL mode with synchronous = NORMAL by default. This matches Hippo's storage durability posture. Production deployments should use PostgreSQL for the token store.


6.3 Security

6.3.1 Secrets Management

| Secret | Storage | Notes |
|---|---|---|
| JWT signing key | Env var or file path (bridge.yaml reference) | Never logged |
| OIDC client secret | Env var | Never logged |
| API key plaintext | Never stored | Only shown once at creation; Bridge stores the hash |
| Token store password (PostgreSQL) | Env var | Connection string via ${BRIDGE_TOKEN_DB} |

Bridge never logs credential material. Request logging redacts the Authorization header value.
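The redaction rule above can be sketched as a small header filter applied before any request log line is emitted; the placeholder string and function name are illustrative:

```python
def redact_headers(headers: dict) -> dict:
    # Mask the Authorization value (bearer token or API key) before
    # the headers reach the structured log. Header names are
    # case-insensitive per HTTP, hence the .lower() comparison.
    return {
        name: ("[REDACTED]" if name.lower() == "authorization" else value)
        for name, value in headers.items()
    }
```

A stricter variant would also redact `Cookie` and any `X-*-Token` headers; the spec only mandates the Authorization value.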

6.3.2 Network Exposure

  • Bridge listens on 0.0.0.0 by default; this should be restricted to the LAN interface in team-server deployments.
  • Components should be bound to 127.0.0.1 or a private Docker/Kubernetes network; their ports should not be exposed externally.
  • TLS is expected to be terminated at a reverse proxy (nginx, Caddy, AWS ALB) in front of Bridge. Bridge can be configured to terminate TLS directly if needed.

6.3.3 Audit Log Integrity

Audit log entries must not be modifiable or deletable via the Bridge API. The audit log backend (file, PostgreSQL) should be configured for append-only access in production. Audit log events include:

  • Auth events: login, token refresh, token revocation, API key creation/revocation
  • Request events: actor, method, path, response status, latency (for non-200 responses)
  • Sync events: mismatch detected, repair attempted, repair outcome

6.4 Scalability

6.4.1 Horizontal Scaling

Bridge is stateless on the request path (auth state lives in the token store, not in the memory of a specific Bridge instance). Multiple Bridge instances can run behind a load balancer provided they share the same token store and API key database (PostgreSQL).

6.4.2 Token Store Scaling

SQLite is sufficient for deployments with fewer than 50 active users and low token churn. PostgreSQL is required when:

  • Multiple Bridge instances share state
  • Token volume exceeds ~1,000 active refresh tokens
  • Audit log volume exceeds the capacity of a single append-only file

6.4.3 Rate Limiting Behavior

Rate limiting in v0.1 is per-instance (in-memory). When multiple Bridge instances run behind a load balancer, the effective rate limit is per_actor_rps × instance_count. Distributed rate limiting (Redis-backed) is deferred to v1.1.
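The per-instance, in-memory limiter described above can be sketched as a per-actor token bucket. The `burst` parameter and class shape are assumptions; the spec names only `per_actor_rps`:

```python
import time
from collections import defaultdict

class ActorRateLimiter:
    """Per-instance limiter: with N Bridge instances behind a load
    balancer, the effective limit is per_actor_rps * N, as noted above."""

    def __init__(self, per_actor_rps: float, burst: int = 10):
        self.rate = per_actor_rps
        self.burst = burst
        # actor -> (available tokens, last refill timestamp)
        self._buckets = defaultdict(lambda: (float(burst), time.monotonic()))

    def allow(self, actor: str) -> bool:
        tokens, last = self._buckets[actor]
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self._buckets[actor] = (tokens, now)
            return False
        self._buckets[actor] = (tokens - 1.0, now)
        return True
```

Because the buckets live in process memory, they are exactly what a Redis-backed v1.1 implementation would move to shared storage.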


6.5 Observability

6.5.1 Logging

All Bridge logs are structured JSON, emitted to stdout. Log levels:

| Level | Usage |
|---|---|
| INFO | Request accepted, auth OK, sync check completed |
| WARNING | Rate limit hit, component slow response, sync mismatch |
| ERROR | Component unreachable, token store write failure, sync repair failure |
| DEBUG | Detailed JWT claim inspection, routing decisions (disabled in production) |
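Structured JSON logging to stdout can be achieved with a custom formatter on the stdlib logger; the field names below are illustrative, not Bridge's actual log schema:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    # Emit one JSON object per log line, suitable for ingestion by
    # log collectors reading the container's stdout.
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

def make_logger(name: str = "bridge") -> logging.Logger:
    handler = logging.StreamHandler()   # defaults to stderr; use stdout in prod
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A production formatter would also include a timestamp, request ID, and actor, and serialize `record.exc_info` for ERROR lines.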

6.5.2 Metrics

Bridge emits Prometheus-compatible metrics at GET /api/v1/bridge/metrics:

| Metric | Type | Description |
|---|---|---|
| bridge_requests_total | Counter | Total requests by method, path prefix, status |
| bridge_request_duration_seconds | Histogram | Request latency by method, path prefix |
| bridge_auth_failures_total | Counter | Auth failures by type (invalid_token, expired, etc.) |
| bridge_component_health | Gauge | 1 = healthy, 0 = unhealthy, per component |
| bridge_sync_mismatches_total | Counter | Sync mismatches by type |
| bridge_active_tokens | Gauge | Active refresh token count |

6.5.3 Health Checks

GET /api/v1/bridge/health — JSON response (see sec3 §3.6 for schema). HTTP status: 200 OK (healthy/degraded), 503 Service Unavailable (unhealthy).

Kubernetes liveness probe: GET /api/v1/bridge/health with a 5-second timeout. Kubernetes readiness probe: same endpoint; Bridge is ready when all critical components respond within their SLO.


6.6 Maintainability

6.6.1 Configuration Validation

Bridge validates bridge.yaml on startup and rejects invalid configuration with a descriptive error. It does not silently use defaults for required fields.
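The fail-fast behavior can be sketched as a required-field check run before the server binds its port. The field names here are hypothetical, chosen only to illustrate the error shape:

```python
# Hypothetical required keys; the real set lives in Bridge's config schema.
REQUIRED_FIELDS = ("listen_addr", "token_store_url", "jwt_signing_key")

def validate_config(cfg: dict) -> None:
    # Reject with a descriptive error rather than silently defaulting
    # required fields, per §6.6.1.
    missing = [f for f in REQUIRED_FIELDS if f not in cfg]
    if missing:
        raise ValueError(
            "bridge.yaml missing required fields: " + ", ".join(missing)
        )
```

In practice this role is usually filled by a schema library (e.g. pydantic settings models), which also validates types and value ranges, not just presence.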

6.6.2 Dependency Constraints

Bridge's Python dependencies are pinned in pyproject.toml. The package list is minimal:

| Package | Purpose |
|---|---|
| fastapi | HTTP server |
| uvicorn | ASGI runner |
| httpx | Async HTTP proxy client |
| pyjwt | JWT signing and verification |
| cryptography | RS256 key handling |
| sqlalchemy | ORM for token/key stores |

Bridge does not depend on Hippo, Cappella, or Canon as Python packages.

6.6.3 Test Coverage

Unit test targets:

| Module | Target coverage |
|---|---|
| Auth middleware (JWT, API key) | ≥ 90% |
| RBAC enforcer | ≥ 90% |
| Request router + header injection | ≥ 85% |
| Sync engine consistency checks | ≥ 80% |
| Audit log writer | ≥ 80% |

Integration test requirements:

  • Full request lifecycle with real Hippo (SQLite) as upstream
  • Auth failure cases: invalid token, expired token, revoked token, insufficient role
  • API key create/rotate/revoke lifecycle
  • Sync mismatch detection with mocked Cappella run response

6.7 Deployment

6.7.1 Installation

pip install bass-bridge          # PyPI
# or
uv add bass-bridge               # via uv (preferred)

6.7.2 Minimum System Requirements

| Resource | Minimum | Recommended (team) |
|---|---|---|
| CPU | 1 core | 2 cores |
| RAM | 256 MB | 512 MB |
| Disk (token/key store) | 100 MB | 1 GB |
| Python | 3.11+ | 3.13 |

6.7.3 Startup Command

bass-bridge serve --config bridge.yaml
# or
uvicorn bridge.server:app --host 0.0.0.0 --port 8000 --workers 4

6.7.4 Database Initialization

bass-mgr bridge db init --config bridge.yaml

Creates token store and API key tables. Safe to re-run (idempotent).
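The idempotency comes from guarded DDL: re-running the statements is a no-op when the tables already exist. A minimal sketch with a hypothetical schema (the real table definitions live in Bridge's migrations):

```python
import sqlite3

# Hypothetical DDL; CREATE TABLE IF NOT EXISTS makes re-runs safe.
DDL = """
CREATE TABLE IF NOT EXISTS refresh_tokens (
    token_hash TEXT PRIMARY KEY,
    actor      TEXT NOT NULL,
    expires_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS api_keys (
    key_hash TEXT PRIMARY KEY,
    actor    TEXT NOT NULL,
    role     TEXT NOT NULL
);
"""

def init_db(conn: sqlite3.Connection) -> None:
    conn.executescript(DDL)
```

Schema *changes* (as opposed to initial creation) still require versioned migrations; `IF NOT EXISTS` only protects the create path.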


6.8 Open Questions

| Question | Priority | Status |
|---|---|---|
| Should Bridge support zero-downtime config reload (SIGHUP) without restart? | Medium | Open |
| Distributed rate limiting via Redis — is it needed before v1.1? | Medium | Open; depends on horizontal scaling adoption speed |
| Audit log retention policy — how long should auth events be kept, and who is responsible for rotation? | High | Open — likely institution-dependent; document recommended minimums |