
Resilience: hbf-data-retention

Error handling and retry patterns for this service. Platform-wide patterns: docs/architecture/resilience.md

HTTP Retry

  • Library: hbf-core-api (inherited, uses axios)
  • Attempts: 3 in hbf-core-api (network errors only); application-level retry count set by the THRESHOLD_OF_DELETION_RETIRES config
  • Backoff: Exponential in hbf-core-api (network errors only); none at the application level (deletion retries run sequentially with no backoff between them)
  • On failure: Failed org/tenant IDs are collected and retried up to THRESHOLD_OF_DELETION_RETIRES times; permanently failed items are logged and execution continues
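
The application-level behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: `deleteWithRetries`, `deleteOne`, and `maxRetries` are hypothetical names, `maxRetries` stands in for THRESHOLD_OF_DELETION_RETIRES, and the sketch assumes "retried up to N times" means one initial attempt plus up to N retry passes.

```typescript
// Hypothetical sketch: names and structure are assumptions, not the service's code.
type DeleteFn = (id: string) => Promise<void>;

interface RetryResult {
  permanentlyFailed: string[]; // IDs that still failed after the last pass
  passes: number;              // initial pass plus retry passes actually run
}

async function deleteWithRetries(
  ids: string[],
  deleteOne: DeleteFn,
  maxRetries: number, // stands in for THRESHOLD_OF_DELETION_RETIRES
): Promise<RetryResult> {
  let pending = ids;
  let passes = 0;
  // One initial pass plus up to maxRetries retry passes, with no backoff between them.
  while (pending.length > 0 && passes <= maxRetries) {
    const failed: string[] = [];
    for (const id of pending) {
      try {
        await deleteOne(id);
      } catch {
        failed.push(id); // collect the failed ID for the next pass
      }
    }
    pending = failed;
    passes += 1;
  }
  // Whatever is left is treated as permanently failed: log it and carry on.
  for (const id of pending) {
    console.error(`deletion permanently failed for ${id}`);
  }
  return { permanentlyFailed: pending, passes };
}
```

Because the function returns instead of throwing, a caller can continue with the next unit of work even when some IDs never succeed, which matches the "execution continues" behaviour.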

Queue Retry (if applicable)

N/A

Timeouts

| Call | Timeout | Configured in |
| --- | --- | --- |
| hbf-core-api calls | Not set (axios default: no timeout) | N/A |
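
For illustration, a deadline could be imposed from the calling side with a small wrapper like the one below. This is a sketch under assumptions: `withTimeout` is a hypothetical helper, not part of hbf-core-api, and it only stops waiting; it does not cancel the underlying request. axios itself accepts a `timeout` value in milliseconds in its request config (default 0, meaning no timeout), which would be the more direct fix if hbf-core-api exposes that setting; whether it does is not stated here.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms` milliseconds.
// Note: the underlying operation keeps running; only the caller stops waiting.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      const err = new Error(`operation timed out after ${ms} ms`);
      err.name = "TimeoutError";
      reject(err);
    }, ms);
    work.then(
      (value) => {
        clearTimeout(timer); // settled in time: cancel the deadline
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

A call site would wrap each outbound call, for example `withTimeout(someCoreApiCall(), 10000)`, where `someCoreApiCall` is a placeholder for whichever hbf-core-api method is being invoked.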

Circuit Breakers

None.

Fallback Strategy

| Failure scenario | Behaviour | User impact |
| --- | --- | --- |
| Deletion failure for org/tenant | Failed IDs collected and retried up to THRESHOLD_OF_DELETION_RETIRES times | Deletion delayed but eventually attempted |
| Permanent deletion failure | Logged, execution continues (runs in infinite loop) | Stale data persists until manual intervention |
| Uncaught error | Logged at ERROR level, loop restarts | Service self-recovers but may miss deletion window |
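
The "uncaught error" row above implies an outer loop that catches anything a deletion pass throws, logs it, and starts the next pass. A minimal sketch of that shape, assuming hypothetical names (`runLoop`, `runOnce`, `Logger`), is shown below. The `shouldContinue` hook is an addition for illustration so the loop can be exercised in a test; leaving it at its default reproduces the infinite loop. Any pause between passes is omitted because this page does not describe one.

```typescript
// Hypothetical sketch: the real service's loop is not shown in this document.
type Logger = { error: (msg: string, err: unknown) => void };

async function runLoop(
  runOnce: () => Promise<void>,
  logger: Logger,
  shouldContinue: () => boolean = () => true, // default: loop forever
): Promise<number> {
  let iterations = 0;
  while (shouldContinue()) {
    try {
      await runOnce();
    } catch (err) {
      // Uncaught error from a pass: log at ERROR level, then fall through
      // so the next iteration restarts the work.
      logger.error("deletion pass failed", err);
    }
    iterations += 1;
  }
  return iterations;
}
```

The same hook is one place where a shutdown check could later be attached, since the loop re-evaluates it before every pass.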

Known Gaps

  • No timeout on hbf-core-api calls (requests may hang indefinitely).
  • No circuit breaker (cascading failures possible if hbf-core is down).
  • No DLQ for permanently failed deletions (only logged, no structured tracking).
  • Application-level retries run sequentially with no backoff (exponential or otherwise) between attempts.
  • No graceful shutdown mechanism.
  • No health endpoint.
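
As an illustration of how the circuit-breaker gap listed above might be closed, the sketch below wraps outbound calls and stops issuing them for a cooling-off period after repeated failures. Everything here is an assumption: `CircuitBreaker`, `failureThreshold`, and `resetAfterMs` are hypothetical names, the thresholds are placeholders, and the sketch assumes calls are made one at a time (it does not guard against concurrent trial calls).

```typescript
// Hypothetical sketch of a minimal circuit breaker; not part of hbf-core-api.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";

  constructor(
    private readonly failureThreshold: number, // consecutive failures before opening
    private readonly resetAfterMs: number,     // how long to stay open before a trial call
    private readonly now: () => number = Date.now, // injectable clock for testing
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.resetAfterMs) {
        throw new Error("circuit open: call skipped");
      }
      this.state = "half-open"; // cooling-off elapsed: allow one trial call
    }
    try {
      const result = await fn();
      this.failures = 0;
      this.state = "closed"; // any success closes the circuit
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err;
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```

While the circuit is open, calls fail fast without reaching hbf-core, which limits the cascading-failure risk when the downstream service is unavailable.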