Skip to main content

Resilience: hbf-session-manager

Error handling and retry patterns for this service. Platform-wide patterns: docs/architecture/resilience.md

HTTP Retry

  • Library: got (v11.8.5)
  • Attempts: Configurable via options.retries (trainOne method only)
  • Backoff: None (immediate retry, no delay between attempts)
  • On failure: Returns false if all retries exhausted

Queue Retry (if applicable)

N/A

Timeouts

CallTimeoutConfigured in
POST/PATCH requestsSESSION_SERVICE_REQUEST_TIMEOUT env var or 5000ms defaultEnvironment / service config
GET requestsMay have no timeout (this.timeout may be unset)Service config
NLP pipeline pollingNLP_PIPELINE_POLL_TIMEOUT_IN_SECS (default 360s / 6 min), poll interval 2sEnvironment config

Circuit Breakers

None.

Fallback Strategy

Failure scenarioBehaviourUser impact
NLP training failureReturns false after retries exhaustedTraining silently fails, caller must handle
NLP polling timeoutThrows TimeoutErrorCaller receives error, must handle
HTTP errorsLogged and re-thrownError propagated to caller

Known Gaps

  • GET requests may have no timeout, risking indefinite hangs.
  • Manual retry loop has no backoff delay (thundering herd risk under load).
  • No health endpoint.
  • No circuit breaker for downstream dependencies.