
Resilience: hbf-lcg

Timeouts

| Component | Timeout | Config | Default |
| --- | --- | --- | --- |
| Redis microservice response | Configurable | MICROSERVICE_RESPONSE_TIMEOUT_MILLIS | 5000 ms |
| Redis microservice heartbeat | Configurable | MICROSERVICE_HEARTBEAT_TIMEOUT_MILLIS | 5000 ms |
| Cisco polling interval | Fixed | Hardcoded | 5000 ms |
| Genesys inactive session check | Fixed | Hardcoded | 10000 ms |
| Zendesk session expiration | Configurable | ZENDESK_MONITOR_CONVERSATION_EXPIRATION_SECONDS | 3600 s |
| Genesys inactive expiration | Configurable | GENESYS_MONITOR_CONVERSATION_EXPIRATION_SECONDS | 3600 s |
| Genesys pending expiration | Configurable | GENESYS_MONITOR_PENDING_EXPIRATION_SECONDS | 120 s |
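
To illustrate how the configurable timeouts could be read with their documented defaults, here is a minimal sketch. The `loadTimeoutConfig` and `readPositiveInt` helpers and the `TimeoutConfig` field names are hypothetical and not taken from the hbf-lcg codebase; only the environment variable names and default values come from the table.

```typescript
// Hypothetical loader: only the variable names and defaults come from the table.
type Env = Record<string, string | undefined>;

interface TimeoutConfig {
  microserviceResponseMs: number;
  microserviceHeartbeatMs: number;
  zendeskExpirationSec: number;
  genesysExpirationSec: number;
  genesysPendingExpirationSec: number;
}

// Parse a positive integer, falling back to the default when the variable
// is unset, empty, or not a positive integer.
function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isInteger(value) && value > 0 ? value : fallback;
}

function loadTimeoutConfig(env: Env): TimeoutConfig {
  return {
    microserviceResponseMs: readPositiveInt(env, "MICROSERVICE_RESPONSE_TIMEOUT_MILLIS", 5000),
    microserviceHeartbeatMs: readPositiveInt(env, "MICROSERVICE_HEARTBEAT_TIMEOUT_MILLIS", 5000),
    zendeskExpirationSec: readPositiveInt(env, "ZENDESK_MONITOR_CONVERSATION_EXPIRATION_SECONDS", 3600),
    genesysExpirationSec: readPositiveInt(env, "GENESYS_MONITOR_CONVERSATION_EXPIRATION_SECONDS", 3600),
    genesysPendingExpirationSec: readPositiveInt(env, "GENESYS_MONITOR_PENDING_EXPIRATION_SECONDS", 120),
  };
}
```

In a Node.js process this would typically be called as `loadTimeoutConfig(process.env)`. Passing the environment in as a plain object keeps the function easy to test.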

Fallback Strategies

| Scenario | Behavior |
| --- | --- |
| Leader node failure | Automatic failover via Redis heartbeat. Surviving instances detect missed heartbeat and elect new leader. |
| Expired sessions | GatewayCleaner (leader-only cron) removes sessions past expiration thresholds. Sessions silently cleaned. |
| Genesys WebSocket disconnect | GenesysSocketAdapterDistributedService coordinates reconnection across cluster via Redis RPC. |
| Pending session timeout | Genesys pending sessions not established within 120s are cleaned up. |
| Redis cache unavailable | Falls back to in-memory cache (configured via CACHE_REDIS_ENABLE). |
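
The expired-session and pending-timeout rows both come down to comparing a session's age against a threshold. The sketch below shows one way such a sweep could be written. It is not the actual GatewayCleaner implementation: the `Session` shape, `isExpired`, and `sweepSessions` are hypothetical names introduced for illustration, and the thresholds are the documented defaults.

```typescript
// Hypothetical session model; the real GatewayCleaner data structures may differ.
type SessionState = "pending" | "active";

interface Session {
  id: string;
  state: SessionState;
  lastActivityMs: number; // epoch ms of last activity (creation time for pending sessions)
}

interface ExpirationThresholds {
  inactiveSec: number; // e.g. 3600, the documented default
  pendingSec: number;  // e.g. 120, the documented default
}

// A session is expired once its age exceeds the threshold for its state.
function isExpired(session: Session, nowMs: number, t: ExpirationThresholds): boolean {
  const limitSec = session.state === "pending" ? t.pendingSec : t.inactiveSec;
  return nowMs - session.lastActivityMs > limitSec * 1000;
}

// Split sessions into those to keep and those to remove in this sweep.
function sweepSessions(
  sessions: Session[],
  nowMs: number,
  t: ExpirationThresholds
): { kept: Session[]; expired: Session[] } {
  const kept: Session[] = [];
  const expired: Session[] = [];
  for (const s of sessions) {
    (isExpired(s, nowMs, t) ? expired : kept).push(s);
  }
  return { kept, expired };
}
```

Returning the expired sessions rather than discarding them inside the sweep leaves room for a caller to log or report them.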

Health Checks

None. No /health endpoint exists.

Retry Policies

No explicit retry policies on outbound HTTP calls. HttpClientService wraps Axios with logging but does not retry on failure.
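
For contrast, a bounded retry with exponential backoff could look like the sketch below. Nothing like this exists in HttpClientService today; `withRetry`, `backoffDelayMs`, and `RetryOptions` are hypothetical names, and the wrapper is written against a generic async function rather than Axios so that it stays self-contained.

```typescript
// Hypothetical retry helper; not part of HttpClientService.
interface RetryOptions {
  maxAttempts: number; // total attempts including the first call, must be >= 1
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number;  // upper bound on any single delay
}

// Delay to wait after the given 1-based attempt fails: base * 2^(attempt - 1), capped.
function backoffDelayMs(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt - 1));
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Run `call` until it succeeds or maxAttempts is reached, then rethrow the last error.
async function withRetry<T>(call: () => Promise<T>, opts: RetryOptions): Promise<T> {
  if (opts.maxAttempts < 1) throw new RangeError("maxAttempts must be >= 1");
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      if (attempt < opts.maxAttempts) {
        await sleep(backoffDelayMs(attempt, opts));
      }
    }
  }
  throw lastError;
}
```

A wrapper like this is only safe around idempotent requests, since a retried call may have already taken effect on the remote side before the failure was reported.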

Known Gaps

  1. No health endpoint: No way for orchestrators to check service liveness.
  2. No per-call timeouts on polling adapters: Cisco and Zendesk polling adapters have interval timers but no per-request timeout on individual HTTP calls.
  3. No retry cap on Genesys WebSocket reconnection: Reconnection attempts have no maximum limit.
  4. Silent session expiry: Expired sessions are cleaned up without notifying the caller (hbf-bot). The bot may not know a session ended.
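
As an illustration of what closing gap 2 might involve, the sketch below bounds how long a caller waits on a single asynchronous call. `withTimeout` and `TimeoutError` are hypothetical names, not existing hbf-lcg code, and the 5000 ms figure in the usage note is only an example value.

```typescript
// Hypothetical helper; not part of the Cisco or Zendesk polling adapters.
class TimeoutError extends Error {
  constructor(ms: number) {
    super("operation timed out after " + ms + " ms");
    this.name = "TimeoutError"; // check `name` rather than instanceof on ES5 targets
  }
}

// Reject with TimeoutError if `work` has not settled within `ms` milliseconds.
// This only stops waiting; it does not cancel the underlying operation.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  return Promise.race([work, timeout]).then(
    (value) => {
      clearTimeout(timer); // avoid leaving a stray timer after success
      return value;
    },
    (err) => {
      clearTimeout(timer);
      throw err;
    }
  );
}
```

A polling adapter could wrap each request as `withTimeout(fetchUpdates(), 5000)`, where `fetchUpdates` stands in for whatever call the adapter makes. Because the service already uses Axios, setting the `timeout` option in the Axios request config is likely the simpler route for HTTP calls, as it also aborts the request itself.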