Resilience: hbf-lcg

Timeouts

Component	Type	Config	Default
Redis microservice response	Configurable	`MICROSERVICE_RESPONSE_TIMEOUT_MILLIS`	5000ms
Redis microservice heartbeat	Configurable	`MICROSERVICE_HEARTBEAT_TIMEOUT_MILLIS`	5000ms
Cisco polling interval	Fixed	Hardcoded	5000ms
Genesys inactive session check	Fixed	Hardcoded	10000ms
Zendesk session expiration	Configurable	`ZENDESK_MONITOR_CONVERSATION_EXPIRATION_SECONDS`	3600s
Genesys inactive expiration	Configurable	`GENESYS_MONITOR_CONVERSATION_EXPIRATION_SECONDS`	3600s
Genesys pending expiration	Configurable	`GENESYS_MONITOR_PENDING_EXPIRATION_SECONDS`	120s
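
The configurable rows above could be read from the environment roughly as follows. This is a minimal sketch, not the service's actual loader: the variable names and defaults come from the table, while `Env`, `TimeoutConfig`, `readPositiveInt`, and `loadTimeoutConfig` are hypothetical names introduced for illustration.

```typescript
// Hypothetical loader. Variable names and defaults are taken from the table;
// every type and function name here is an illustrative assumption.
type Env = Record<string, string | undefined>;

interface TimeoutConfig {
  microserviceResponseTimeoutMs: number;
  microserviceHeartbeatTimeoutMs: number;
  zendeskConversationExpirationSec: number;
  genesysConversationExpirationSec: number;
  genesysPendingExpirationSec: number;
}

// Parse a positive integer, falling back to the default when the variable
// is unset, blank, or not a positive integer.
function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadTimeoutConfig(env: Env): TimeoutConfig {
  return {
    microserviceResponseTimeoutMs: readPositiveInt(env, "MICROSERVICE_RESPONSE_TIMEOUT_MILLIS", 5000),
    microserviceHeartbeatTimeoutMs: readPositiveInt(env, "MICROSERVICE_HEARTBEAT_TIMEOUT_MILLIS", 5000),
    zendeskConversationExpirationSec: readPositiveInt(env, "ZENDESK_MONITOR_CONVERSATION_EXPIRATION_SECONDS", 3600),
    genesysConversationExpirationSec: readPositiveInt(env, "GENESYS_MONITOR_CONVERSATION_EXPIRATION_SECONDS", 3600),
    genesysPendingExpirationSec: readPositiveInt(env, "GENESYS_MONITOR_PENDING_EXPIRATION_SECONDS", 120),
  };
}

// Example: override only the pending expiration, keep the other defaults.
const exampleConfig = loadTimeoutConfig({ GENESYS_MONITOR_PENDING_EXPIRATION_SECONDS: "300" });
```

In a Node.js process the argument would typically be `process.env`. Taking the environment as a parameter keeps the function easy to test.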

Scenario	Behavior
Leader node failure	Automatic failover via Redis heartbeat. Surviving instances detect the missed heartbeat and elect a new leader.
Expired sessions	GatewayCleaner (leader-only cron) removes sessions past their expiration thresholds. Sessions are cleaned up silently.
Genesys WebSocket disconnect	GenesysSocketAdapterDistributedService coordinates reconnection across cluster via Redis RPC.
Pending session timeout	Genesys pending sessions not established within `GENESYS_MONITOR_PENDING_EXPIRATION_SECONDS` (default 120s) are cleaned up.
Redis cache unavailable	Falls back to in-memory cache (configured via `CACHE_REDIS_ENABLE`).
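
The expired-session and pending-session rows can be pictured as a periodic sweep over stored sessions. The sketch below assumes a hypothetical in-memory record shape. GatewayCleaner's real storage and field names are not described here, so `SessionRecord`, `ExpirationPolicy`, `isExpired`, and `sweepSessions` are all illustrative.

```typescript
// Hypothetical session shape; the real stored fields may differ.
type SessionState = "pending" | "active";

interface SessionRecord {
  id: string;
  state: SessionState;
  lastActivityMs: number; // epoch millis of the last observed activity (assumed field)
}

interface ExpirationPolicy {
  conversationExpirationSec: number; // 3600 by default
  pendingExpirationSec: number; // 120 by default
}

// A session is expired once it has been idle longer than the limit for its state.
function isExpired(session: SessionRecord, nowMs: number, policy: ExpirationPolicy): boolean {
  const limitSec =
    session.state === "pending" ? policy.pendingExpirationSec : policy.conversationExpirationSec;
  return nowMs - session.lastActivityMs > limitSec * 1000;
}

// Split sessions into those to keep and those to remove in this sweep.
function sweepSessions(
  sessions: SessionRecord[],
  nowMs: number,
  policy: ExpirationPolicy,
): { kept: SessionRecord[]; expired: SessionRecord[] } {
  const kept: SessionRecord[] = [];
  const expired: SessionRecord[] = [];
  for (const session of sessions) {
    (isExpired(session, nowMs, policy) ? expired : kept).push(session);
  }
  return { kept, expired };
}
```

Returning the expired list, rather than deleting in place, leaves room for a caller to log or announce each removal.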

Health checks: none. No `/health` endpoint exists.

No explicit retry policies on outbound HTTP calls. HttpClientService wraps Axios with logging but does not retry on failure.
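
If retries were added, one common shape is a bounded retry with exponential backoff wrapped around the outbound call. The following is a sketch of that general technique, not something HttpClientService does today. `RetryOptions`, `backoffDelayMs`, and `withRetry` are hypothetical names, and the operation is passed in as a plain async function so the example does not depend on Axios.

```typescript
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first call
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number; // upper bound on any single delay
}

// Delay to wait after the given 1-based attempt has failed:
// baseDelayMs, then double each time, capped at maxDelayMs.
function backoffDelayMs(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt - 1));
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Run the operation until it succeeds or maxAttempts is reached,
// then rethrow the last error.
async function withRetry<T>(operation: () => Promise<T>, opts: RetryOptions): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < opts.maxAttempts) {
        await sleep(backoffDelayMs(attempt, opts));
      }
    }
  }
  throw lastError;
}
```

A wrapper like this is only safe around idempotent requests, since a call that failed after reaching the remote system would otherwise be applied twice. The same attempt cap could bound reconnection loops.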

No health endpoint: No way for orchestrators to check service liveness.
No per-call timeouts on polling adapters: Cisco and Zendesk polling adapters have interval timers but no per-request timeout on individual HTTP calls.
No retry cap on Genesys WebSocket reconnection: Reconnection attempts have no maximum limit.
Silent session expiry: Expired sessions are cleaned up without notifying the caller (hbf-bot). The bot may not know a session ended.
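
The missing per-call timeout on the polling adapters could be closed with a small wrapper that rejects when a request takes too long. This is a sketch under assumptions: `withTimeout` is a hypothetical helper, and nothing here reflects how the Cisco or Zendesk adapters are actually written.

```typescript
// Reject if the wrapped promise does not settle within `ms` milliseconds.
// Note: this stops waiting but does not cancel the underlying work.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    work.then(
      (value) => {
        clearTimeout(timer); // settled in time: release the timer
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

A poll could then be written as `await withTimeout(pollOnce(), 4000)`, where `pollOnce` stands in for the adapter's request and the limit is kept below the 5000ms polling interval so calls do not overlap. Axios also accepts a `timeout` option in its request config, which may be the simpler route for calls that go through HttpClientService.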