Resilience: hbf-lcg
Timeouts
| Component | Type | Config variable | Default |
|---|---|---|---|
| Redis microservice response | Configurable | MICROSERVICE_RESPONSE_TIMEOUT_MILLIS | 5000ms |
| Redis microservice heartbeat | Configurable | MICROSERVICE_HEARTBEAT_TIMEOUT_MILLIS | 5000ms |
| Cisco polling interval | Fixed | Hardcoded | 5000ms |
| Genesys inactive session check | Fixed | Hardcoded | 10000ms |
| Zendesk session expiration | Configurable | ZENDESK_MONITOR_CONVERSATION_EXPIRATION_SECONDS | 3600s |
| Genesys inactive expiration | Configurable | GENESYS_MONITOR_CONVERSATION_EXPIRATION_SECONDS | 3600s |
| Genesys pending expiration | Configurable | GENESYS_MONITOR_PENDING_EXPIRATION_SECONDS | 120s |
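A minimal sketch of how the configurable values above could be read with their documented defaults. The variable names and defaults come from the table; the function names, the `TimeoutConfig` shape, and the fall-back-on-invalid-input behavior are assumptions for illustration, not the service's actual loader.

```typescript
// Hypothetical loader. Only the variable names and defaults are taken from
// the table above; everything else is illustrative.
type Env = { [name: string]: string | undefined };

interface TimeoutConfig {
  microserviceResponseTimeoutMillis: number;
  microserviceHeartbeatTimeoutMillis: number;
  zendeskConversationExpirationSeconds: number;
  genesysConversationExpirationSeconds: number;
  genesysPendingExpirationSeconds: number;
}

// Parse a positive integer. Falls back to the documented default when the
// variable is unset, empty, or not a positive integer (assumed behavior).
function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const parsed = Number(raw);
  const isPositiveInt = isFinite(parsed) && Math.floor(parsed) === parsed && parsed > 0;
  return isPositiveInt ? parsed : fallback;
}

function loadTimeoutConfig(env: Env): TimeoutConfig {
  return {
    microserviceResponseTimeoutMillis: readPositiveInt(env, "MICROSERVICE_RESPONSE_TIMEOUT_MILLIS", 5000),
    microserviceHeartbeatTimeoutMillis: readPositiveInt(env, "MICROSERVICE_HEARTBEAT_TIMEOUT_MILLIS", 5000),
    zendeskConversationExpirationSeconds: readPositiveInt(env, "ZENDESK_MONITOR_CONVERSATION_EXPIRATION_SECONDS", 3600),
    genesysConversationExpirationSeconds: readPositiveInt(env, "GENESYS_MONITOR_CONVERSATION_EXPIRATION_SECONDS", 3600),
    genesysPendingExpirationSeconds: readPositiveInt(env, "GENESYS_MONITOR_PENDING_EXPIRATION_SECONDS", 120),
  };
}
```

In a Node.js process this would typically be called once at startup as `loadTimeoutConfig(process.env)`.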
Fallback Strategies
| Scenario | Behavior |
|---|---|
| Leader node failure | Automatic failover via Redis heartbeat. Surviving instances detect the missed heartbeat and elect a new leader. |
| Expired sessions | GatewayCleaner (leader-only cron) removes sessions that are past their expiration thresholds. Sessions are removed silently, without notifying the caller. |
| Genesys WebSocket disconnect | GenesysSocketAdapterDistributedService coordinates reconnection across cluster via Redis RPC. |
| Pending session timeout | Genesys pending sessions not established within the pending expiration window (GENESYS_MONITOR_PENDING_EXPIRATION_SECONDS, 120s by default) are cleaned up. |
| Redis cache unavailable | Falls back to in-memory cache (configured via CACHE_REDIS_ENABLE). |
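The expiration rules for Genesys sessions can be sketched as a pure selection step. The session shape, the field names, and the `findExpiredSessions` function are hypothetical; this document does not show how GatewayCleaner models sessions, so treat this as an illustration of the thresholds rather than the actual implementation.

```typescript
// Hypothetical session model, for illustration only.
type SessionState = "pending" | "active";

interface GenesysSession {
  id: string;
  state: SessionState;
  createdAtMs: number;      // when the session was requested
  lastActivityAtMs: number; // when activity was last seen on the session
}

interface ExpirationThresholds {
  pendingSeconds: number;      // GENESYS_MONITOR_PENDING_EXPIRATION_SECONDS (default 120)
  conversationSeconds: number; // GENESYS_MONITOR_CONVERSATION_EXPIRATION_SECONDS (default 3600)
}

// Select the sessions a cleaner run would remove:
// - pending sessions older than the pending threshold,
// - established sessions inactive for longer than the conversation threshold.
function findExpiredSessions(
  sessions: GenesysSession[],
  nowMs: number,
  thresholds: ExpirationThresholds
): GenesysSession[] {
  return sessions.filter((session) => {
    if (session.state === "pending") {
      return nowMs - session.createdAtMs > thresholds.pendingSeconds * 1000;
    }
    return nowMs - session.lastActivityAtMs > thresholds.conversationSeconds * 1000;
  });
}
```

Keeping the selection separate from the deletion makes the thresholds easy to unit-test without a Redis instance.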
Health Checks
None. No /health endpoint exists.
Retry Policies
No explicit retry policies on outbound HTTP calls. HttpClientService wraps Axios with logging but does not retry on failure.
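For reference, a bounded retry policy could look roughly like the sketch below. It is not part of HttpClientService; the `RetryPolicy` shape, the function names, and the choice of which outcomes count as retryable (no response, HTTP 429, HTTP 5xx) are all assumptions. The sketch only decides whether and when to retry, so it stays independent of the HTTP client.

```typescript
// Outcome of one HTTP attempt, reduced to what the policy needs.
// `status` is undefined when no response arrived (network error or timeout).
interface AttemptOutcome {
  status?: number;
}

// Hypothetical policy settings; none of these exist in the service today.
interface RetryPolicy {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // delay before the second attempt
  maxDelayMs: number;  // upper bound on any single delay
}

// Assumed classification: retry when no response arrived, on 429, and on 5xx.
function isRetryable(outcome: AttemptOutcome): boolean {
  if (outcome.status === undefined) {
    return true;
  }
  return outcome.status === 429 || outcome.status >= 500;
}

// Returns the delay before the next attempt, or null when the caller should
// give up. `attemptsMade` counts attempts already performed (1 after the first).
function nextRetryDelayMs(
  attemptsMade: number,
  outcome: AttemptOutcome,
  policy: RetryPolicy
): number | null {
  if (!isRetryable(outcome) || attemptsMade >= policy.maxAttempts) {
    return null;
  }
  // Exponential backoff: base, 2x base, 4x base, ... capped at maxDelayMs.
  const delay = policy.baseDelayMs * Math.pow(2, attemptsMade - 1);
  return Math.min(delay, policy.maxDelayMs);
}
```

Retrying is only safe for idempotent requests, so a real integration would need to decide per call whether the policy applies.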
Known Gaps
- No health endpoint: Orchestrators have no way to check service liveness.
- No per-call timeouts on polling adapters: Cisco and Zendesk polling adapters have interval timers but no per-request timeout on individual HTTP calls.
- No retry cap on Genesys WebSocket reconnection: Reconnection attempts have no maximum limit.
- Silent session expiry: Expired sessions are cleaned up without notifying the caller (hbf-bot). The bot may not know a session ended.
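One way the silent-expiry gap could be closed is to have the cleanup step report each removed session through a callback. The sketch below is an illustration under stated assumptions: the `ExpiredSession` shape, the `ExpiryNotifier` type, and the in-memory `store` are hypothetical, and no notification channel to hbf-bot is described in this document.

```typescript
// Hypothetical types, for illustration only.
interface ExpiredSession {
  id: string;
  reason: "pending_timeout" | "inactive_timeout";
}

type ExpiryNotifier = (session: ExpiredSession) => void;

// Remove each expired session from the store and report it through `notify`.
// A failing notifier does not stop cleanup; the IDs whose notification failed
// are returned so the caller can log or re-send them.
function cleanWithNotification(
  store: { [id: string]: unknown },
  expired: ExpiredSession[],
  notify: ExpiryNotifier
): string[] {
  const failedNotifications: string[] = [];
  for (const session of expired) {
    delete store[session.id];
    try {
      notify(session);
    } catch (err) {
      failedNotifications.push(session.id);
    }
  }
  return failedNotifications;
}
```

Removing the session before notifying keeps cleanup reliable even when the notification target is down, at the cost of a possibly missed notification.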