# Flow: Analytics Ingestion

End-to-end sequence for stats aggregation and report generation. Services: hbf-stats, hbf-reports, hbf-core, hbf-session-manager, hbf-console, SMTP.

## Sequence Diagram

## Step-by-Step

### Data Source: DataCollector (Real-Time Message Recording)
DataCollector is a component in hbf-bot's lifecycle pipeline that writes chatSessionMessage records into the chat-sessions.messages[] array in MongoDB as events are processed. This is the primary mechanism for recording what happened during a conversation.
For live chat events, DataCollector only runs on ITC (Internal To Client) events, not raw LIVECHAT-origin events. The _isEnabled() method returns false for EventOrigin.LIVECHAT. This means analytics are recorded when the message is delivered to the end user, not when it first arrives from hbf-lcg.
DataCollector also maintains the liveChats[] aggregate on the ChatSession, tracking waiting time, duration, response times, and agent info.
See docs/architecture/flows/live-chat.md for the full live chat analytics data flow.
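The origin gate described above can be sketched as follows. This is a minimal illustration, not the actual DataCollector code: the EventOrigin member names other than LIVECHAT and ITC, and the standalone function name, are assumptions for the example.

```typescript
// Hypothetical sketch of the DataCollector origin gate: analytics are
// recorded on ITC (Internal To Client) delivery events, never on raw
// LIVECHAT-origin events, so each message is counted exactly once.
enum EventOrigin {
  LIVECHAT = "LIVECHAT", // raw event arriving from hbf-lcg
  ITC = "ITC",           // Internal To Client: delivery to the end user
}

// Stands in for DataCollector's _isEnabled() check described above.
function isCollectorEnabled(origin: EventOrigin): boolean {
  return origin !== EventOrigin.LIVECHAT;
}
```

Because the gate keys on origin rather than message content, a message that arrives via LIVECHAT and is later re-emitted as an ITC event is recorded exactly once, at delivery time.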
### Data Source: Session Completion
hbf-session-manager feeds into the analytics pipeline by marking chat sessions as completed in hbf-core. This updates the raw data that hbf-stats later aggregates.
### Stats Aggregation (hbf-stats)

- Polling (hbf-stats -> hbf-core): The hbf-stats daemon runs an infinite loop, querying hbf-core for tenants whose stats have not been updated recently (statsUpdateLessThan threshold).
- Data fetch (hbf-stats -> hbf-core): For each stale tenant, hbf-stats fetches the organization's timezone and the analytics summary covering two windows: daily granularity for the past 60 days, and monthly granularity for the past 13 months.
- Aggregation (hbf-stats): The raw analytics data is aggregated into summary statistics per tenant.
- Write-back (hbf-stats -> hbf-core): Aggregated stats are written back to the tenant document via TenantsClient.createOrUpdateStats().
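One cycle of the poll/fetch/aggregate/write-back loop can be sketched like this. It is a simplified model, assuming a TenantsClient shaped like the Contracts section below; the Tenant fields, the runStatsCycle name, and the elided fetch/aggregate steps are illustrative.

```typescript
// Minimal sketch of one hbf-stats polling cycle, under the assumptions
// stated above. Only list() and createOrUpdateStats() come from the
// documented contract.
interface Tenant {
  id: string;
}

interface TenantsClient {
  list(query: { statsUpdateLessThan: Date }): Promise<Tenant[]>;
  createOrUpdateStats(tenantId: string, stats: object): Promise<void>;
}

// stalenessMs: how old a tenant's stats may be before a refresh is due.
async function runStatsCycle(
  client: TenantsClient,
  stalenessMs: number,
): Promise<number> {
  const threshold = new Date(Date.now() - stalenessMs);
  const stale = await client.list({ statsUpdateLessThan: threshold });
  for (const tenant of stale) {
    // Real code would fetch the org timezone plus the daily (60 days)
    // and monthly (13 months) analytics windows, then aggregate them;
    // both steps are elided in this sketch.
    await client.createOrUpdateStats(tenant.id, {
      daily: [],
      monthly: [],
      updatedAt: new Date(),
    });
  }
  return stale.length;
}
```

The daemon would call a function like this inside its infinite loop, sleeping between iterations; because updatedAt is written back each cycle, a tenant drops out of the statsUpdateLessThan query until it goes stale again.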
### Report Generation (hbf-reports)

- Scheduled triggers: hbf-reports runs cron jobs on two schedules:
  - Weekly: fires every Monday, generating weekly summary reports.
  - Monthly: fires on the 1st of each month, generating monthly summary reports.
- Data fetch (hbf-reports -> hbf-core): For each scheduled report, hbf-reports calls 10+ analytics API methods on hbf-core, fetching summary data, live chat metrics, automated answer stats, and organization/tenant/deployment metadata.
- Generation: hbf-reports generates PDF or Excel files from the fetched data.
- Delivery (hbf-reports -> SMTP): Reports are sent as email attachments via nodemailer.
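The two-schedule trigger logic can be expressed as a pure function, shown here as a sketch rather than the actual hbf-reports code. In practice these would be two cron expressions (for example "0 8 * * 1" for Mondays and "0 8 1 * *" for the 1st; the hour is an assumption), but the decision they encode is simply:

```typescript
// Which report kinds are due on a given date? Monday triggers the
// weekly report; the 1st of the month triggers the monthly one.
// A date can match both (e.g. a Monday that is also the 1st).
type ReportKind = "weekly" | "monthly";

function reportsDueOn(date: Date): ReportKind[] {
  const due: ReportKind[] = [];
  if (date.getUTCDay() === 1) due.push("weekly");   // Monday
  if (date.getUTCDate() === 1) due.push("monthly"); // 1st of the month
  return due;
}
```

Note the overlap case: roughly one Monday in seven is also the 1st, so a run can legitimately produce both a weekly and a monthly report for the same organization.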
### On-Demand Reports (hbf-console -> hbf-reports)

- Export (hbf-console -> hbf-reports): Users can request on-demand PDF exports from the console via GET /exports. hbf-reports fetches the data from hbf-core and returns the generated PDF.
- Schedule management (hbf-console -> hbf-reports): Users manage automated report schedules (create, update, delete) via REST CRUD endpoints on hbf-reports.
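On the console side, building the export request is just query-string assembly against the GET /exports contract. A hypothetical helper (the parameter names come from the Contracts section; the base URL, helper name, and ISO-8601 date encoding are assumptions):

```typescript
// Builds the on-demand export URL the console would request from
// hbf-reports. Date encoding as ISO-8601 is an assumption; only the
// route and parameter names come from the documented contract.
function buildExportUrl(
  base: string,
  tenantId: string,
  from: Date,
  to: Date,
): string {
  const qs = new URLSearchParams({
    tenantId,
    from: from.toISOString(),
    to: to.toISOString(),
  });
  return `${base}/exports?${qs}`;
}
```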
## Contracts

hbf-stats -> hbf-core (tenant polling):

```
TenantsClient.list({ statsUpdateLessThan: Date })
  -> Tenant[]  (tenants needing stats refresh)
```

hbf-stats -> hbf-core (data fetch + write-back):

```
Organization timezone lookup
Analytics summary (daily: 60 days, monthly: 13 months)
TenantsClient.createOrUpdateStats(tenantId, {
  daily: [...],
  monthly: [...],
  updatedAt: Date
})
```

hbf-reports -> hbf-core (analytics methods, 10+):

```
Analytics summary, live chat stats, automated answers,
organization info, tenant info, deployment info,
session breakdowns, NLP accuracy, response rates, etc.
```

hbf-reports -> SMTP:

```
nodemailer transport with PDF/Excel attachment
To: report schedule recipients
Subject: weekly/monthly report for {organization}
```

hbf-console -> hbf-reports:

```
GET    /exports?tenantId=...&from=...&to=...  -> PDF download
GET    /schedules      -> Schedule[]
POST   /schedules      -> Schedule
PUT    /schedules/:id  -> Schedule
DELETE /schedules/:id  -> 204
```
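A console-side wrapper over the schedule CRUD contract might look like the following sketch. Only the routes and status codes come from the contract above; the Schedule fields, function names, and base-URL handling are assumptions for illustration.

```typescript
// Hypothetical typed client for the hbf-reports schedule endpoints.
// Schedule's shape is an assumption; the contract only defines routes.
interface Schedule {
  id: string;
  cadence: "weekly" | "monthly";
  recipients: string[];
}

async function listSchedules(base: string): Promise<Schedule[]> {
  const res = await fetch(`${base}/schedules`);
  if (!res.ok) throw new Error(`GET /schedules failed: ${res.status}`);
  return res.json();
}

async function deleteSchedule(base: string, id: string): Promise<void> {
  const res = await fetch(`${base}/schedules/${id}`, { method: "DELETE" });
  // The contract specifies 204 No Content on successful deletion.
  if (res.status !== 204) throw new Error(`DELETE /schedules/${id} failed: ${res.status}`);
}
```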