Communication: hbf-knowledge-manager
1-hop view of how this service communicates with its siblings. For the full system view, see
docs/architecture/service-communication.md.
Calls Out To
| Service | Protocol | Purpose | Key calls |
|---|---|---|---|
| hbf-core | hbf-core-api | Integration lookup, KB queries, file ingestion, KB group deletion, user auth, SharePoint subscription state persistence | IntegrationClient.getById(), IntegrationClient.findAllByWebhookKey(), IntegrationClient.update() (subscription IDs, expiries, delta links), IntegrationClient.listSharepointActiveSubscriptions(), KnowledgeBaseClient.list(), KnowledgeBaseArticleClient.fileToArticles(), KnowledgeBaseGroupClient.list(), KnowledgeBaseGroupClient.deleteAll(), KnowledgeBaseGroupClient.deleteBySourceId(), UsersClient.findCurrentUser() |
| Azure Blob Storage | Azure SDK (@azure/storage-blob) | Download blob files for ingestion; list blobs for full sync | SAS token sourced from integration config in hbf-core — no credentials stored locally |
| Microsoft Graph API | HTTPS (https://graph.microsoft.com/v1.0) | SharePoint site/drive resolution, file listing and download, delta queries, webhook subscription CRUD, item permissions, list item field metadata | OAuth2 client credentials via SHAREPOINT_CLIENT_ID + SHAREPOINT_CLIENT_SECRET; token cached per tenant with 5-minute buffer |
| Azure AD / Entra ID | HTTPS (https://login.microsoftonline.com/{tenantId}/oauth2/v2.0/token) | OAuth2 token acquisition for Graph API | Client credentials grant with scope https://graph.microsoft.com/.default |
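The per-tenant token caching with a 5-minute buffer can be sketched as follows. This is a minimal illustration, not GraphClientService's actual code: the class name, method names, and the fetchToken callback are hypothetical, and the buffer is interpreted here as "treat a token as expired 5 minutes before its real expiry".

```typescript
// Sketch only: names are hypothetical, not the service's real API.
interface CachedToken {
  accessToken: string;
  expiresAtMs: number; // absolute expiry, epoch milliseconds
}

const REFRESH_BUFFER_MS = 5 * 60 * 1000; // the 5-minute buffer

class TenantTokenCache {
  private readonly entries = new Map<string, CachedToken>();

  // Returns the cached token only if it stays valid beyond the buffer.
  peek(tenantId: string, nowMs: number): string | undefined {
    const entry = this.entries.get(tenantId);
    if (entry && entry.expiresAtMs - REFRESH_BUFFER_MS > nowMs) {
      return entry.accessToken;
    }
    return undefined;
  }

  // Records a freshly issued token; expiresInSec mirrors OAuth2 `expires_in`.
  store(tenantId: string, accessToken: string, expiresInSec: number, nowMs: number): void {
    this.entries.set(tenantId, { accessToken, expiresAtMs: nowMs + expiresInSec * 1000 });
  }

  // Cache-aside lookup: reuse a fresh token, otherwise fetch and store one.
  async get(
    tenantId: string,
    fetchToken: (tenantId: string) => Promise<{ accessToken: string; expiresInSec: number }>,
    nowMs: number = Date.now(),
  ): Promise<string> {
    const cached = this.peek(tenantId, nowMs);
    if (cached !== undefined) return cached;
    const fresh = await fetchToken(tenantId);
    this.store(tenantId, fresh.accessToken, fresh.expiresInSec, nowMs);
    return fresh.accessToken;
  }
}
```

Keying the map by tenant ID keeps tokens for different Entra ID tenants from being mixed, and passing the clock in as a parameter makes the expiry logic easy to test.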
Called By
| Caller | Protocol | How |
|---|---|---|
| Azure Event Grid | HTTP POST | POST /webhooks/azure-blob — delivers Microsoft.Storage.BlobCreated and Microsoft.Storage.BlobDeleted events; handles subscription validation handshake |
| Microsoft Graph | HTTP POST | POST /webhooks/sharepoint — delivers change notifications for subscribed drives; handles subscription validation handshake (?validationToken=) |
| hbf-console (or any admin caller) | HTTP POST | POST /sync/org/:orgId/integrations/:integrationId/knowledge-bases/:knowledgeBaseId/full — guarded by HBFGuard + AdminOrgRoleGuard |
| hbf-console (or any admin caller) | HTTP POST | POST /sync/sharepoint/integrations/:integrationId/subscriptions — initialize Graph webhook subscriptions; guarded by HBFGuard + AdminOrgRoleGuard |
| hbf-console (or any admin caller) | HTTP GET | GET /sharepoint/drives?tenantId=...&siteUrl=... — list document libraries for integration setup; guarded by HBFGuard |
Contracts
Inbound — Webhook (POST /webhooks/azure-blob)
Auth: none. Azure Event Grid delivers to the registered URL, and each event is routed to its integration by a webhook key of the form <accountName>:<containerName>.
Event Grid sends a JSON array. Three event types are handled:
Subscription validation (handshake):
[{ "eventType": "Microsoft.EventGrid.SubscriptionValidationEvent", "data": { "validationCode": "..." } }]
Blob created:
[{ "eventType": "Microsoft.Storage.BlobCreated", "topic": "/subscriptions/.../storageAccounts/<account>", "subject": "/blobServices/default/containers/<container>/blobs/<path>" }]
Blob deleted:
[{ "eventType": "Microsoft.Storage.BlobDeleted", "topic": "...", "subject": "/blobServices/default/containers/<container>/blobs/<path>" }]
Response: 200 OK immediately (processing is fire-and-forget).
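A minimal sketch of how one event from the array might be classified and mapped to a webhook key. The function and type names are hypothetical, and the regular expressions assume the topic and subject shapes shown above. For the handshake, Event Grid expects the validationCode to be echoed back as validationResponse in the 200 response body.

```typescript
// Sketch only: names are hypothetical; shapes follow the sample payloads above.
interface EventGridEvent {
  eventType: string;
  topic?: string;
  subject?: string;
  data?: { validationCode?: string };
}

type BlobAction =
  | { kind: "validation"; response: { validationResponse: string } }
  | { kind: "created" | "deleted"; webhookKey: string; blobPath: string }
  | { kind: "ignored" };

function classifyEvent(event: EventGridEvent): BlobAction {
  if (event.eventType === "Microsoft.EventGrid.SubscriptionValidationEvent") {
    const code = event.data?.validationCode;
    return code
      ? { kind: "validation", response: { validationResponse: code } }
      : { kind: "ignored" };
  }
  const kind =
    event.eventType === "Microsoft.Storage.BlobCreated" ? "created"
    : event.eventType === "Microsoft.Storage.BlobDeleted" ? "deleted"
    : undefined;
  // Account name comes from the topic; container and blob path from the subject.
  const account = /\/storageAccounts\/([^/]+)/.exec(event.topic ?? "");
  const blob = /\/containers\/([^/]+)\/blobs\/(.+)$/.exec(event.subject ?? "");
  if (!kind || !account || !blob) return { kind: "ignored" };
  return { kind, webhookKey: `${account[1]}:${blob[1]}`, blobPath: blob[2] };
}
```

The resulting webhookKey is the value that would be passed to IntegrationClient.findAllByWebhookKey().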
Inbound — SharePoint Webhook (POST /webhooks/sharepoint)
Auth: optional ?secret=<SHAREPOINT_WEBHOOK_SECRET> query param (if SHAREPOINT_WEBHOOK_SECRET env var is set, requests without a matching secret are rejected with 401).
Subscription validation (handshake):
Graph sends a POST with ?validationToken=<opaque-token>. The service echoes the token back as text/plain with status 200.
Change notification:
{
  "value": [
    {
      "subscriptionId": "<graph-subscription-id>",
      "clientState": "<orgId>:<integrationId>",
      "changeType": "updated",
      "resource": "drives/<driveId>/root",
      "subscriptionExpirationDateTime": "2026-05-01T00:00:00Z",
      "tenantId": "<azure-tenant-id>"
    }
  ]
}
Response: 202 Accepted immediately. Processing is fire-and-forget: the service runs a Graph delta query to resolve actual file changes.
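The request handling described above can be sketched as a pure function that decides the HTTP response and lists the drives that need a delta query afterwards. All names here are hypothetical, and checking the secret before the validation token is an assumption about ordering, not something the contract states.

```typescript
// Sketch only: names and the check ordering are assumptions.
interface GraphNotification {
  subscriptionId: string;
  clientState?: string;
  resource: string;
}

interface WebhookResult {
  status: number;
  contentType?: string;
  body?: string;
  // Drives to run a delta query for, after the response has been sent.
  pending: { orgId: string; integrationId: string; driveId: string }[];
}

function handleSharepointWebhook(
  query: { secret?: string; validationToken?: string },
  body: { value?: GraphNotification[] } | undefined,
  configuredSecret: string | undefined,
): WebhookResult {
  // The secret is only enforced when SHAREPOINT_WEBHOOK_SECRET is set.
  if (configuredSecret && query.secret !== configuredSecret) {
    return { status: 401, pending: [] };
  }
  // Handshake: echo the opaque token back as text/plain.
  if (query.validationToken !== undefined) {
    return { status: 200, contentType: "text/plain", body: query.validationToken, pending: [] };
  }
  const pending: WebhookResult["pending"] = [];
  for (const notification of body?.value ?? []) {
    const [orgId, integrationId] = (notification.clientState ?? "").split(":");
    const drive = /^\/?drives\/([^/]+)\/root$/.exec(notification.resource);
    if (orgId && integrationId && drive) {
      pending.push({ orgId, integrationId, driveId: drive[1] });
    }
  }
  return { status: 202, pending };
}
```

Returning the pending work instead of awaiting it mirrors the fire-and-forget contract: the caller can send the 202 first and then start the delta queries.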
Inbound — Full Sync (POST /sync/org/:orgId/integrations/:integrationId/knowledge-bases/:knowledgeBaseId/full)
Auth: Authorization: Bearer <user-token> (HBFGuard validates via hbf-core GET /users/me; AdminOrgRoleGuard checks org ADMIN role).
No request body is required. Path params identify the target org, integration, and KB. The endpoint works for both Azure Blob and SharePoint integrations (the provider is resolved from the integration type). For SharePoint integrations, the service also calls ensureSubscription after the sync completes.
Inbound — SharePoint Subscriptions (POST /sync/sharepoint/integrations/:integrationId/subscriptions)
Auth: Authorization: Bearer <user-token> (HBFGuard + AdminOrgRoleGuard).
Request body:
{ "organizationId": "<org-id>" }
Response:
{ "subscriptionCount": 2 }
Creates Graph webhook subscriptions for each drive in the integration. Establishes baseline delta links so the first notification only picks up new changes.
Inbound — Drive Discovery (GET /sharepoint/drives?tenantId=...&siteUrl=...)
Auth: Authorization: Bearer <user-token> (HBFGuard).
Query params: tenantId (Azure tenant), siteUrl (e.g. https://contoso.sharepoint.com/sites/MySite). Validates that siteUrl is a *.sharepoint.com domain.
Response:
{ "drives": [{ "id": "<drive-id>", "name": "Documents" }, ...] }
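The siteUrl check could look like the sketch below. The function name is hypothetical, and requiring https is an extra assumption beyond the *.sharepoint.com rule stated above. Parsing with URL and comparing the hostname avoids being fooled by sharepoint.com appearing in the path or as a prefix of another domain.

```typescript
// Sketch only: function name and the https requirement are assumptions.
function isSharepointSiteUrl(siteUrl: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(siteUrl);
  } catch {
    return false; // not an absolute URL
  }
  // URL lowercases the hostname, so a plain suffix check is enough.
  return parsed.protocol === "https:" && parsed.hostname.endsWith(".sharepoint.com");
}
```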
Outbound — hbf-core-api calls
Integration lookup by webhook key:
IntegrationClient.findAllByWebhookKey(webhookKey: string) - key format: <accountName>:<containerName>
KB list (filtered by integration):
KnowledgeBaseClient.list(orgId, { source: integrationId })
File ingestion (blob created):
KnowledgeBaseArticleClient.fileToArticles(orgId, kbId, fileBuffer, fileName, { sourceId, groupName, source, publishArticles: true })
sourceId format: azure-blob:<account>:<container>:<blobPath>
KB group deletion (blob deleted):
KnowledgeBaseGroupClient.deleteBySourceId(orgId, kbId, sourceId)
Orphan cleanup (during full sync):
KnowledgeBaseGroupClient.deleteBySourceId(orgId, kbId, sourceId) - removes orphaned groups no longer present in the source
Integration update (SharePoint subscription/delta state):
IntegrationClient.update(orgId, integrationId, { webhookSubscriptionIds, webhookSubscriptionExpiries, deltaLinks, lastSubscriptionRenewalCheck }) - persists Graph subscription IDs, expiry dates, and delta links per drive
Active SharePoint subscriptions (for renewal sweep):
IntegrationClient.listSharepointActiveSubscriptions() - returns all integrations with active Graph subscriptions
KB group listing (for orphan cleanup during full sync):
KnowledgeBaseGroupClient.list(orgId, kbId) - lists existing groups to detect orphaned sourceIds after sync
User auth (HBFGuard):
UsersClient.findCurrentUser() - called with the caller's Bearer token
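The orphan cleanup step (list groups, compare against what the source still contains, delete the rest) can be sketched as a set difference over sourceIds. The KbGroup shape and both function names are hypothetical. Restricting candidates to a sourceId prefix is an assumption made here so that groups from other sources are never treated as orphans.

```typescript
// Sketch only: KbGroup and both helpers are hypothetical.
interface KbGroup {
  id: string;
  sourceId?: string;
}

// Builds the documented sourceId format for an Azure Blob file.
function buildBlobSourceId(account: string, container: string, blobPath: string): string {
  return `azure-blob:${account}:${container}:${blobPath}`;
}

// Returns sourceIds that exist in the KB but were not seen during the sync.
function findOrphanSourceIds(
  existingGroups: KbGroup[],
  seenSourceIds: Iterable<string>,
  sourcePrefix: string,
): string[] {
  const seen = new Set(seenSourceIds);
  const orphans = new Set<string>();
  for (const group of existingGroups) {
    const sourceId = group.sourceId;
    if (sourceId && sourceId.startsWith(sourcePrefix) && !seen.has(sourceId)) {
      orphans.add(sourceId); // a Set de-duplicates groups sharing one sourceId
    }
  }
  return Array.from(orphans);
}
```

Each returned sourceId would then be passed to KnowledgeBaseGroupClient.deleteBySourceId(orgId, kbId, sourceId).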
Outbound — Microsoft Graph API calls
All calls go through GraphClientService which manages OAuth2 token caching per tenant.
Site resolution:
GET /sites/{hostname}:{serverRelativePath} - resolve SharePoint site URL to siteId
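A small sketch of turning a site URL into the request path used for site resolution. The function name is hypothetical, and the fallback to /sites/{hostname} for a URL with no path (the tenant's root site) is an assumption about how that case would be handled.

```typescript
// Sketch only: function name and root-site handling are assumptions.
function toGraphSitePath(siteUrl: string): string {
  const { hostname, pathname } = new URL(siteUrl);
  // Drop trailing slashes so ".../sites/MySite/" and ".../sites/MySite" match.
  const serverRelativePath = pathname.replace(/\/+$/, "");
  return serverRelativePath
    ? `/sites/${hostname}:${serverRelativePath}`
    : `/sites/${hostname}`;
}
```

For https://contoso.sharepoint.com/sites/MySite this yields /sites/contoso.sharepoint.com:/sites/MySite, relative to the https://graph.microsoft.com/v1.0 base URL.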
Drive listing and resolution:
- GET /sites/{siteId}/drives - list document libraries
- Drive lookup by name (client-side filter on response)
File operations:
- GET /drives/{driveId}/items/{itemId}/children - list children (recursive traversal for full sync)
- GET /drives/{driveId}/items/{itemId} - get item metadata and @microsoft.graph.downloadUrl
- Download via the pre-authenticated URL from @microsoft.graph.downloadUrl
Delta queries (incremental sync):
- GET /drives/{driveId}/root/delta - initial delta query
- GET {deltaLink} - subsequent delta queries using the stored deltaLink
- Returns paginated results with @odata.nextLink / @odata.deltaLink
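The pagination rule for delta queries is: keep following @odata.nextLink until a page carries @odata.deltaLink, then store that link for the next incremental run. A minimal sketch, with hypothetical function names and an injected getPage callback standing in for the authenticated Graph request:

```typescript
// Sketch only: function names and the getPage callback are hypothetical.
interface DeltaPage<T> {
  value: T[];
  "@odata.nextLink"?: string;
  "@odata.deltaLink"?: string;
}

type DeltaStep =
  | { done: false; nextUrl: string }
  | { done: true; deltaLink: string };

// Decides whether to fetch another page or stop and persist the delta link.
function nextDeltaStep<T>(page: DeltaPage<T>): DeltaStep {
  const nextLink = page["@odata.nextLink"];
  if (nextLink) return { done: false, nextUrl: nextLink };
  const deltaLink = page["@odata.deltaLink"];
  if (deltaLink) return { done: true, deltaLink };
  throw new Error("delta page has neither @odata.nextLink nor @odata.deltaLink");
}

// Drains one delta round, starting from /root/delta or a stored deltaLink.
async function collectDelta<T>(
  startUrl: string,
  getPage: (url: string) => Promise<DeltaPage<T>>,
): Promise<{ items: T[]; deltaLink: string }> {
  const items: T[] = [];
  let url = startUrl;
  for (;;) {
    const page = await getPage(url);
    items.push(...page.value);
    const step = nextDeltaStep(page);
    if (step.done) return { items, deltaLink: step.deltaLink };
    url = step.nextUrl;
  }
}
```

The returned deltaLink is what would be persisted per drive through IntegrationClient.update().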
Subscription management:
- POST /subscriptions - create webhook subscription for /drives/{driveId}/root with changeType updated
- PATCH /subscriptions/{id} - renew subscription (update expirationDateTime)
- GET /subscriptions/{id} - check subscription resource (for siteUrl staleness detection)
- DELETE /subscriptions/{id} - remove subscription
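The request bodies for creating and renewing a subscription might be assembled as below. The property names (changeType, notificationUrl, resource, expirationDateTime, clientState) are the standard Graph subscription fields. The helper names, the baseUrl parameter, and the lifetime value are hypothetical, and Graph caps the allowed lifetime per resource type, so a real value has to stay under that cap.

```typescript
// Sketch only: helper names, baseUrl, and lifetimeMs are assumptions.
interface DriveSubscriptionRequest {
  changeType: "updated";
  notificationUrl: string;
  resource: string;
  expirationDateTime: string; // ISO 8601, UTC
  clientState: string;
}

function buildDriveSubscription(opts: {
  baseUrl: string; // hypothetical public base URL of this service
  driveId: string;
  orgId: string;
  integrationId: string;
  lifetimeMs: number;
  now: Date;
  webhookSecret?: string;
}): DriveSubscriptionRequest {
  // The optional secret travels in the notification URL's query string.
  const secretQuery = opts.webhookSecret
    ? `?secret=${encodeURIComponent(opts.webhookSecret)}`
    : "";
  return {
    changeType: "updated",
    notificationUrl: `${opts.baseUrl}/webhooks/sharepoint${secretQuery}`,
    resource: `/drives/${opts.driveId}/root`,
    expirationDateTime: new Date(opts.now.getTime() + opts.lifetimeMs).toISOString(),
    clientState: `${opts.orgId}:${opts.integrationId}`,
  };
}

// Body for PATCH /subscriptions/{id}: only the expiry changes on renewal.
function buildRenewal(now: Date, lifetimeMs: number): { expirationDateTime: string } {
  return { expirationDateTime: new Date(now.getTime() + lifetimeMs).toISOString() };
}
```

The clientState value matches the <orgId>:<integrationId> format that later arrives in each change notification.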
Item metadata:
- GET /sites/{siteId}/drives/{driveId}/items/{itemId}/listItem/fields - read managed metadata (taxonomy columns)
- GET /drives/{driveId}/items/{itemId}/permissions - read item permissions (for RBAC sync)
Flows Involving This Service
This service is an ingestion pipeline triggered by external events (Azure Event Grid, Microsoft Graph) rather than a participant in the bot message-processing or live-chat flows. It also runs a background subscription renewal timer for SharePoint Graph webhook subscriptions (every 12 hours).
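One way the 12-hour renewal sweep could choose which subscriptions to renew is sketched below. The function name, the shape of the expiries map (subscription ID to ISO timestamp), and the renewal window are all assumptions. The window should be at least as long as the sweep interval so that no subscription can lapse between two sweeps.

```typescript
// Sketch only: the expiries shape and the window are assumptions.
const SWEEP_INTERVAL_MS = 12 * 60 * 60 * 1000; // timer period from the text
const RENEWAL_WINDOW_MS = 2 * SWEEP_INTERVAL_MS; // assumed: leaves one retry sweep

// Returns subscription IDs that expire within the window or have a bad expiry.
function subscriptionsDueForRenewal(
  expiries: Record<string, string>,
  nowMs: number,
  windowMs: number = RENEWAL_WINDOW_MS,
): string[] {
  return Object.keys(expiries).filter((subscriptionId) => {
    const expiresAtMs = Date.parse(expiries[subscriptionId] ?? "");
    // An unparseable expiry is renewed rather than silently skipped.
    return isNaN(expiresAtMs) || expiresAtMs - nowMs <= windowMs;
  });
}
```

Each returned ID would be renewed with PATCH /subscriptions/{id}, and the new expiry persisted through IntegrationClient.update().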