NLU / LLM Pipeline
How the Platform processes a user message through NLP, LLM, and RAG.
Services: hbf-core (config store), hbf-nlp (orchestration), helvia-rag-pipelines (RAG engine), semantic-doc-segmenter (document segmentation), open-bot-framework (DirectLine channel gateway; planned, not yet in use).
Last updated: 2026-03-13
Pipeline Configuration Object
Stored in hbf-core MongoDB collection nlp-pipelines. Polymorphic: base class NLPPipeline with 6 subtypes selected by @JsonTypeInfo discriminator on the type field.
Source: hbf-core/src/main/java/gr/helvia/hbf/core/domain/NLPPipeline.kt
TypeScript consumer: hbf-core-api/src/datamodel/nlp.ts
Full schema: docs/domain-model/nlp-pipeline.md
Base Fields (all subtypes)
| Field | Type | Notes |
|---|---|---|
| id | String | PK |
| name | String | @NotNull |
| type | NLPType | Discriminator (see Enums below) |
| language | String | Primary language. @NotNull on create |
| secondaryLanguages | Set<LanguageCode> | Optional additional languages |
| status | NLPStatus | CREATED, OUTDATED, TRAINING, FAILED, READY, INITIALIZING |
| predictionConfidenceThreshold | Double | Min confidence for intent match. Range (0, 1] |
| includeTrainingTags / excludeTrainingTags | List<String> | Filter KB articles for training |
| organization | Organization | @DBRef lazy |
| tenant | Tenant | @DBRef lazy |
| nlpService | NLPService | HBF_CORE or HBF_NLP (which service executes the pipeline) |
| lastTrainedAt | Date | |
| failedReason | String | Error message when status=FAILED |
Subtypes
| NLPType | Class | Key Extra Fields |
|---|---|---|
| LUIS_NLP | LuisNLP | appId, appName, authoringKey, appVersion, host, predictionHost, predictionKey |
| OPENAI_NLP | OpenAINLP | model, apiKey, temperature, maxTokens, trainingType (ZERO_SHOT/ONE_SHOT/FEW_SHOT/CUSTOM_PROMPT), modelCategory (COMPLETION/CHAT), prompt: OpenAIPrompt |
| DIALOGFLOW_NLP | DialogflowNLP | projectId, privateKey, clientEmail, region (DialogFlowRegion enum), trainingOperationName |
| HELVIA_NLP_SPECIFICATION | HelviaNLPSpecification | serviceUrl, bearerToken |
| HELVIA_GPT | HelviaGPT (extends HelviaNLPSpecification) | pipelineId |
| HELVIA_RAG_PIPELINE | HelviaRAGPipeline (extends HelviaNLPSpecification) | pipelineId, settings: RAGPipelineSettings |
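For orientation, a hypothetical TypeScript sketch of how a consumer can model this polymorphism as a discriminated union on type. The shapes follow the tables above; the actual hbf-core-api/src/datamodel/nlp.ts may differ.

```ts
// Sketch only: field names taken from the tables above, two subtypes shown.
type NLPStatus =
  | "CREATED" | "OUTDATED" | "TRAINING" | "FAILED" | "READY" | "INITIALIZING";

interface NLPPipelineBase {
  id: string;
  name: string;
  language: string;
  secondaryLanguages?: string[];
  status: NLPStatus;
  predictionConfidenceThreshold?: number; // range (0, 1]
  nlpService?: "HBF_CORE" | "HBF_NLP";
}

interface HelviaRAGPipeline extends NLPPipelineBase {
  type: "HELVIA_RAG_PIPELINE"; // the @JsonTypeInfo discriminator
  serviceUrl: string;
  bearerToken: string;
  pipelineId: string;
  settings?: { includeHistory: boolean; maxHistoryTurns: number };
}

interface DialogflowNLP extends NLPPipelineBase {
  type: "DIALOGFLOW_NLP";
  projectId: string;
  privateKey: string;
  clientEmail: string;
  region: string;
}

// The type field lets callers narrow safely:
function describe(p: HelviaRAGPipeline | DialogflowNLP): string {
  return p.type === "HELVIA_RAG_PIPELINE"
    ? `RAG pipeline ${p.pipelineId} at ${p.serviceUrl}`
    : `Dialogflow project ${p.projectId}`;
}
```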
RAGPipelineSettings (hbf-core side)
| Field | Type | Default | Notes |
|---|---|---|---|
| includeHistory | Boolean | false | Send chat history to RAG pipeline |
| maxHistoryTurns | Int | 4 | Range 1-30 |
RAG Pipeline Configuration (helvia-rag-pipelines side)
Stored as JSON blob in helvia-rag-pipelines MySQL pipelines.configuration_json, deserialized to PipelineConfiguration Pydantic model.
Source: helvia-rag-pipelines/app/schemas/pipeline_configuration_schemas.py
PipelineConfiguration
├── general_settings
│ article_format -- template with {{title}}, {{group}}, {{body}}, {{tags}}
│ corpus_language -- KB content language
│ native_languages[] -- languages supported without translation
│ default_native_language -- fallback language
│ return_confidence -- include confidence in response
│
├── embeddings
│ model -- e.g. "text-embedding-3-small"
│ providers[] -- LlmProvider (platform, url, apiKey, model)
│
├── semantic_search
│ enabled -- default true
│ max_results -- default 7
│ max_input_tokens -- default 0 (unlimited)
│ exact_match -- default true
│ visit_neighbors -- default 128
│ normalize_user_input -- default true
│ normalize_corpus -- default false
│
├── text_generation
│ providers[] -- LlmProvider
│ prompt -- system prompt for generation
│ max_tokens -- default 500
│ temperature -- default 0.0
│ parser -- { type, regex } for structured output (JSON or REGEX)
│ hide_urls -- default true (replaces URLs with UUIDs before LLM call)
│
├── chat_history
│ enabled -- default false
│ max_messages -- default 8
│
├── query_summarization
│ enabled -- default false
│ providers[] -- LlmProvider
│ prompt -- summarization prompt
│ max_tokens -- default 300
│ temperature -- default 0.0
│ use_summary_in_sem_search -- default true
│ use_summary_in_generation -- default false
│ skip_history_in_inference -- default false
│
├── translation
│ enabled -- default false
│ query_translation_providers[]
│ response_translation_providers[]
│ corpus_translation_providers[]
│ hide_urls -- default false
│
└── sem_cache
(configured via Helvia SemCache service)
Each LlmProvider entry contains: platform (OPENAI, AZURE_OPENAI), platform_url, platform_api_key, model, seed, prompt.
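For concreteness, an illustrative configuration_json fragment matching the tree above. Every value here is a made-up example, not a code default.

```ts
// Illustrative payload only; field shapes follow the configuration tree above.
const examplePipelineConfiguration = {
  general_settings: {
    article_format: "{{title}}\n{{group}}\n{{body}}\nTags: {{tags}}",
    corpus_language: "en",
    native_languages: ["en", "el"],
    default_native_language: "en",
    return_confidence: true,
  },
  embeddings: {
    model: "text-embedding-3-small",
    providers: [
      { platform: "OPENAI", platform_api_key: "sk-example", model: "text-embedding-3-small" },
    ],
  },
  semantic_search: { enabled: true, max_results: 7, normalize_user_input: true },
  text_generation: {
    providers: [
      { platform: "AZURE_OPENAI", platform_url: "https://example.openai.azure.com", platform_api_key: "example", model: "gpt-4o" },
    ],
    prompt: "Answer strictly from the provided articles.",
    max_tokens: 500,
    temperature: 0.0,
    hide_urls: true,
  },
  chat_history: { enabled: true, max_messages: 8 },
};
```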
Tenant-to-Pipeline Binding
A Tenant links to pipelines via two mechanisms:
- nlpMap (Map<String, String>): Simple language-to-pipeline-ID mapping. Checked first.
- nlpTrees (List<NLPPipelineTreeData>): Decision trees with variable-based conditions. Checked as a fallback if nlpMap has no entry for the language.
Pipeline Decision Tree
Stored in MongoDB collection nlp-pipeline-trees. Each tree has:
- nlpDecisionTree: Runtime-compiled decision map
- nlpDecisionTreeSource: Editor-friendly graph with nodes and edges
Node types (NLPNodeType):
- INTRO: Entry point
- PIPELINE: References a pipeline ID
- SEQUENCE: Sequential evaluation
- QUERY: Conditional (LIQE expression evaluated against session variables)
Provider Selection
Provider registration (hbf-nlp ResolverModule):
| Pipeline Type | Provider Class | Client |
|---|---|---|
| HELVIA_RAG_PIPELINE | HelviaRAGPipelinesProvider | HelviaRAGPipelineClient (HTTP) |
| HELVIA_NLP_SPECIFICATION | HelviaNLPSpecificationProvider | HelviaNLPSpecificationPipelineClient (HTTP) |
| DIALOGFLOW_NLP | DialogflowPipelinesProvider | DialogflowPipelineClient (gRPC) |
Note: LUIS_NLP, OPENAI_NLP, and HELVIA_GPT types exist in the enum but have no registered provider in hbf-nlp. They are legacy types.
Processing Sequence (POST /tenants/{tenantId}/process)
Full flow in hbf-nlp/src/nlp/nlp.service.ts:
Step 1: Priority Keyword Pre-processing
Check the user query against tenant.settings.nluLocal.intents (a map of intent name to keyword list).
Uses configurable string similarity: exact match, Jaro-Winkler, or Damerau-Levenshtein, with a similarityThreshold.
If matched, the intent is returned immediately without calling any NLP provider.
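A minimal sketch of this short-circuit, assuming the natural library for Jaro-Winkler similarity. The actual similarity implementation and threshold default in hbf-nlp may differ.

```ts
import natural from "natural";

type LocalIntents = Map<string, string[]>; // intent name -> keyword list

function matchPriorityKeyword(
  query: string,
  intents: LocalIntents,
  similarityThreshold = 0.92, // hypothetical default
): string | null {
  const q = query.trim().toLowerCase();
  for (const [intent, keywords] of intents) {
    for (const kw of keywords) {
      const k = kw.toLowerCase();
      // Exact match, or fuzzy match above the configured threshold.
      if (q === k || natural.JaroWinklerDistance(q, k) >= similarityThreshold) {
        return intent; // short-circuit: no NLP provider is called
      }
    }
  }
  return null;
}
```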
Step 2: Language Resolution
If detectLanguage=true, call LlmService.languageDetectionLegacy():
- Legacy path: hardcoded Azure OpenAI call using AZURE_OPENAI_* env vars
- Modern path: uses the tenant's LLM_LANGUAGE_DETECTION plugin with a configurable provider
Step 3: Pipeline Selection
- Check tenant.nlpMap[resolvedLanguage] for a direct pipeline ID
- If no match, evaluate the tenant.nlpTreeMap[language] decision tree using LIQE expressions against session variables (selection order sketched below)
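The selection order, sketched in TypeScript. evaluateLiqe is a toy stand-in for the LIQE evaluator, and the node shape is simplified to two of the four node types.

```ts
interface TenantNlpConfig {
  nlpMap: Record<string, string>;           // language -> pipeline ID
  nlpTreeMap: Record<string, DecisionNode>; // language -> compiled tree
}

type DecisionNode =
  | { kind: "PIPELINE"; pipelineId: string }
  | { kind: "QUERY"; expression: string; then: DecisionNode; else: DecisionNode };

// Stand-in for the LIQE expression evaluator (hypothetical).
function evaluateLiqe(expr: string, vars: Record<string, unknown>): boolean {
  return Boolean(vars[expr]); // toy: treat the expression as a variable name
}

function selectPipelineId(
  tenant: TenantNlpConfig,
  language: string,
  sessionVars: Record<string, unknown>,
): string | undefined {
  // nlpMap is checked first.
  const direct = tenant.nlpMap[language];
  if (direct) return direct;

  // Fall back to walking the decision tree for this language.
  let node: DecisionNode | undefined = tenant.nlpTreeMap[language];
  while (node) {
    if (node.kind === "PIPELINE") return node.pipelineId;
    node = evaluateLiqe(node.expression, sessionVars) ? node.then : node.else;
  }
  return undefined;
}
```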
Step 4: Provider Dispatch
ResolverService.resolve(pipeline.type) returns the registered provider.
Provider calls its downstream client (sketched after this list):
- HELVIA_RAG: POST {serviceUrl}/pipelines/{pipelineId}:process with query, language, session history, parameters
- HELVIA_NLP_SPEC: POST {serviceUrl}:process with query, language, parameters (no history)
- DIALOGFLOW: gRPC detectIntent via the @google-cloud/dialogflow SDK
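A sketch of the resolver-style dispatch. Class and registry names are illustrative, not the actual hbf-nlp ResolverModule code.

```ts
interface NlpProvider {
  process(query: string, language: string): Promise<unknown>;
}

class HelviaRagProvider implements NlpProvider {
  constructor(private serviceUrl: string, private pipelineId: string) {}
  async process(query: string, language: string) {
    // Mirrors the HELVIA_RAG call shape described above.
    const res = await fetch(
      `${this.serviceUrl}/pipelines/${this.pipelineId}:process`,
      {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ query, language }),
      },
    );
    return res.json();
  }
}

// One provider per registered pipeline type.
const registry = new Map<string, NlpProvider>([
  ["HELVIA_RAG_PIPELINE", new HelviaRagProvider("http://rag:8000", "p1")],
  // HELVIA_NLP_SPECIFICATION and DIALOGFLOW_NLP would register similarly.
]);

function resolve(type: string): NlpProvider {
  const provider = registry.get(type);
  // Legacy types (LUIS_NLP, OPENAI_NLP, HELVIA_GPT) land here.
  if (!provider) throw new Error(`No provider registered for ${type}`);
  return provider;
}
```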
Step 5: Metadata Persistence
MessageMetadataService writes to MySQL message_metadata table:
- Processing steps: PRIORITY_KEYWORDS, LANGUAGE_DETECTION, NLP_SYSTEM
- Each step includes: input, output, duration (ms)
RAG Query Flow (POST /pipelines/{pipelineId}:process)
Full flow in helvia-rag-pipelines/app/services/pipeline_service.py:
Vector DB Selection
Configured at startup via VECTOR_DB environment variable:
- qdrant (default): Uses the qdrant-client SDK. Supports API, on-disk, and in-memory backends.
- milvus: Uses the pymilvus SDK. IVF_FLAT index with L2 distance.
VectorDbManager singleton selects the implementation once at boot. One collection per pipeline, named rag_pipelines_{id} (prefix configurable via VDB_COLLECTION_PREFIX).
Key vector DB settings:
- VDB_DIMENSIONS: 1536 (default; matches text-embedding-3-small)
- VDB_BATCH_INSERT_SIZE: 100 items per batch during indexing
- MAX_VECTORS_PER_COLLECTION: 500,000
- Qdrant uses the COSINE distance metric; Milvus uses L2 (backend selection sketched below)
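A sketch of the boot-time singleton selection and collection naming, with the backends stubbed. The real implementations live in Python around qdrant-client and pymilvus.

```ts
interface VectorDb {
  upsert(collection: string, ids: string[], vectors: number[][]): Promise<void>;
}

class QdrantBackend implements VectorDb {
  async upsert() { /* qdrant-client call in the real service */ }
}
class MilvusBackend implements VectorDb {
  async upsert() { /* pymilvus IVF_FLAT/L2 call in the real service */ }
}

class VectorDbManager {
  private static instance: VectorDb | undefined;

  // Selected once at boot from VECTOR_DB; qdrant is the documented default.
  static get(): VectorDb {
    if (!this.instance) {
      this.instance =
        process.env.VECTOR_DB === "milvus" ? new MilvusBackend() : new QdrantBackend();
    }
    return this.instance;
  }

  // One collection per pipeline: rag_pipelines_{id} unless overridden.
  static collectionName(pipelineId: string): string {
    const prefix = process.env.VDB_COLLECTION_PREFIX ?? "rag_pipelines_";
    return `${prefix}${pipelineId}`;
  }
}
```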
LLM Client Selection (within helvia-rag-pipelines)
NLPProviderService manages client selection per scope:
| Scope | Config Source | Client Classes |
|---|---|---|
| EMBEDDINGS | config.embeddings.providers[] | NLPAPIClientOpenAI, NLPAPIClientAzureOpenAI, NLPAPIClientGoogleGenAI |
| CHAT (text gen) | config.text_generation.providers[] | Same + NLPAPIClientAnyscale |
| QUERY_SUMMARIZATION | config.query_summarization.providers[] | Same as CHAT |
Provider selection uses round-robin rotation across configured providers for each scope. Each provider entry specifies platform (OPENAI, AZURE_OPENAI, GOOGLE_GENAI, ANYSCALE), platform_url, platform_api_key, and model.
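A minimal sketch of that per-scope round-robin rotation (names are illustrative):

```ts
interface LlmProvider {
  platform: "OPENAI" | "AZURE_OPENAI" | "GOOGLE_GENAI" | "ANYSCALE";
  platform_url?: string;
  platform_api_key: string;
  model: string;
}

class ProviderRotation {
  private next = 0;
  constructor(private providers: LlmProvider[]) {
    if (providers.length === 0) throw new Error("no providers configured");
  }
  // Each call returns the next provider in order, wrapping around.
  pick(): LlmProvider {
    const p = this.providers[this.next];
    this.next = (this.next + 1) % this.providers.length;
    return p;
  }
}

// One rotation per scope (EMBEDDINGS, CHAT, QUERY_SUMMARIZATION):
const chatRotation = new ProviderRotation([
  { platform: "OPENAI", platform_api_key: "sk-example", model: "gpt-4o" },
  { platform: "AZURE_OPENAI", platform_url: "https://example.openai.azure.com", platform_api_key: "example", model: "gpt-4o" },
]);
```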
Translation
TranslationService supports multiple clients:
- TranslationAPIClientGoogle: Google Cloud Translate v3
- TranslationAPIClientAzure: Azure Translator
- TranslationAPIClientAzureOpenAI: Azure OpenAI (LLM-based translation)
- TranslationAPIClientOpenAI: OpenAI (LLM-based translation)
Provider selected per scope (query, response, corpus) from config.translation.*_providers[].
Semantic Cache (SemCache)
SemCache is triggered only when ALL of the following conditions are met:
- sem_cache.enabled = true in the pipeline config
- Text generation is enabled
- OpenAI embeddings are used
- Chat history max_messages == 0 (single-turn conversations only)
When triggered, SemCacheService checks the Helvia SemCache service for a semantically similar previous query. On cache hit, the cached response is returned. On miss, the generated response is stored for future queries.
Cache configuration is auto-initialized per pipeline: if cache_uuid or api_key is missing, the service creates them via the SemCache API.
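The four gating conditions condensed into one predicate. Field names follow the configuration tree above, but the helper itself is hypothetical.

```ts
interface SemCacheGateInput {
  semCacheEnabled: boolean;       // sem_cache.enabled
  textGenerationEnabled: boolean; // text generation configured and on
  embeddingsPlatform: string;     // platform of the embeddings provider
  chatHistoryMaxMessages: number; // chat_history.max_messages
}

function semCacheEligible(c: SemCacheGateInput): boolean {
  return (
    c.semCacheEnabled &&
    c.textGenerationEnabled &&
    c.embeddingsPlatform === "OPENAI" &&
    c.chatHistoryMaxMessages === 0 // single-turn conversations only
  );
}
```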
Embedding Cache
Optional cache for embedding vectors. Configured via CACHE_MODE: memory (default), redis, or redis_async.
Cache key: (input_hash, provider, model, dimensions). TTL: CACHE_EXPIRATION_TIME (default 3600s).
Avoids re-computing embeddings for previously seen queries during both indexing and search.
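A toy in-memory variant of this cache to make the key and TTL concrete. The real service also supports redis and redis_async via CACHE_MODE.

```ts
import { createHash } from "node:crypto";

type CacheEntry = { vector: number[]; expiresAt: number };
const cache = new Map<string, CacheEntry>();
const TTL_MS = 3600 * 1000; // CACHE_EXPIRATION_TIME default (3600 s)

// Key on (input_hash, provider, model, dimensions), as described above.
function cacheKey(input: string, provider: string, model: string, dims: number): string {
  const inputHash = createHash("sha256").update(input).digest("hex");
  return `${inputHash}:${provider}:${model}:${dims}`;
}

async function embedWithCache(
  input: string, provider: string, model: string, dims: number,
  embed: (text: string) => Promise<number[]>, // the actual provider call
): Promise<number[]> {
  const key = cacheKey(input, provider, model, dims);
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.vector; // cache hit
  const vector = await embed(input);
  cache.set(key, { vector, expiresAt: Date.now() + TTL_MS });
  return vector;
}
```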
Confidence Calculation
When return_confidence = true in pipeline config:
- summary_confidence: extracted from OpenAI logprobs during query summarization
- process_confidence: extracted from OpenAI logprobs during text generation
- Final confidence = summary_confidence x process_confidence (see the sketch below)
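A sketch of the arithmetic. Converting token logprobs to a probability via exp(mean logprob) is an assumption; the source only states that each factor comes from OpenAI logprobs and that the two factors multiply.

```ts
// Assumption: each factor is the geometric mean of per-token probabilities.
function confidenceFromLogprobs(tokenLogprobs: number[]): number {
  const mean = tokenLogprobs.reduce((a, b) => a + b, 0) / tokenLogprobs.length;
  return Math.exp(mean);
}

const summaryConfidence = confidenceFromLogprobs([-0.02, -0.11, -0.05]);
const processConfidence = confidenceFromLogprobs([-0.04, -0.01]);
// The documented final value: the two factors multiplied together.
const confidence = summaryConfidence * processConfidence;
```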
Document Ingestion / Training Pipeline
Training Trigger
POST /pipelines/{pipelineId}:train (called by hbf-nlp after corpus update)
Corpus Update (hbf-nlp side)
hbf-nlp transforms training content into corpus items before sending to helvia-rag-pipelines:
- Fetch activities (KNOWLEDGE_BASE + AUTOMATED_ANSWERS types) from hbf-core
- For each activity, extract intent content and/or KB article content
- Build
HelviaCorpusItem[]:{ id, title, group, body, training_text, type: "INTENT"|"ARTICLE", tags, language } PUT /pipelines/{pipelineId}/corpussends the full corpus (diff applied server-side)
Indexing (helvia-rag-pipelines side)
- PipelineService.train() sets status to TRAINING
- _index_corpus() fetches corpus items where need_training=True (or all if force_reindex=True)
- For each item, SemanticSearchService generates an embedding via the configured embedding provider
- Embeddings are upserted into the vector DB collection via VectorDbManager
- Status set to READY, last_trained_at updated, corpus items marked trained
Corpus Diff Logic
PUT /pipelines/{pipelineId}/corpus:
- Compares incoming items against the MySQL corpus by (id, pipeline_id)
- Inserts new items, updates changed items, deletes removed items
- Changed/new items get need_training=True
- If the corpus language differs from the pipeline's native language, items are translated before storage
- Pipeline status set to OUTDATED (requires re-training); see the diff sketch below
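A sketch of the three-way diff keyed by item id within one pipeline. Types and change detection are simplified; the real service compares more fields and persists each set to MySQL.

```ts
interface CorpusItem { id: string; body: string; need_training?: boolean }

function diffCorpus(incoming: CorpusItem[], existing: CorpusItem[]) {
  const before = new Map(existing.map((i) => [i.id, i]));
  const after = new Map(incoming.map((i) => [i.id, i]));

  const inserts = incoming.filter((i) => !before.has(i.id));
  const updates = incoming.filter((i) => {
    const old = before.get(i.id);
    return old !== undefined && old.body !== i.body; // content changed
  });
  const deletes = existing.filter((i) => !after.has(i.id));

  // New and changed items must be re-embedded on the next :train call.
  for (const item of [...inserts, ...updates]) item.need_training = true;
  return { inserts, updates, deletes }; // pipeline status then set to OUTDATED
}
```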
LLM Provider Architecture (hbf-nlp)
Separate from NLP pipeline providers. Used for session analysis, language detection, and direct LLM requests.
Provider Registration
| Alias | Class | API | Version |
|---|---|---|---|
| OPEN_AI_LLM | OpenAIProvider | OpenAI Chat Completions | v1 |
| AZURE_AI_LLM | AzureProvider | Azure OpenAI Chat Completions | 2025-03-01-preview |
| GEMINI_AI_LLM | GeminiProvider | Google Gemini (@google/genai) | v1beta |
Integration Config
Each tenant has LLM integrations stored in hbf-core:
| Integration Type | Class | Key Fields |
|---|---|---|
| OPEN_AI_LLM | OpenAIIntegration | apiKey, project, openAIOrganization, customUrl |
| AZURE_AI_LLM | AzureAIIntegration | endpoint, apiKey, deploymentName (deprecated), apiVersion (deprecated) |
| GEMINI_AI_LLM | GeminiAIIntegration | apiKey |
LLM Endpoints
| Endpoint | Purpose | Auth |
|---|---|---|
| POST /llm-request | Parameterized LLM completion (structured data extraction) | JWT (bot only) |
| POST /detect-language | Language detection | JWT (bot only) |
| POST /organizations/:orgId/tenants/:tenantId/sessions/:sessionId/analyze | Session summary (sentiment, urgency, categories) | Bearer |
| GET /organizations/:orgId/providers/:alias/version | Get provider API version | Bearer |
| GET /organizations/:orgId/categories/:category/prompt | Get plugin prompt template | Bearer |
Session Analysis Flow
- LlmController.analyzeChatSession called with orgId/tenantId/sessionId
- LlmService fetches the tenant's LLM_SUMMARIZATION plugin and its integration config
- Resolves the LlmProvider (OpenAI/Azure/Gemini) via ResolverService
- Builds the session transcript, constructs the prompt, calls provider.analyze()
- Result (summary, classification tags, sentiment) written back to hbf-core
Embedding Models
| Purpose | Configured In | Default Model | Providers |
|---|---|---|---|
| Document indexing | PipelineConfiguration.embeddings | text-embedding-3-small | OpenAI, Azure OpenAI, Google GenAI |
| Query embedding | Same as indexing (shared config) | Same | Same |
| Text generation | PipelineConfiguration.text_generation | gpt-4o | OpenAI, Azure OpenAI, Google Gemini, Anyscale |
| Query summarization | PipelineConfiguration.query_summarization | (per config) | Same as text generation |
Response Format
hbf-nlp process response (NLPProcessResponseDto)
{
intent: string, // matched intent name
entities: Entity[], // extracted entities
response: string, // generated text response
pipelineResults: {
query: string,
queryCategory: "Matched"|"Smalltalk"|"Generic"|"Missed",
matchedCorpus: { id: string, confidence: number },
examinedCorpus: [{ id: string, confidence: number }],
generatedText: string,
extractedParameters: [{ key: string, value: string }],
languageCode: string,
detectedLanguage: string
}
}
helvia-rag-pipelines process response
{
"answer": "Generated answer text",
"sources": [{ "title": "Article Title", "score": 0.92 }],
"confidence": 0.95,
"query_category": "Matched",
"examined_corpus": [{ "id": "...", "title": "...", "score": 0.85 }],
"matched_corpus": { "id": "...", "title": "...", "score": 0.92 },
"generated_text": "Generated answer text",
"extracted_parameters": [{ "key": "param1", "value": "value1" }]
}
Enums Reference
NLPType (pipeline type discriminator)
| Value | Description | Active Provider? |
|---|---|---|
| LUIS_NLP | Microsoft LUIS | No (legacy) |
| OPENAI_NLP | OpenAI completions | No (legacy) |
| HELVIA_NLP_SPECIFICATION | Custom NLP service | Yes |
| HELVIA_RAG_PIPELINE | RAG pipeline | Yes (primary) |
| DIALOGFLOW_NLP | Google Dialogflow | Yes |
| HELVIA_GPT | Helvia GPT pipeline | No (legacy) |
NLPStatus (pipeline lifecycle)
| Value | Description |
|---|---|
| CREATED | Newly created, never trained |
| OUTDATED | Corpus changed, needs retrain |
| TRAINING | Training in progress |
| FAILED | Training failed (see failedReason) |
| READY | Trained and serving predictions |
| INITIALIZING | Being initialized |
QueryCategories (NLP result classification)
| Value | Description |
|---|---|
| Matched | Intent confidently matched |
| Smalltalk | Detected as small talk |
| Generic | Generic/broad query |
| Missed | No intent matched above threshold |
NLPPlatform (helvia-rag-pipelines LLM providers)
| Value | Client Class |
|---|---|
| OPENAI | NLPAPIClientOpenAI |
| AZURE_OPENAI | NLPAPIClientAzureOpenAI |
| GOOGLE_GENAI | NLPAPIClientGoogleGenAI |
| ANYSCALE | NLPAPIClientAnyscale |
NLPPlatformScope (helvia-rag-pipelines provider scopes)
| Value | Used For |
|---|---|
| EMBEDDINGS | Document/query embedding |
| CHAT | Text generation |
| QUERY_SUMMARIZATION | Query rewriting with history context |
Semantic Document Segmenter
Standalone FastAPI (Python) microservice that converts documents into structured, labeled segments for downstream consumption by helvia-rag-pipelines (corpus items) or other consumers.
Source: packages/semantic-doc-segmenter/
Processing Pipeline
LLM Usage
Three LLM tasks, all via OpenAI or Azure OpenAI (selected by the LLM_BACKEND env var), plus Gemini for vision-based PDF parsing.
| Task | Config Var | Default Model | Purpose |
|---|---|---|---|
| Heading extraction | *_LLM_MODEL_FOR_HEADINGS | gpt-5.2 | Identify section boundaries in plain text |
| Article tagging | *_LLM_MODEL_FOR_TAGGER | gpt-5.2 | Classify segments into predefined tags |
| Title extraction | *_LLM_MODEL_FOR_TAGGER | gpt-5.2 | Generate title for untitled segments |
| PDF vision parsing | GEMINI_MODEL | gemini-3.1-pro-preview | Convert complex PDFs via vision model |
All OpenAI/Azure calls use temperature=0, top_p=1. Token counting via tiktoken (o200k_base for gpt-4o/gpt-5, cl100k_base for others).
Heading Extraction Prompt
System: identify section headings (1-10 words) using visual/structural cues. Multi-line headings joined with <br/>. Excludes table of contents. Outputs markdown heading format.
Input text is chunked by tokens with 100-token overlap to handle boundary headings. Headings are injected into source text via normalized word matching (unidecode).
Tagging Prompt
System: assign up to N topics from a provided list. Falls back to "Other" if no match. Output: YAML list. max_tokens=50.
Gemini Prompts
Located in app/prompts/:
- gemini_text_only.txt: for text-heavy PDFs
- gemini_text_and_images.txt: for PDFs with images (overlays [IMG] markers on image locations before sending)
Segmentation Algorithm
generate_articles() in app/services/markdown_service.py:
- Parse markdown into a MarkdownNode tree (hierarchy by # level)
- Traverse the tree depth-first
- If subtree content <= max_size: keep as a single article
- If a node body > max_size: split into numbered parts
- Articles include the parent heading as a prefix for context
- Size controlled by SEGMENTER_MAX_ARTICLE_SIZE (default 2000) and SEGMENTER_SIZE_UNITS (words, chars, non_ws_chars, tokens); a simplified sketch follows this list
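A simplified TypeScript sketch of the algorithm. The real generate_articles() is Python and supports multiple size units; this version measures characters only.

```ts
interface MarkdownNode { heading: string; body: string; children: MarkdownNode[] }
interface Article { title: string; body: string }

function subtreeSize(n: MarkdownNode): number {
  return n.body.length + n.children.reduce((s, c) => s + subtreeSize(c), 0);
}

function generateArticles(node: MarkdownNode, maxSize = 2000, prefix = ""): Article[] {
  // Parent heading carried as a context prefix, per the list above.
  const title = prefix ? `${prefix} / ${node.heading}` : node.heading;

  // Whole subtree fits: keep it as a single article.
  if (subtreeSize(node) <= maxSize) {
    const flatten = (n: MarkdownNode): string =>
      [n.body, ...n.children.map(flatten)].join("\n");
    return [{ title, body: flatten(node) }];
  }

  const articles: Article[] = [];
  if (node.body.length > maxSize) {
    // Oversized body: split into numbered parts.
    for (let i = 0; i * maxSize < node.body.length; i++) {
      articles.push({
        title: `${title} (part ${i + 1})`,
        body: node.body.slice(i * maxSize, (i + 1) * maxSize),
      });
    }
  } else if (node.body.trim()) {
    articles.push({ title, body: node.body });
  }
  // Recurse depth-first into children.
  for (const child of node.children) {
    articles.push(...generateArticles(child, maxSize, title));
  }
  return articles;
}
```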
API Endpoints
| Method | Path | Purpose | Notes |
|---|---|---|---|
| POST | /jobs | Submit document for processing | multipart/form-data with file + options |
| GET | /jobs | List all jobs | |
| GET | /jobs/{id} | Get job status | status, progress %, processing_stage |
| PATCH | /jobs/{id} | Cancel a job | { "status": "CANCELED" } |
| DELETE | /jobs/{id} | Delete job + document + segments | |
| GET | /documents | List documents | |
| GET | /documents/{id} | Get document metadata | |
| GET | /documents/{id}/segments | Get segments | id, title, body, lang, tags, pagenr |
Auth: JWT (HS256) with role: admin.
Job Processing Options
| Option | Default | Description |
|---|---|---|
| maxsize | 2000 | Max segment size (in configured units) |
| usetags | [] | Tag list for LLM tagging |
| maxtags | 1 | Max tags per segment |
| callbackurl | — | POST results when done |
| process_images | per config | Extract images from documents |
| pdf_parsing_backend | pymupdf | pymupdf, docling, gemini-3 |
| enable_ocr | false | Enable OCR (docling only) |
| ocr_backend | tesseract | tesseract (Greek support), easyocr |
| force_full_page_ocr | false | OCR entire pages vs detected regions |
Language Detection
Cascade:
- Google Cloud Translate (if USE_GOOGLE_LANGUAGE_DETECTION=true, 10s timeout)
- fast-langdetect (pre-initialized, top 500 chars, "lite" model)
- Fallback to SYSTEM_DEFAULT_LANGUAGE (en); the cascade is sketched below
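The cascade condensed into one function. Both detector calls are toy stand-ins for Google Cloud Translate and fast-langdetect, which the real Python service uses.

```ts
// Stand-in for Google Cloud Translate v3 language detection (hypothetical).
async function googleDetect(text: string): Promise<string | null> {
  return null;
}
// Stand-in for fast-langdetect (toy heuristic, not the real model).
function fastLangDetect(text: string): string | null {
  return /[\u0370-\u03ff]/.test(text) ? "el" : "en";
}

async function detectLanguage(text: string): Promise<string> {
  if (process.env.USE_GOOGLE_LANGUAGE_DETECTION === "true") {
    try {
      // External call raced against the documented 10 s timeout.
      const result = await Promise.race([
        googleDetect(text),
        new Promise<null>((resolve) => setTimeout(() => resolve(null), 10_000)),
      ]);
      if (result) return result;
    } catch { /* fall through to the local detector */ }
  }
  // Local model sees only the first 500 characters.
  const local = fastLangDetect(text.slice(0, 500));
  return local ?? (process.env.SYSTEM_DEFAULT_LANGUAGE ?? "en");
}
```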
Data Model
MySQL with Alembic migrations. Three tables:
- document: id, filename, filesize, mimetype, doctype, body (LONGBLOB), status, language
- job: id, document_id, status (NEW/PENDING/PROCESSING/COMPLETED/FAILED/CANCELED), progress, processing_stage, error, options
- documentsegment: id, document_id, ordinal, title, body, lang, pagenr, tags (JSON), status
Integration with helvia-rag-pipelines
semantic-doc-segmenter produces segments that become corpus items in RAG pipelines. The integration flow:
- Document uploaded to semantic-doc-segmenter
- Segments generated (title, body, tags, language)
- Callback or polling delivers segments to the consumer
- Consumer formats segments as HelviaCorpusItem[] and sends them to PUT /pipelines/{id}/corpus in helvia-rag-pipelines
- helvia-rag-pipelines indexes the segments into the vector DB for semantic search
Key Configuration
| Variable | Default | Purpose |
|---|---|---|
| LLM_BACKEND | openai | openai or azure |
| OPENAI_API_KEY | — | OpenAI API key |
| AZURE_LLM_ENDPOINT | — | Azure OpenAI endpoint |
| GEMINI_API_KEY | — | Google Gemini key |
| GEMINI_MODEL | gemini-3.1-pro-preview | Gemini model for PDF vision |
| PDF_PARSING_BACKEND | pymupdf | Default PDF parser |
| SEGMENTER_MAX_ARTICLE_SIZE | 2000 | Max segment size |
| SEGMENTER_SIZE_UNITS | non_ws_chars | Size unit (words/chars/non_ws_chars/tokens) |
| BACKGROUND_TASK_LIMIT | 2 | Max concurrent jobs |
| JOB_TIMEOUT_SECONDS | 600 | Cooperative cancellation timeout |
| ENABLE_IMAGE_HANDLING | false | Extract/store images |
| IMAGE_HANDLING_MODE | s3 | s3 or tmp_file |
Key Files
| Path | Purpose |
|---|---|
| app/services/doc_processing_service.py | Main pipeline orchestration |
| app/services/doc_converter_service.py | Format-specific converters |
| app/services/markdown_service.py | Heading extraction, tree parsing, chunking |
| app/services/llm_service.py | OpenAI/Azure LLM calls |
| app/services/gemini_service.py | Google Gemini API calls |
| app/services/job_execution_manager.py | Worker pool management |
| app/parsers/PyMuPDFParser.py | PyMuPDF PDF parser |
| app/parsers/DoclingPDFParser.py | Docling PDF parser with OCR |
| app/config/config.py | All env var configuration |
open-bot-framework: DirectLine Channel Gateway
Note: open-bot-framework is planned as a future self-hosted replacement for Azure DirectLine but is not yet in use. Currently, hbf-webchat connects to hbf-bot via Azure DirectLine (Microsoft's botframework-directline, bundled in botframework-webchat).
Once adopted, open-bot-framework will be the channel entry point for development and self-hosted webchat deployments. It implements the Microsoft Bot Framework DirectLine 3.0 protocol and relays user messages to hbf-bot (or any registered HTTP bot endpoint), which then invokes hbf-nlp for NLU processing.
Source: packages/open-bot-framework/
Full architecture: packages/open-bot-framework/docs/architecture.md
Role in the NLU Pipeline (planned)
open-bot-framework does not perform NLU or LLM work itself. Its planned pipeline role is:
- Receive the user message from the webchat client (REST POST /v3/directline/conversations/:id/activities)
- Validate the DirectLine JWT, enrich the activity (id, timestamp, serviceUrl, conversation)
- HTTP POST the enriched activity to the registered bot endpoint (e.g. hbf-bot)
- hbf-bot receives the activity and calls hbf-nlp for intent/entity extraction
- Bot reply arrives at open-bot-framework via POST /v3/conversations/:id/activities/:actId
- Reply is broadcast to the webchat client over WebSocket
When deployed, the hbf-webchat widget would use open-bot-framework as its DirectLine backend, calling it at runtime for tokens and activity delivery. open-bot-framework is intended to replace Azure's hosted DirectLine service, not hbf-webchat itself: both would run together, the widget as the browser-side client and OBF as the server-side gateway. Currently, hbf-webchat uses Azure DirectLine (botframework-directline) for this role.
Message Flow (planned)
Authentication Model
Two token types, both HMAC-SHA256 JWTs signed by open-bot-framework itself:
| Token | Issued By | Used For | Key Payload Fields |
|---|---|---|---|
| DirectLine token | DirectlineTokenService | Client-to-gateway | bot, site, conv, user |
| OAuth2 access token | AuthorizationService | Bot-to-gateway | aud, iss, sub (clientId) |
DirectLine token is obtained by the client via POST /v3/directline/tokens/generate using a webchat site secret (<siteId>.<hmac>). The bot obtains an OAuth2 access token via POST /oauth2/v2.0/token (client credentials grant) using an OpenBotSecret credential.
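Hypothetical client-side calls for both token flows. The endpoint paths come from the text above; request and response shapes are assumptions based on the DirectLine 3.0 protocol and the client-credentials grant.

```ts
const GATEWAY = "https://obf.example.com"; // hypothetical gateway host

// 1) Webchat client exchanges a site secret for a DirectLine token.
async function generateDirectLineToken(siteSecret: string): Promise<string> {
  const res = await fetch(`${GATEWAY}/v3/directline/tokens/generate`, {
    method: "POST",
    headers: { Authorization: `Bearer ${siteSecret}` }, // <siteId>.<hmac>
  });
  const body = (await res.json()) as { token: string };
  return body.token;
}

// 2) Bot exchanges client credentials for an OAuth2 access token.
async function getBotAccessToken(clientId: string, clientSecret: string): Promise<string> {
  const res = await fetch(`${GATEWAY}/oauth2/v2.0/token`, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "client_credentials",
      client_id: clientId,
      client_secret: clientSecret, // the OpenBotSecret plain value
    }),
  });
  const body = (await res.json()) as { access_token: string };
  return body.access_token;
}
```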
Data Model
| Entity | Table | Key Fields | Notes |
|---|---|---|---|
| OpenBot | open_bot | id (uuid), handle (unique), endpoint, schemaVersion | Registered bot. endpoint is the HTTP URL open-bot-framework POSTs activities to. |
| OpenBotSecret | open_bot_secret | id, secretHash (SHA-256), plainReducted, expiresAt | Bot API credential. Secret shown once on creation; only hash stored. |
| WebChatChannel | web_chat_channel | id, name, secret1, secret2 | Webchat site config. Two rotating secrets per channel. Belongs to OpenBot. |
Activity ID Convention
Activity IDs follow the DirectLine spec: <conversationId>|<7-digit-zero-padded-counter> (e.g. abc123|0000003).
typing activities get random IDs and do not increment the watermark counter.
The counter is backed by Redis (production) or in-memory (development/fallback), selected by ATOMIC_OPERATIONS_IMPLEMENTATION env var.
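A sketch of ID generation with an in-memory counter (production uses Redis). The random-ID format shown for typing activities is an assumption; the spec-mandated part is the 7-digit zero-padded counter for regular activities.

```ts
import { randomUUID } from "node:crypto";

const counters = new Map<string, number>(); // conversationId -> watermark

function nextActivityId(conversationId: string, activityType: string): string {
  if (activityType === "typing") {
    // Random ID, no watermark increment (format assumed).
    return `${conversationId}|${randomUUID()}`;
  }
  const next = (counters.get(conversationId) ?? 0) + 1;
  counters.set(conversationId, next);
  return `${conversationId}|${String(next).padStart(7, "0")}`;
}

// nextActivityId("abc123", "message") -> "abc123|0000001"
// nextActivityId("abc123", "message") -> "abc123|0000002"
```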
File Attachments
POST /v3/directline/conversations/:id/upload accepts multipart with an activity JSON part and one or more file parts. Files are uploaded to S3-compatible storage (MinIO, AWS S3, etc.) and the resulting URLs are written into activity.attachments[].contentUrl before forwarding to the bot.
Key Configuration
| Variable | Purpose |
|---|---|
| JWT_SECRET | HMAC secret for signing DirectLine and access tokens |
| JWT_EXPIRATION_SECONDS | Token TTL (default 3600) |
| DIRECTLINE_HOST | Public hostname of this gateway (embedded in token iss/aud) |
| DIRECTLINE_SOCKET_URL | Base URL for WebSocket stream URLs |
| DIRECTLINE_REGION | Region suffix appended to generated conversation IDs |
| SOCKET_PORT | WebSocket server port (default 1992) |
| REDIS_URI | Redis connection for atomic watermark counter |
| ATOMIC_OPERATIONS_IMPLEMENTATION | redis (default) or memory (single-instance fallback) |
| STORAGE_BUCKET / STORAGE_ENDPOINT | S3-compatible storage for file attachments |
Key Files
| Path | Purpose |
|---|---|
| src/features/directline/directline.controller.ts | User-facing DirectLine endpoints |
| src/features/directline/directline-alt.controller.ts | Bot reply endpoint (POST /v3/conversations/.../activities/:actId) |
| src/features/directline/directline-conversation.service.ts | Conversation lifecycle, activity enrichment, bot HTTP forward, WebSocket dispatch |
| src/features/directline/dirtectline-token.service.ts | Token generate/refresh/verify (JWT) |
| src/features/directline/directline.gateway.ts | Raw WebSocket server, per-conversation socket map |
| src/features/authorization/authorization.controller.ts | POST /oauth2/v2.0/token (bot client credentials) |
| src/features/authorization/authorization.service.ts | OAuth2 access token generation and verification |
| src/features/openbot/openbot.service.ts | Bot CRUD and cached handle lookup |
| src/features/openbotsecret/openbotsecret.service.ts | Secret hashing (SHA-256) and validation |
| src/features/atomicity/atomic-operations.provider.ts | Redis vs memory backend selection |
| src/features/storage/storage.service.ts | S3-compatible file upload for attachments |
NLP and Language Detection Routing (by Tenant Type)
Two tenant types with fundamentally different NLP paths in hbf-bot:
Classic tenants (isAgent=false):
- Intent detection: hbf-core NLP Process endpoint (via ExternalNLU; nlpService defaults to HBF_CORE)
- Language detection: piggybacks on hbf-core's NLP Process (detectLanguage=true flag), enabled by botDeployment.settings.automaticLanguageDetectionEnabled
- hbf-nlp /detect-language is NOT available (requires isAgent=true)
- Traditional NLU flow: BaseNLU -> ExternalNLU.process() -> hbf-core -> hbf-nlp
Modern/agent tenants (isAgent=true):
- Intent detection: skipped entirely. AgenticBotHandler takes over in ConversationFlowStep and routes to DEFAULT_NODE instead of running traditional NLU.
- Language detection: hbf-nlp /detect-language directly (via LanguageDetectionHandler), enabled by the LLM_LANGUAGE_DETECTION plugin + isAgent=true
- tenant.systemSettings.nlpService can be HBF_NLP or HBF_CORE but is largely irrelevant since AgenticBotHandler bypasses BaseNLU
Key files: LanguageDetectionHandler.ts (isAgent gate), SetLanguageStep.ts (detection orchestration), ConversationFlowStep.ts (AgenticBotHandler branch), ExternalNlu.ts (classic NLU routing), BaseNlu.ts (entry point).