NLU / LLM Pipeline
How the Platform processes a user message through NLP, LLM, and RAG.
Services: hbf-core (config store), hbf-nlp (orchestration), helvia-rag-pipelines (RAG engine), semantic-doc-segmenter (document segmentation), open-bot-framework (DirectLine channel gateway; planned, not yet in use).
Last updated: 2026-03-13
Pipeline Configuration Object
Stored in hbf-core MongoDB collection nlp-pipelines. Polymorphic: base class NLPPipeline with 6 subtypes selected by @JsonTypeInfo discriminator on the type field.
Source: hbf-core/src/main/java/gr/helvia/hbf/core/domain/NLPPipeline.kt
TypeScript consumer: hbf-core-api/src/datamodel/nlp.ts
Full schema: docs/domain-model/nlp-pipeline.md
Base Fields (all subtypes)
| Field | Type | Notes |
|---|---|---|
| id | String | PK |
| name | String | @NotNull |
| type | NLPType | Discriminator (see Enums below) |
| language | String | Primary language. @NotNull on create |
| secondaryLanguages | Set<LanguageCode> | Optional additional languages |
| status | NLPStatus | CREATED, OUTDATED, TRAINING, FAILED, READY, INITIALIZING |
| predictionConfidenceThreshold | Double | Min confidence for intent match. Range (0, 1] |
| includeTrainingTags / excludeTrainingTags | List<String> | Filter KB articles for training |
| organization | Organization | @DBRef lazy |
| tenant | Tenant | @DBRef lazy |
| nlpService | NLPService | HBF_CORE or HBF_NLP (which service executes the pipeline) |
| lastTrainedAt | Date | |
| failedReason | String | Error message when status=FAILED |
Subtypes
| NLPType | Class | Key Extra Fields |
|---|---|---|
| LUIS_NLP | LuisNLP | appId, appName, authoringKey, appVersion, host, predictionHost, predictionKey |
| OPENAI_NLP | OpenAINLP | model, apiKey, temperature, maxTokens, trainingType (ZERO_SHOT/ONE_SHOT/FEW_SHOT/CUSTOM_PROMPT), modelCategory (COMPLETION/CHAT), prompt: OpenAIPrompt |
| DIALOGFLOW_NLP | DialogflowNLP | projectId, privateKey, clientEmail, region (DialogFlowRegion enum), trainingOperationName |
| HELVIA_NLP_SPECIFICATION | HelviaNLPSpecification | serviceUrl, bearerToken |
| HELVIA_GPT | HelviaGPT (extends HelviaNLPSpecification) | pipelineId |
| HELVIA_RAG_PIPELINE | HelviaRAGPipeline (extends HelviaNLPSpecification) | pipelineId, settings: RAGPipelineSettings |
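For orientation, a hypothetical TypeScript sketch of how a consumer can model this polymorphism as a discriminated union on type. The shapes follow the tables above; the actual hbf-core-api/src/datamodel/nlp.ts may differ.

```ts
// Sketch only: field names taken from the tables above, two subtypes shown.
type NLPStatus =
  | "CREATED" | "OUTDATED" | "TRAINING" | "FAILED" | "READY" | "INITIALIZING";

interface NLPPipelineBase {
  id: string;
  name: string;
  language: string;
  secondaryLanguages?: string[];
  status: NLPStatus;
  predictionConfidenceThreshold?: number; // range (0, 1]
  nlpService?: "HBF_CORE" | "HBF_NLP";
}

interface HelviaRAGPipeline extends NLPPipelineBase {
  type: "HELVIA_RAG_PIPELINE"; // the @JsonTypeInfo discriminator
  serviceUrl: string;
  bearerToken: string;
  pipelineId: string;
  settings?: { includeHistory: boolean; maxHistoryTurns: number };
}

interface DialogflowNLP extends NLPPipelineBase {
  type: "DIALOGFLOW_NLP";
  projectId: string;
  privateKey: string;
  clientEmail: string;
  region: string;
}

// The type field lets callers narrow safely:
function describe(p: HelviaRAGPipeline | DialogflowNLP): string {
  return p.type === "HELVIA_RAG_PIPELINE"
    ? `RAG pipeline ${p.pipelineId} at ${p.serviceUrl}`
    : `Dialogflow project ${p.projectId}`;
}
```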
RAGPipelineSettings (hbf-core side)
| Field | Type | Default | Notes |
|---|---|---|---|
| includeHistory | Boolean | false | Send chat history to RAG pipeline |
| maxHistoryTurns | Int | 4 | Range 1-30 |
RAG Pipeline Configuration (helvia-rag-pipelines side)
Stored as JSON blob in helvia-rag-pipelines MySQL pipelines.configuration_json, deserialized to PipelineConfiguration Pydantic model.
Source: helvia-rag-pipelines/app/schemas/pipeline_configuration_schemas.py
PipelineConfiguration
├── general_settings
│ article_format -- template with {{title}}, {{group}}, {{body}}, {{tags}}
│ corpus_language -- KB content language
│ native_languages[] -- languages supported without translation
│ default_native_language -- fallback language
│ return_confidence -- include confidence in response
│
├── embeddings
│ model -- e.g. "text-embedding-3-small"
│ providers[] -- LlmProvider (platform, url, apiKey, model)
│
├── semantic_search
│ enabled -- default true
│ max_results -- default 7
│ max_input_tokens -- default 0 (unlimited)
│ exact_match -- default true
│ visit_neighbors -- default 128
│ normalize_user_input -- default true
│ normalize_corpus -- default false
│
├── text_generation
│ providers[] -- LlmProvider
│ prompt -- system prompt for generation
│ max_tokens -- default 500
│ temperature -- default 0.0
│ parser -- { type, regex } for structured output (JSON or REGEX)
│ hide_urls -- default true (replaces URLs with UUIDs before LLM call)
│
├── chat_history
│ enabled -- default false
│ max_messages -- default 8
│
├── query_summarization
│ enabled -- default false
│ providers[] -- LlmProvider
│ prompt -- summarization prompt
│ max_tokens -- default 300
│ temperature -- default 0.0
│ use_summary_in_sem_search -- default true
│ use_summary_in_generation -- default false
│ skip_history_in_inference -- default false
│
├── translation
│ enabled -- default false
│ query_translation_providers[]
│ response_translation_providers[]
│ corpus_translation_providers[]
│ hide_urls -- default false
│
└── sem_cache
(configured via Helvia SemCache service)
Each LlmProvider entry contains: platform (OPENAI, AZURE_OPENAI), platform_url, platform_api_key, model, seed, prompt.
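For concreteness, an illustrative configuration_json fragment matching the tree above. Every value here is a made-up example, not a code default.

```ts
// Illustrative payload only; field shapes follow the configuration tree above.
const examplePipelineConfiguration = {
  general_settings: {
    article_format: "{{title}}\n{{group}}\n{{body}}\nTags: {{tags}}",
    corpus_language: "en",
    native_languages: ["en", "el"],
    default_native_language: "en",
    return_confidence: true,
  },
  embeddings: {
    model: "text-embedding-3-small",
    providers: [
      { platform: "OPENAI", platform_api_key: "sk-example", model: "text-embedding-3-small" },
    ],
  },
  semantic_search: { enabled: true, max_results: 7, normalize_user_input: true },
  text_generation: {
    providers: [
      { platform: "AZURE_OPENAI", platform_url: "https://example.openai.azure.com", platform_api_key: "example", model: "gpt-4o" },
    ],
    prompt: "Answer strictly from the provided articles.",
    max_tokens: 500,
    temperature: 0.0,
    hide_urls: true,
  },
  chat_history: { enabled: true, max_messages: 8 },
};
```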
Tenant-to-Pipeline Binding
A Tenant links to pipelines via two mechanisms:
- nlpMap (Map<String, String>): Simple language-to-pipeline-ID mapping. Checked first.
- nlpTrees (List<NLPPipelineTreeData>): Decision trees with variable-based conditions. Checked as a fallback if nlpMap has no entry for the language.
Pipeline Decision Tree
Stored in MongoDB collection nlp-pipeline-trees. Each tree has:
- nlpDecisionTree: Runtime-compiled decision map
- nlpDecisionTreeSource: Editor-friendly graph with nodes and edges
Node types (NLPNodeType):
- INTRO: Entry point
- PIPELINE: References a pipeline ID
- SEQUENCE: Sequential evaluation
- QUERY: Conditional (LIQE expression evaluated against session variables)
Provider Selection
Provider registration (hbf-nlp ResolverModule):
| Pipeline Type | Provider Class | Client |
|---|---|---|
| HELVIA_RAG_PIPELINE | HelviaRAGPipelinesProvider | HelviaRAGPipelineClient (HTTP) |
| HELVIA_NLP_SPECIFICATION | HelviaNLPSpecificationProvider | HelviaNLPSpecificationPipelineClient (HTTP) |
| DIALOGFLOW_NLP | DialogflowPipelinesProvider | DialogflowPipelineClient (gRPC) |
Note: LUIS_NLP, OPENAI_NLP, and HELVIA_GPT types exist in the enum but have no registered provider in hbf-nlp. They are legacy types.
Processing Sequence (POST /tenants/{tenantId}/process)
Full flow in hbf-nlp/src/nlp/nlp.service.ts:
Step 1: Priority Keyword Pre-processing
Check the user query against tenant.settings.nluLocal.intents (a map of intent name to keyword list).
Uses configurable string similarity: exact match, Jaro-Winkler, or Damerau-Levenshtein, with a similarityThreshold.
If matched, the intent is returned immediately without calling any NLP provider.
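A minimal sketch of this short-circuit, assuming the natural library for Jaro-Winkler similarity. The actual similarity implementation and threshold default in hbf-nlp may differ.

```ts
import natural from "natural";

type LocalIntents = Map<string, string[]>; // intent name -> keyword list

function matchPriorityKeyword(
  query: string,
  intents: LocalIntents,
  similarityThreshold = 0.92, // hypothetical default
): string | null {
  const q = query.trim().toLowerCase();
  for (const [intent, keywords] of intents) {
    for (const kw of keywords) {
      const k = kw.toLowerCase();
      // Exact match, or fuzzy match above the configured threshold.
      if (q === k || natural.JaroWinklerDistance(q, k) >= similarityThreshold) {
        return intent; // short-circuit: no NLP provider is called
      }
    }
  }
  return null;
}
```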
Step 2: Language Resolution
If detectLanguage=true, call LlmService.languageDetectionLegacy():
- Legacy path: hardcoded Azure OpenAI call using AZURE_OPENAI_* env vars
- Modern path: uses the tenant's LLM_LANGUAGE_DETECTION plugin with a configurable provider
Step 3: Pipeline Selection
- Check tenant.nlpMap[resolvedLanguage] for a direct pipeline ID
- If no match, evaluate the tenant.nlpTreeMap[language] decision tree using LIQE expressions against session variables (selection order sketched below)
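The selection order, sketched in TypeScript. evaluateLiqe is a toy stand-in for the LIQE evaluator, and the node shape is simplified to two of the four node types.

```ts
interface TenantNlpConfig {
  nlpMap: Record<string, string>;           // language -> pipeline ID
  nlpTreeMap: Record<string, DecisionNode>; // language -> compiled tree
}

type DecisionNode =
  | { kind: "PIPELINE"; pipelineId: string }
  | { kind: "QUERY"; expression: string; then: DecisionNode; else: DecisionNode };

// Stand-in for the LIQE expression evaluator (hypothetical).
function evaluateLiqe(expr: string, vars: Record<string, unknown>): boolean {
  return Boolean(vars[expr]); // toy: treat the expression as a variable name
}

function selectPipelineId(
  tenant: TenantNlpConfig,
  language: string,
  sessionVars: Record<string, unknown>,
): string | undefined {
  // nlpMap is checked first.
  const direct = tenant.nlpMap[language];
  if (direct) return direct;

  // Fall back to walking the decision tree for this language.
  let node: DecisionNode | undefined = tenant.nlpTreeMap[language];
  while (node) {
    if (node.kind === "PIPELINE") return node.pipelineId;
    node = evaluateLiqe(node.expression, sessionVars) ? node.then : node.else;
  }
  return undefined;
}
```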
Step 4: Provider Dispatch
ResolverService.resolve(pipeline.type) returns the registered provider.
Provider calls its downstream client (sketched after this list):
- HELVIA_RAG: POST {serviceUrl}/pipelines/{pipelineId}:process with query, language, session history, parameters
- HELVIA_NLP_SPEC: POST {serviceUrl}:process with query, language, parameters (no history)
- DIALOGFLOW: gRPC detectIntent via the @google-cloud/dialogflow SDK
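A sketch of the resolver-style dispatch. Class and registry names are illustrative, not the actual hbf-nlp ResolverModule code.

```ts
interface NlpProvider {
  process(query: string, language: string): Promise<unknown>;
}

class HelviaRagProvider implements NlpProvider {
  constructor(private serviceUrl: string, private pipelineId: string) {}
  async process(query: string, language: string) {
    // Mirrors the HELVIA_RAG call shape described above.
    const res = await fetch(
      `${this.serviceUrl}/pipelines/${this.pipelineId}:process`,
      {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ query, language }),
      },
    );
    return res.json();
  }
}

// One provider per registered pipeline type.
const registry = new Map<string, NlpProvider>([
  ["HELVIA_RAG_PIPELINE", new HelviaRagProvider("http://rag:8000", "p1")],
  // HELVIA_NLP_SPECIFICATION and DIALOGFLOW_NLP would register similarly.
]);

function resolve(type: string): NlpProvider {
  const provider = registry.get(type);
  // Legacy types (LUIS_NLP, OPENAI_NLP, HELVIA_GPT) land here.
  if (!provider) throw new Error(`No provider registered for ${type}`);
  return provider;
}
```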
Step 5: Metadata Persistence
MessageMetadataService writes to MySQL message_metadata table:
- Processing steps: PRIORITY_KEYWORDS, LANGUAGE_DETECTION, NLP_SYSTEM
- Each step includes: input, output, duration (ms)
RAG Query Flow (POST /pipelines/{pipelineId}:process)
Full flow in helvia-rag-pipelines/app/services/pipeline_service.py:
Vector DB Selection
Configured at startup via VECTOR_DB environment variable:
- qdrant (default): Uses the qdrant-client SDK. Supports API, on-disk, and in-memory backends.
- milvus: Uses the pymilvus SDK. IVF_FLAT index with L2 distance.
VectorDbManager singleton selects the implementation once at boot. One collection per pipeline, named rag_pipelines_{id} (prefix configurable via VDB_COLLECTION_PREFIX).
Key vector DB settings:
- VDB_DIMENSIONS: 1536 (default; matches text-embedding-3-small)
- VDB_BATCH_INSERT_SIZE: 100 items per batch during indexing
- MAX_VECTORS_PER_COLLECTION: 500,000
- Qdrant uses the COSINE distance metric; Milvus uses L2 (backend selection sketched below)
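A sketch of the boot-time singleton selection and collection naming, with the backends stubbed. The real implementations live in Python around qdrant-client and pymilvus.

```ts
interface VectorDb {
  upsert(collection: string, ids: string[], vectors: number[][]): Promise<void>;
}

class QdrantBackend implements VectorDb {
  async upsert() { /* qdrant-client call in the real service */ }
}
class MilvusBackend implements VectorDb {
  async upsert() { /* pymilvus IVF_FLAT/L2 call in the real service */ }
}

class VectorDbManager {
  private static instance: VectorDb | undefined;

  // Selected once at boot from VECTOR_DB; qdrant is the documented default.
  static get(): VectorDb {
    if (!this.instance) {
      this.instance =
        process.env.VECTOR_DB === "milvus" ? new MilvusBackend() : new QdrantBackend();
    }
    return this.instance;
  }

  // One collection per pipeline: rag_pipelines_{id} unless overridden.
  static collectionName(pipelineId: string): string {
    const prefix = process.env.VDB_COLLECTION_PREFIX ?? "rag_pipelines_";
    return `${prefix}${pipelineId}`;
  }
}
```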
LLM Client Selection (within helvia-rag-pipelines)
NLPProviderService manages client selection per scope:
| Scope | Config Source | Client Classes |
|---|---|---|
| EMBEDDINGS | config.embeddings.providers[] | NLPAPIClientOpenAI, NLPAPIClientAzureOpenAI, NLPAPIClientGoogleGenAI |
| CHAT (text gen) | config.text_generation.providers[] | Same + NLPAPIClientAnyscale |
| QUERY_SUMMARIZATION | config.query_summarization.providers[] | Same as CHAT |
Provider selection uses round-robin rotation across configured providers for each scope. Each provider entry specifies platform (OPENAI, AZURE_OPENAI, GOOGLE_GENAI, ANYSCALE), platform_url, platform_api_key, and model.
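A minimal sketch of that per-scope round-robin rotation (names are illustrative):

```ts
interface LlmProvider {
  platform: "OPENAI" | "AZURE_OPENAI" | "GOOGLE_GENAI" | "ANYSCALE";
  platform_url?: string;
  platform_api_key: string;
  model: string;
}

class ProviderRotation {
  private next = 0;
  constructor(private providers: LlmProvider[]) {
    if (providers.length === 0) throw new Error("no providers configured");
  }
  // Each call returns the next provider in order, wrapping around.
  pick(): LlmProvider {
    const p = this.providers[this.next];
    this.next = (this.next + 1) % this.providers.length;
    return p;
  }
}

// One rotation per scope (EMBEDDINGS, CHAT, QUERY_SUMMARIZATION):
const chatRotation = new ProviderRotation([
  { platform: "OPENAI", platform_api_key: "sk-example", model: "gpt-4o" },
  { platform: "AZURE_OPENAI", platform_url: "https://example.openai.azure.com", platform_api_key: "example", model: "gpt-4o" },
]);
```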
Translation
TranslationService supports multiple clients:
- TranslationAPIClientGoogle: Google Cloud Translate v3
- TranslationAPIClientAzure: Azure Translator
- TranslationAPIClientAzureOpenAI: Azure OpenAI (LLM-based translation)
- TranslationAPIClientOpenAI: OpenAI (LLM-based translation)
Provider selected per scope (query, response, corpus) from config.translation.*_providers[].
Semantic Cache (SemCache)
SemCache is triggered only when ALL of the following conditions are met:
- sem_cache.enabled = true in the pipeline config
- Text generation is enabled
- OpenAI embeddings are used
- Chat history max_messages == 0 (single-turn conversations only)
When triggered, SemCacheService checks the Helvia SemCache service for a semantically similar previous query. On cache hit, the cached response is returned. On miss, the generated response is stored for future queries.
Cache configuration is auto-initialized per pipeline: if cache_uuid or api_key is missing, the service creates them via the SemCache API.
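The four gating conditions condensed into one predicate. Field names follow the configuration tree above, but the helper itself is hypothetical.

```ts
interface SemCacheGateInput {
  semCacheEnabled: boolean;       // sem_cache.enabled
  textGenerationEnabled: boolean; // text generation configured and on
  embeddingsPlatform: string;     // platform of the embeddings provider
  chatHistoryMaxMessages: number; // chat_history.max_messages
}

function semCacheEligible(c: SemCacheGateInput): boolean {
  return (
    c.semCacheEnabled &&
    c.textGenerationEnabled &&
    c.embeddingsPlatform === "OPENAI" &&
    c.chatHistoryMaxMessages === 0 // single-turn conversations only
  );
}
```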
Embedding Cache
Optional cache for embedding vectors. Configured via CACHE_MODE: memory (default), redis, or redis_async.
Cache key: (input_hash, provider, model, dimensions). TTL: CACHE_EXPIRATION_TIME (default 3600s).
Avoids re-computing embeddings for previously seen queries during both indexing and search.
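A toy in-memory variant of this cache to make the key and TTL concrete. The real service also supports redis and redis_async via CACHE_MODE.

```ts
import { createHash } from "node:crypto";

type CacheEntry = { vector: number[]; expiresAt: number };
const cache = new Map<string, CacheEntry>();
const TTL_MS = 3600 * 1000; // CACHE_EXPIRATION_TIME default (3600 s)

// Key on (input_hash, provider, model, dimensions), as described above.
function cacheKey(input: string, provider: string, model: string, dims: number): string {
  const inputHash = createHash("sha256").update(input).digest("hex");
  return `${inputHash}:${provider}:${model}:${dims}`;
}

async function embedWithCache(
  input: string, provider: string, model: string, dims: number,
  embed: (text: string) => Promise<number[]>, // the actual provider call
): Promise<number[]> {
  const key = cacheKey(input, provider, model, dims);
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.vector; // cache hit
  const vector = await embed(input);
  cache.set(key, { vector, expiresAt: Date.now() + TTL_MS });
  return vector;
}
```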
Confidence Calculation
When return_confidence = true in pipeline config:
- summary_confidence: extracted from OpenAI logprobs during query summarization
- process_confidence: extracted from OpenAI logprobs during text generation
- Final confidence = summary_confidence x process_confidence (see the sketch below)
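A sketch of the arithmetic. Converting token logprobs to a probability via exp(mean logprob) is an assumption; the source only states that each factor comes from OpenAI logprobs and that the two factors multiply.

```ts
// Assumption: each factor is the geometric mean of per-token probabilities.
function confidenceFromLogprobs(tokenLogprobs: number[]): number {
  const mean = tokenLogprobs.reduce((a, b) => a + b, 0) / tokenLogprobs.length;
  return Math.exp(mean);
}

const summaryConfidence = confidenceFromLogprobs([-0.02, -0.11, -0.05]);
const processConfidence = confidenceFromLogprobs([-0.04, -0.01]);
// The documented final value: the two factors multiplied together.
const confidence = summaryConfidence * processConfidence;
```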
Document Ingestion / Training Pipeline
Training Trigger
POST /pipelines/{pipelineId}:train (called by hbf-nlp after corpus update)
Corpus Update (hbf-nlp side)
hbf-nlp transforms training content into corpus items before sending to helvia-rag-pipelines:
- Fetch activities (KNOWLEDGE_BASE + AUTOMATED_ANSWERS types) from hbf-core
- For each activity, extract intent content and/or KB article content
- Build
HelviaCorpusItem[]:{ id, title, group, body, training_text, type: "INTENT"|"ARTICLE", tags, language } PUT /pipelines/{pipelineId}/corpussends the full corpus (diff applied server-side)
Indexing (helvia-rag-pipelines side)
- PipelineService.train() sets status to TRAINING
- _index_corpus() fetches corpus items where need_training=True (or all if force_reindex=True)
- For each item, SemanticSearchService generates an embedding via the configured embedding provider
- Embeddings are upserted into the vector DB collection via VectorDbManager
- Status set to READY, last_trained_at updated, corpus items marked trained
Corpus Diff Logic
PUT /pipelines/{pipelineId}/corpus:
- Compares incoming items against the MySQL corpus by (id, pipeline_id)
- Inserts new items, updates changed items, deletes removed items
- Changed/new items get need_training=True
- If the corpus language differs from the pipeline's native language, items are translated before storage
- Pipeline status set to OUTDATED (requires re-training); see the diff sketch below
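A sketch of the three-way diff keyed by item id within one pipeline. Types and change detection are simplified; the real service compares more fields and persists each set to MySQL.

```ts
interface CorpusItem { id: string; body: string; need_training?: boolean }

function diffCorpus(incoming: CorpusItem[], existing: CorpusItem[]) {
  const before = new Map(existing.map((i) => [i.id, i]));
  const after = new Map(incoming.map((i) => [i.id, i]));

  const inserts = incoming.filter((i) => !before.has(i.id));
  const updates = incoming.filter((i) => {
    const old = before.get(i.id);
    return old !== undefined && old.body !== i.body; // content changed
  });
  const deletes = existing.filter((i) => !after.has(i.id));

  // New and changed items must be re-embedded on the next :train call.
  for (const item of [...inserts, ...updates]) item.need_training = true;
  return { inserts, updates, deletes }; // pipeline status then set to OUTDATED
}
```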
LLM Provider Architecture (hbf-nlp)
Separate from NLP pipeline providers. Used for session analysis, language detection, and direct LLM requests.
Provider Registration
| Alias | Class | API | Version |
|---|---|---|---|
| OPEN_AI_LLM | OpenAIProvider | OpenAI Chat Completions | v1 |
| AZURE_AI_LLM | AzureProvider | Azure OpenAI Chat Completions | 2025-03-01-preview |
| GEMINI_AI_LLM | GeminiProvider | Google Gemini (@google/genai) | v1beta |
Integration Config
Each tenant has LLM integrations stored in hbf-core:
| Integration Type | Class | Key Fields |
|---|---|---|
| OPEN_AI_LLM | OpenAIIntegration | apiKey, project, openAIOrganization, customUrl |
| AZURE_AI_LLM | AzureAIIntegration | endpoint, apiKey, deploymentName (deprecated), apiVersion (deprecated) |
| GEMINI_AI_LLM | GeminiAIIntegration | apiKey |
LLM Endpoints
| Endpoint | Purpose | Auth |
|---|---|---|
| POST /llm-request | Parameterized LLM completion (structured data extraction) | JWT (bot only) |
| POST /detect-language | Language detection | JWT (bot only) |
| POST /organizations/:orgId/tenants/:tenantId/sessions/:sessionId/analyze | Session summary (sentiment, urgency, categories) | Bearer |
| GET /organizations/:orgId/providers/:alias/version | Get provider API version | Bearer |
| GET /organizations/:orgId/categories/:category/prompt | Get plugin prompt template | Bearer |
Session Analysis Flow
- LlmController.analyzeChatSession called with orgId/tenantId/sessionId
- LlmService fetches the tenant's LLM_SUMMARIZATION plugin and its integration config
- Resolves the LlmProvider (OpenAI/Azure/Gemini) via ResolverService
- Builds the session transcript, constructs the prompt, calls provider.analyze()
- Result (summary, classification tags, sentiment) written back to hbf-core
Embedding Models
| Purpose | Configured In | Default Model | Providers |
|---|---|---|---|
| Document indexing | PipelineConfiguration.embeddings | text-embedding-3-small | OpenAI, Azure OpenAI, Google GenAI |
| Query embedding | Same as indexing (shared config) | Same | Same |
| Text generation | PipelineConfiguration.text_generation | gpt-4o | OpenAI, Azure OpenAI, Google Gemini, Anyscale |
| Query summarization | PipelineConfiguration.query_summarization | (per config) | Same as text generation |
Response Format
hbf-nlp process response (NLPProcessResponseDto)
{
intent: string, // matched intent name
entities: Entity[], // extracted entities
response: string, // generated text response
pipelineResults: {
query: string,
queryCategory: "Matched"|"Smalltalk"|"Generic"|"Missed",
matchedCorpus: { id: string, confidence: number },
examinedCorpus: [{ id: string, confidence: number }],
generatedText: string,
extractedParameters: [{ key: string, value: string }],
languageCode: string,
detectedLanguage: string
}
}
helvia-rag-pipelines process response
{
"answer": "Generated answer text",
"sources": [{ "title": "Article Title", "score": 0.92 }],
"confidence": 0.95,
"query_category": "Matched",
"examined_corpus": [{ "id": "...", "title": "...", "score": 0.85 }],
"matched_corpus": { "id": "...", "title": "...", "score": 0.92 },
"generated_text": "Generated answer text",
"extracted_parameters": [{ "key": "param1", "value": "value1" }]
}
Enums Reference
NLPType (pipeline type discriminator)
| Value | Description | Active Provider? |
|---|---|---|
| LUIS_NLP | Microsoft LUIS | No (legacy) |
| OPENAI_NLP | OpenAI completions | No (legacy) |
| HELVIA_NLP_SPECIFICATION | Custom NLP service | Yes |
| HELVIA_RAG_PIPELINE | RAG pipeline | Yes (primary) |
| DIALOGFLOW_NLP | Google Dialogflow | Yes |
| HELVIA_GPT | Helvia GPT pipeline | No (legacy) |
NLPStatus (pipeline lifecycle)
| Value | Description |
|---|---|
| CREATED | Newly created, never trained |
| OUTDATED | Corpus changed, needs retrain |
| TRAINING | Training in progress |
| FAILED | Training failed (see failedReason) |
| READY | Trained and serving predictions |
| INITIALIZING | Being initialized |
QueryCategories (NLP result classification)
| Value | Description |
|---|---|
| Matched | Intent confidently matched |
| Smalltalk | Detected as small talk |
| Generic | Generic/broad query |
| Missed | No intent matched above threshold |
NLPPlatform (helvia-rag-pipelines LLM providers)
| Value | Client Class |
|---|---|
| OPENAI | NLPAPIClientOpenAI |
| AZURE_OPENAI | NLPAPIClientAzureOpenAI |
| GOOGLE_GENAI | NLPAPIClientGoogleGenAI |
| ANYSCALE | NLPAPIClientAnyscale |
NLPPlatformScope (helvia-rag-pipelines provider scopes)
| Value | Used For |
|---|---|
| EMBEDDINGS | Document/query embedding |
| CHAT | Text generation |
| QUERY_SUMMARIZATION | Query rewriting with history context |
Semantic Document Segmenter
Standalone FastAPI (Python) microservice that converts documents into structured, labeled segments for downstream consumption by helvia-rag-pipelines (corpus items) or other consumers.
Source: packages/semantic-doc-segmenter/
Processing Pipeline
LLM Usage
Three LLM tasks, all via OpenAI or Azure OpenAI (selected by the LLM_BACKEND env var), plus Gemini for vision-based PDF parsing.
| Task | Config Var | Default Model | Purpose |
|---|---|---|---|
| Heading extraction | *_LLM_MODEL_FOR_HEADINGS | gpt-5.2 | Identify section boundaries in plain text |
| Article tagging | *_LLM_MODEL_FOR_TAGGER | gpt-5.2 | Classify segments into predefined tags |
| Title extraction | *_LLM_MODEL_FOR_TAGGER | gpt-5.2 | Generate title for untitled segments |
| PDF vision parsing | GEMINI_MODEL | gemini-3.1-pro-preview | Convert complex PDFs via vision model |
All OpenAI/Azure calls use temperature=0, top_p=1. Token counting via tiktoken (o200k_base for gpt-4o/gpt-5, cl100k_base for others).
Heading Extraction Prompt
System: identify section headings (1-10 words) using visual/structural cues. Multi-line headings joined with <br/>. Excludes table of contents. Outputs markdown heading format.
Input text is chunked by tokens with 100-token overlap to handle boundary headings. Headings are injected into source text via normalized word matching (unidecode).
Tagging Prompt
System: assign up to N topics from a provided list. Falls back to "Other" if no match. Output: YAML list. max_tokens=50.
Gemini Prompts
Located in app/prompts/:
- gemini_text_only.txt: for text-heavy PDFs
- gemini_text_and_images.txt: for PDFs with images (overlays [IMG] markers on image locations before sending)
Segmentation Algorithm
generate_articles() in app/services/markdown_service.py:
- Parse markdown into a MarkdownNode tree (hierarchy by # level)
- Traverse the tree depth-first
- If subtree content <= max_size: keep as a single article
- If a node body > max_size: split into numbered parts
- Articles include the parent heading as a prefix for context
- Size controlled by SEGMENTER_MAX_ARTICLE_SIZE (default 2000) and SEGMENTER_SIZE_UNITS (words, chars, non_ws_chars, tokens); a simplified sketch follows this list
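A simplified TypeScript sketch of the algorithm. The real generate_articles() is Python and supports multiple size units; this version measures characters only.

```ts
interface MarkdownNode { heading: string; body: string; children: MarkdownNode[] }
interface Article { title: string; body: string }

function subtreeSize(n: MarkdownNode): number {
  return n.body.length + n.children.reduce((s, c) => s + subtreeSize(c), 0);
}

function generateArticles(node: MarkdownNode, maxSize = 2000, prefix = ""): Article[] {
  // Parent heading carried as a context prefix, per the list above.
  const title = prefix ? `${prefix} / ${node.heading}` : node.heading;

  // Whole subtree fits: keep it as a single article.
  if (subtreeSize(node) <= maxSize) {
    const flatten = (n: MarkdownNode): string =>
      [n.body, ...n.children.map(flatten)].join("\n");
    return [{ title, body: flatten(node) }];
  }

  const articles: Article[] = [];
  if (node.body.length > maxSize) {
    // Oversized body: split into numbered parts.
    for (let i = 0; i * maxSize < node.body.length; i++) {
      articles.push({
        title: `${title} (part ${i + 1})`,
        body: node.body.slice(i * maxSize, (i + 1) * maxSize),
      });
    }
  } else if (node.body.trim()) {
    articles.push({ title, body: node.body });
  }
  // Recurse depth-first into children.
  for (const child of node.children) {
    articles.push(...generateArticles(child, maxSize, title));
  }
  return articles;
}
```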
API Endpoints
| Method | Path | Purpose | Notes |
|---|---|---|---|
| POST | /jobs | Submit document for processing | multipart/form-data with file + options |
| GET | /jobs | List all jobs | |
| GET | /jobs/{id} | Get job status | status, progress %, processing_stage |
| PATCH | /jobs/{id} | Cancel a job | { "status": "CANCELED" } |
| DELETE | /jobs/{id} | Delete job + document + segments | |
| GET | /documents | List documents | |
| GET | /documents/{id} | Get document metadata | |
| GET | /documents/{id}/segments | Get segments | id, title, body, lang, tags, pagenr |
Auth: JWT (HS256) with role: admin.
Job Processing Options
| Option | Default | Description |
|---|---|---|
| maxsize | 2000 | Max segment size (in configured units) |
| usetags | [] | Tag list for LLM tagging |
| maxtags | 1 | Max tags per segment |
| callbackurl | — | POST results when done |
| process_images | per config | Extract images from documents |
| pdf_parsing_backend | pymupdf | pymupdf, docling, gemini-3 |
| enable_ocr | false | Enable OCR (docling only) |
| ocr_backend | tesseract | tesseract (Greek support), easyocr |
| force_full_page_ocr | false | OCR entire pages vs detected regions |
Language Detection
Cascade:
- Google Cloud Translate (if USE_GOOGLE_LANGUAGE_DETECTION=true, 10s timeout)
- fast-langdetect (pre-initialized, top 500 chars, "lite" model)
- Fallback to SYSTEM_DEFAULT_LANGUAGE (en); the cascade is sketched below
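The cascade condensed into one function. Both detector calls are toy stand-ins for Google Cloud Translate and fast-langdetect, which the real Python service uses.

```ts
// Stand-in for Google Cloud Translate v3 language detection (hypothetical).
async function googleDetect(text: string): Promise<string | null> {
  return null;
}
// Stand-in for fast-langdetect (toy heuristic, not the real model).
function fastLangDetect(text: string): string | null {
  return /[\u0370-\u03ff]/.test(text) ? "el" : "en";
}

async function detectLanguage(text: string): Promise<string> {
  if (process.env.USE_GOOGLE_LANGUAGE_DETECTION === "true") {
    try {
      // External call raced against the documented 10 s timeout.
      const result = await Promise.race([
        googleDetect(text),
        new Promise<null>((resolve) => setTimeout(() => resolve(null), 10_000)),
      ]);
      if (result) return result;
    } catch { /* fall through to the local detector */ }
  }
  // Local model sees only the first 500 characters.
  const local = fastLangDetect(text.slice(0, 500));
  return local ?? (process.env.SYSTEM_DEFAULT_LANGUAGE ?? "en");
}
```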
Data Model
MySQL with Alembic migrations. Three tables:
- document: id, filename, filesize, mimetype, doctype, body (LONGBLOB), status, language
- job: id, document_id, status (NEW/PENDING/PROCESSING/COMPLETED/FAILED/CANCELED), progress, processing_stage, error, options
- documentsegment: id, document_id, ordinal, title, body, lang, pagenr, tags (JSON), status
Integration with helvia-rag-pipelines
semantic-doc-segmenter produces segments that become corpus items in RAG pipelines. The integration flow:
- Document uploaded to semantic-doc-segmenter
- Segments generated (title, body, tags, language)
- Callback or polling delivers segments to the consumer
- Consumer formats segments as HelviaCorpusItem[] and sends them to PUT /pipelines/{id}/corpus in helvia-rag-pipelines
- helvia-rag-pipelines indexes the segments into the vector DB for semantic search
Key Configuration
| Variable | Default | Purpose |
|---|---|---|
| LLM_BACKEND | openai | openai or azure |
| OPENAI_API_KEY | — | OpenAI API key |
| AZURE_LLM_ENDPOINT | — | Azure OpenAI endpoint |
| GEMINI_API_KEY | — | Google Gemini key |
| GEMINI_MODEL | gemini-3.1-pro-preview | Gemini model for PDF vision |
| PDF_PARSING_BACKEND | pymupdf | Default PDF parser |
| SEGMENTER_MAX_ARTICLE_SIZE | 2000 | Max segment size |
| SEGMENTER_SIZE_UNITS | non_ws_chars | Size unit (words/chars/non_ws_chars/tokens) |
| BACKGROUND_TASK_LIMIT | 2 | Max concurrent jobs |
| JOB_TIMEOUT_SECONDS | 600 | Cooperative cancellation timeout |
| ENABLE_IMAGE_HANDLING | false | Extract/store images |
| IMAGE_HANDLING_MODE | s3 | s3 or tmp_file |
Key Files
| Path | Purpose |
|---|---|
| app/services/doc_processing_service.py | Main pipeline orchestration |
| app/services/doc_converter_service.py | Format-specific converters |
| app/services/markdown_service.py | Heading extraction, tree parsing, chunking |
| app/services/llm_service.py | OpenAI/Azure LLM calls |
| app/services/gemini_service.py | Google Gemini API calls |
| app/services/job_execution_manager.py | Worker pool management |
| app/parsers/PyMuPDFParser.py | PyMuPDF PDF parser |
| app/parsers/DoclingPDFParser.py | Docling PDF parser with OCR |
| app/config/config.py | All env var configuration |
open-bot-framework: DirectLine Channel Gateway
Note: open-bot-framework is planned as a future self-hosted replacement for Azure DirectLine but is not yet in use. Currently, hbf-webchat connects to hbf-bot via Azure DirectLine (Microsoft's botframework-directline, bundled in botframework-webchat).
Once adopted, open-bot-framework will be the channel entry point for development and self-hosted webchat deployments. It implements the Microsoft Bot Framework DirectLine 3.0 protocol and relays user messages to hbf-bot (or any registered HTTP bot endpoint), which then invokes hbf-nlp for NLU processing.
Source: packages/open-bot-framework/
Full architecture: packages/open-bot-framework/docs/architecture.md
Role in the NLU Pipeline (planned)
open-bot-framework does not perform NLU or LLM work itself. Its planned pipeline role is:
- Receive the user message from the webchat client (REST POST /v3/directline/conversations/:id/activities)
- Validate the DirectLine JWT, enrich the activity (id, timestamp, serviceUrl, conversation)
- HTTP POST the enriched activity to the registered bot endpoint (e.g. hbf-bot)
- hbf-bot receives the activity and calls hbf-nlp for intent/entity extraction
- Bot reply arrives at open-bot-framework via POST /v3/conversations/:id/activities/:actId
- Reply is broadcast to the webchat client over WebSocket
When deployed, the hbf-webchat widget would use open-bot-framework as its DirectLine backend, calling it at runtime for tokens and activity delivery. open-bot-framework is intended to replace Azure's hosted DirectLine service, not hbf-webchat itself: both would run together, the widget as the browser-side client and OBF as the server-side gateway. Currently, hbf-webchat uses Azure DirectLine (botframework-directline) for this role.
Message Flow (planned)
Authentication Model
Two token types, both HMAC-SHA256 JWTs signed by open-bot-framework itself:
| Token | Issued By | Used For | Key Payload Fields |
|---|---|---|---|
| DirectLine token | DirectlineTokenService | Client-to-gateway | bot, site, conv, user |
| OAuth2 access token | AuthorizationService | Bot-to-gateway | aud, iss, sub (clientId) |
DirectLine token is obtained by the client via POST /v3/directline/tokens/generate using a webchat site secret (<siteId>.<hmac>). The bot obtains an OAuth2 access token via POST /oauth2/v2.0/token (client credentials grant) using an OpenBotSecret credential.
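Hypothetical client-side calls for both token flows. The endpoint paths come from the text above; request and response shapes are assumptions based on the DirectLine 3.0 protocol and the client-credentials grant.

```ts
const GATEWAY = "https://obf.example.com"; // hypothetical gateway host

// 1) Webchat client exchanges a site secret for a DirectLine token.
async function generateDirectLineToken(siteSecret: string): Promise<string> {
  const res = await fetch(`${GATEWAY}/v3/directline/tokens/generate`, {
    method: "POST",
    headers: { Authorization: `Bearer ${siteSecret}` }, // <siteId>.<hmac>
  });
  const body = (await res.json()) as { token: string };
  return body.token;
}

// 2) Bot exchanges client credentials for an OAuth2 access token.
async function getBotAccessToken(clientId: string, clientSecret: string): Promise<string> {
  const res = await fetch(`${GATEWAY}/oauth2/v2.0/token`, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "client_credentials",
      client_id: clientId,
      client_secret: clientSecret, // the OpenBotSecret plain value
    }),
  });
  const body = (await res.json()) as { access_token: string };
  return body.access_token;
}
```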
Data Model
| Entity | Table | Key Fields | Notes |
|---|---|---|---|
| OpenBot | open_bot | id (uuid), handle (unique), endpoint, schemaVersion | Registered bot. endpoint is the HTTP URL open-bot-framework POSTs activities to. |
| OpenBotSecret | open_bot_secret | id, secretHash (SHA-256), plainReducted, expiresAt | Bot API credential. Secret shown once on creation; only hash stored. |
| WebChatChannel | web_chat_channel | id, name, secret1, secret2 | Webchat site config. Two rotating secrets per channel. Belongs to OpenBot. |
Activity ID Convention
Activity IDs follow the DirectLine spec: <conversationId>|<7-digit-zero-padded-counter> (e.g. abc123|0000003).
typing activities get random IDs and do not increment the watermark counter.
The counter is backed by Redis (production) or in-memory (development/fallback), selected by ATOMIC_OPERATIONS_IMPLEMENTATION env var.
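A sketch of ID generation with an in-memory counter (production uses Redis). The random-ID format shown for typing activities is an assumption; the spec-mandated part is the 7-digit zero-padded counter for regular activities.

```ts
import { randomUUID } from "node:crypto";

const counters = new Map<string, number>(); // conversationId -> watermark

function nextActivityId(conversationId: string, activityType: string): string {
  if (activityType === "typing") {
    // Random ID, no watermark increment (format assumed).
    return `${conversationId}|${randomUUID()}`;
  }
  const next = (counters.get(conversationId) ?? 0) + 1;
  counters.set(conversationId, next);
  return `${conversationId}|${String(next).padStart(7, "0")}`;
}

// nextActivityId("abc123", "message") -> "abc123|0000001"
// nextActivityId("abc123", "message") -> "abc123|0000002"
```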
File Attachments
POST /v3/directline/conversations/:id/upload accepts multipart with an activity JSON part and one or more file parts. Files are uploaded to S3-compatible storage (MinIO, AWS S3, etc.) and the resulting URLs are written into activity.attachments[].contentUrl before forwarding to the bot.
Key Configuration
| Variable | Purpose |
|---|---|
| JWT_SECRET | HMAC secret for signing DirectLine and access tokens |
| JWT_EXPIRATION_SECONDS | Token TTL (default 3600) |
| DIRECTLINE_HOST | Public hostname of this gateway (embedded in token iss/aud) |
| DIRECTLINE_SOCKET_URL | Base URL for WebSocket stream URLs |
| DIRECTLINE_REGION | Region suffix appended to generated conversation IDs |
| SOCKET_PORT | WebSocket server port (default 1992) |
| REDIS_URI | Redis connection for atomic watermark counter |
| ATOMIC_OPERATIONS_IMPLEMENTATION | redis (default) or memory (single-instance fallback) |
| STORAGE_BUCKET / STORAGE_ENDPOINT | S3-compatible storage for file attachments |
Key Files
| Path | Purpose |
|---|---|
| src/features/directline/directline.controller.ts | User-facing DirectLine endpoints |
| src/features/directline/directline-alt.controller.ts | Bot reply endpoint (POST /v3/conversations/.../activities/:actId) |
| src/features/directline/directline-conversation.service.ts | Conversation lifecycle, activity enrichment, bot HTTP forward, WebSocket dispatch |
| src/features/directline/dirtectline-token.service.ts | Token generate/refresh/verify (JWT) |
| src/features/directline/directline.gateway.ts | Raw WebSocket server, per-conversation socket map |
| src/features/authorization/authorization.controller.ts | POST /oauth2/v2.0/token (bot client credentials) |
| src/features/authorization/authorization.service.ts | OAuth2 access token generation and verification |
| src/features/openbot/openbot.service.ts | Bot CRUD and cached handle lookup |
| src/features/openbotsecret/openbotsecret.service.ts | Secret hashing (SHA-256) and validation |
| src/features/atomicity/atomic-operations.provider.ts | Redis vs memory backend selection |
| src/features/storage/storage.service.ts | S3-compatible file upload for attachments |
NLP and Language Detection Routing (by Tenant Type)
Two tenant types with fundamentally different NLP paths in hbf-bot:
Classic tenants (isAgent=false):
- Intent detection: hbf-core NLP Process endpoint (via ExternalNLU; nlpService defaults to HBF_CORE)
- Language detection: piggybacks on hbf-core's NLP Process (detectLanguage=true flag), enabled by botDeployment.settings.automaticLanguageDetectionEnabled
- hbf-nlp /detect-language is NOT available (requires isAgent=true)
- Traditional NLU flow: BaseNLU -> ExternalNLU.process() -> hbf-core -> hbf-nlp
Modern/agent tenants (isAgent=true):
- Intent detection: skipped entirely. AgenticBotHandler takes over in ConversationFlowStep and routes to DEFAULT_NODE instead of running traditional NLU.
- Language detection: hbf-nlp /detect-language directly (via LanguageDetectionHandler), enabled by the LLM_LANGUAGE_DETECTION plugin + isAgent=true
- tenant.systemSettings.nlpService can be HBF_NLP or HBF_CORE but is largely irrelevant since AgenticBotHandler bypasses BaseNLU
Key files: LanguageDetectionHandler.ts (isAgent gate), SetLanguageStep.ts (detection orchestration), ConversationFlowStep.ts (AgenticBotHandler branch), ExternalNlu.ts (classic NLU routing), BaseNlu.ts (entry point).