# AI Brief: hbf-nlp
NestJS NLP service that handles intent classification, entity extraction, language detection, LLM orchestration, and RAG pipeline routing for the Helvia Chatbricks platform. Offloads all NLP processing that previously ran inside hbf-core.
## What This Repo Does
Receives user messages and routes them through one of three NLP pipeline backends: Helvia RAG Pipelines (vector search + LLM generation), Helvia NLP Specification (custom NLP), or Dialogflow. Also exposes LLM endpoints for chat session analysis, language detection, and direct LLM requests. Persists per-message NLP metadata (pipeline used, confidence, timings) to MySQL.
## Tech Stack
- Language: TypeScript
- Framework: NestJS 11
- Key dependencies: @helvia/hbf-core-api, @nestjs/typeorm + typeorm (MySQL), @google-cloud/dialogflow, @google/genai, axios, @nestjs/cache-manager + keyv/redis, @nestjs/schedule, nestjs-pino + @elastic/ecs-pino-format, elastic-apm-node
- Dev dependencies: openai
## Entry Points
- Main: `src/main.ts`
- App module: `src/app.module.ts`
- Config: `.env` / `.env.local` (`envFilePath` in `ConfigModule`)
## Key Directories

| Directory | Purpose |
|---|---|
| `src/nlp/` | Core NLP pipeline routing, training, and processing logic |
| `src/nlp/providers/` | Pipeline provider implementations (RAG, NLP Spec, Dialogflow) |
| `src/nlp/clients/` | HTTP clients for downstream pipeline APIs |
| `src/llm/` | LLM orchestration: analyze session, detect language, direct LLM request |
| `src/llm/providers/` | Abstract LlmProvider plus OpenAI, Azure, and Gemini implementations |
| `src/llm/clients/` | Low-level API clients for Azure and OpenAI |
| `src/generation/` | Text generation module with per-provider generation providers |
| `src/core/` | hbf-core-api wrapper (tenants, pipelines, sessions, activities) |
| `src/models/` | NLP model CRUD (TypeORM entities via models.module) |
| `src/test-set/` | Test set management for NLP pipeline evaluation |
| `src/scheduler/` | Scheduled tasks (e.g., polling training status) |
| `src/notifications/` | Push notifications to hbf-core on training events |
| `src/entities/` | TypeORM entity MessageMetadata (per-message NLP trace) |
| `migrations/` | TypeORM migrations (MySQL) |
| `src/utils/` | String similarity, session utils, knowledge base article utils |
| `src/guards/` | HBFGuard, JWTGuard, role guards (CanReadTenant, CanManageTenant, etc.) |
## API Surface

Key endpoints:
NLP Processing:
- `POST /organizations/:orgId/tenants/:tenantId/process` — process a message using the tenant's default pipeline (language-aware)
- `POST /organizations/:orgId/tenants/:tenantId/nlp-pipelines/:pipelineId/process` — process with a specific pipeline
- `POST /tenants/:tenantId/process` — moderator-only processing (no org scoping)
NLP Training:
- `POST /organizations/:orgId/tenants/:tenantId/train` — train all pipelines for a tenant
- `POST /organizations/:orgId/tenants/:tenantId/nlp-pipelines/:pipelineId/train` — train a specific pipeline
- `POST /tenants/:tenantId/train` — moderator-level tenant train
- `POST /nlp-pipelines/:pipelineId/train` — moderator-level single-pipeline train
LLM:
- `POST /organizations/:orgId/tenants/:tenantId/sessions/:sessionId/analyze` — analyze/summarize a chat session
- `POST /llm-request` — direct LLM completion request (hbf-bot only, JWT auth)
- `POST /detect-language` — detect message language (hbf-bot only, JWT auth)
- `GET /organizations/:orgId/categories/:category/prompt` — retrieve the plugin prompt for a category
- `GET /organizations/:orgId/providers/:alias/version` — get an LLM provider's API version
Message Metadata:
- Endpoints in `src/nlp/message-metadata/` (read the per-message NLP trace)
## External Dependencies
- LLM Providers: OpenAI, Azure OpenAI, Google Gemini, Google Dialogflow
- Database: MySQL (via TypeORM)
- Cache: Redis (optional, falls back to in-memory via CacheableMemory)
- hbf-core-api: tenant/pipeline/session/activity data
- helvia-rag-pipelines: downstream RAG pipeline service (HTTP)
- Helvia NLP Specification service: downstream NLP spec pipeline (HTTP)
- APM: Elastic APM
## Running Locally

```bash
npm install
npm run start:dev
```
Swagger UI is available at `/api` while the app is running.
## Tests

```bash
npm test          # unit tests
npm run test:e2e  # e2e tests
npm run test:cov  # coverage report
```