Flow: RAG Knowledge Query

Sequence diagram for knowledge retrieval through the RAG pipeline. Services: hbf-bot, hbf-nlp, hbf-core, helvia-rag-pipelines, Helvia SemCache, OpenAI/Azure OpenAI, Qdrant/Milvus

Sequence Diagram

Step-by-Step

  1. Bot triggers NLP (hbf-bot → hbf-nlp): During workflow execution, hbf-bot sends the user's query to hbf-nlp for processing.

  2. Pipeline resolution (hbf-nlp → hbf-core): hbf-nlp fetches the pipeline configuration from hbf-core. If the pipeline type is helvia-rag, hbf-nlp routes the request to helvia-rag-pipelines using the configured serviceUrl.

  3. Translation (helvia-rag-pipelines): If the query language differs from the corpus language, the query is translated via Google Cloud Translate or Azure Translator before processing.

  4. Semantic cache (helvia-rag-pipelines → SemCache): The service checks the Helvia SemCache for a semantically similar previous query. On cache hit, the cached response is returned immediately.

  5. Embedding (helvia-rag-pipelines → OpenAI): On cache miss, the query is embedded using the configured embedding model (OpenAI, Azure OpenAI, or Google Generative AI).

  6. Vector search (helvia-rag-pipelines → Qdrant/Milvus): The query embedding is used for semantic search against the pipeline's vector collection. Top-k corpus items are returned with similarity scores.

  7. Response generation (helvia-rag-pipelines → LLM): The matched corpus items are combined with the query in a prompt, and an LLM generates a natural language answer.

  8. Cache store (helvia-rag-pipelines → SemCache): The generated response is stored in the semantic cache for future similar queries.

  9. Return (helvia-rag-pipelines → hbf-nlp → hbf-bot): The answer, sources, and confidence score are returned to hbf-nlp, which passes them back to hbf-bot.
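
The control flow of steps 3 through 9 can be sketched as follows. This is a minimal illustration, not the service's implementation: `RagSteps`, `process_query`, and every injected callable are hypothetical names, and the exact-match dictionary only stands in for SemCache, which matches semantically similar queries rather than identical strings.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RagSteps:
    """Injected stand-ins for the external services (all names hypothetical)."""
    translate: Callable[[str, str, str], str]         # (text, source_lang, target_lang)
    cache_lookup: Callable[[str], Optional[dict]]     # returns None on a cache miss
    embed: Callable[[str], list[float]]
    search: Callable[[list[float], int], list[dict]]  # (query vector, top_k)
    generate: Callable[[str, list[dict]], dict]
    cache_store: Callable[[str, dict], None]


def process_query(steps: RagSteps, query: str, query_language: str,
                  corpus_language: str, max_results: int = 5) -> dict:
    # Step 3: translate only when the query and corpus languages differ.
    if query_language != corpus_language:
        query = steps.translate(query, query_language, corpus_language)
    # Step 4: a cache hit short-circuits the rest of the pipeline.
    cached = steps.cache_lookup(query)
    if cached is not None:
        return cached
    # Step 5: embed the query.
    vector = steps.embed(query)
    # Step 6: retrieve the top-k corpus items.
    matches = steps.search(vector, max_results)
    # Step 7: generate an answer from the query and the matches.
    response = steps.generate(query, matches)
    # Step 8: store the response for future similar queries.
    steps.cache_store(query, response)
    # Step 9: return the response to the caller.
    return response


# Demo wiring with in-memory fakes; no network calls are made.
calls = {"embed": 0}
store: dict[str, dict] = {}


def fake_embed(text: str) -> list[float]:
    calls["embed"] += 1
    return [float(len(text)), 1.0]


demo = RagSteps(
    translate=lambda text, src, dst: text,
    cache_lookup=store.get,
    embed=fake_embed,
    search=lambda vec, k: [{"title": "Password Reset Guide", "score": 0.92}][:k],
    generate=lambda q, m: {"answer": f"stub answer for: {q}", "sources": m, "confidence": 0.5},
    cache_store=store.__setitem__,
)

first = process_query(demo, "How do I reset my password?", "en", "en")
second = process_query(demo, "How do I reset my password?", "en", "en")
```

In this demo the second call is served from the cache, so the embedding, search, and generation fakes run only once.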

Alternative: Direct RAG Search from hbf-bot

hbf-bot can also call helvia-rag-pipelines directly via POST /pipelines/{pipelineId}:search for semantic search without going through hbf-nlp. This path returns raw search results (corpus matches) rather than a generated answer.

Contracts

hbf-nlp → helvia-rag-pipelines (POST /pipelines/{pipelineId}:process):

Request: {
  "query": "How do I reset my password?",
  "query_language": "en",
  "session_id": "sess_123",
  "max_results": 5,
  "previous_messages": [
    { "role": "user", "content": "I need help with my account" },
    { "role": "assistant", "content": "Sure, what do you need help with?" }
  ],
  "parameters": {}
}
Response: {
  "answer": "To reset your password, go to Settings > Security > Reset Password...",
  "sources": [{ "title": "Password Reset Guide", "score": 0.92 }],
  "confidence": 0.95
}
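
A caller could model this contract with small typed helpers, sketched below. The class and function names are hypothetical, and the choice of which fields get defaults is an assumption drawn from the example payload rather than from a published schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Message:
    role: str      # "user" or "assistant", as in the example above
    content: str


@dataclass
class ProcessRequest:
    query: str
    query_language: str
    session_id: str
    max_results: int = 5  # assumed default, taken from the example payload
    previous_messages: list[Message] = field(default_factory=list)
    parameters: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # asdict() recurses into the nested Message dataclasses.
        return json.dumps(asdict(self))


def parse_process_response(raw: str) -> tuple[str, list[dict], float]:
    """Extract answer, sources, and confidence from a :process response body."""
    data = json.loads(raw)
    return data["answer"], data.get("sources", []), float(data["confidence"])


req = ProcessRequest(
    query="How do I reset my password?",
    query_language="en",
    session_id="sess_123",
    previous_messages=[Message("user", "I need help with my account")],
)
payload = json.loads(req.to_json())

answer, sources, confidence = parse_process_response(
    '{"answer": "To reset your password, go to Settings", '
    '"sources": [{"title": "Password Reset Guide", "score": 0.92}], '
    '"confidence": 0.95}'
)
```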

hbf-bot → helvia-rag-pipelines (POST /pipelines/{pipelineId}:search):

Request: {
  "query": "password reset",
  "max_results": 3,
  "filters": { "include_tags": ["faq"], "exclude_tags": ["internal"] },
  "response_options": { "return_examined_corpus_str": true }
}
Response: {
  "results": [
    { "title": "Password Reset Guide", "body": "...", "score": 0.92, "tags": "faq" }
  ]
}
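
The direct search call could be assembled as in the sketch below, which builds the request without sending it. The base URL, the pipeline ID, and the helper name are placeholders, and authentication headers are omitted because the contract above does not describe them.

```python
import json
import urllib.request
from urllib.parse import quote


def build_search_request(base_url: str, pipeline_id: str, query: str,
                         max_results: int = 3,
                         include_tags: tuple[str, ...] = (),
                         exclude_tags: tuple[str, ...] = ()) -> urllib.request.Request:
    """Build (but do not send) a POST /pipelines/{pipelineId}:search request."""
    # The ":search" suffix is part of the path, so only the ID is percent-encoded.
    url = f"{base_url.rstrip('/')}/pipelines/{quote(pipeline_id, safe='')}:search"
    body = {
        "query": query,
        "max_results": max_results,
        "filters": {
            "include_tags": list(include_tags),
            "exclude_tags": list(exclude_tags),
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and pipeline ID, for illustration only.
search_req = build_search_request(
    "http://rag.example.internal", "pipe_1", "password reset",
    include_tags=("faq",), exclude_tags=("internal",),
)
```

Passing `search_req` to `urllib.request.urlopen` would send it; the response body can then be read as JSON and iterated via its `results` array.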

helvia-rag-pipelines → OpenAI (POST /v1/embeddings):

Request:  { "model": "text-embedding-3-small", "input": ["How do I reset my password?"], "encoding_format": "base64" }
Response: { "data": [{ "embedding": "base64..." }] }
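
With `encoding_format` set to `base64`, the `embedding` field arrives as a string that has to be decoded before it can be used for vector search. The sketch below assumes the payload is a packed array of little-endian 32-bit floats, which matches how such responses are commonly decoded; `decode_embedding` is a hypothetical helper name.

```python
import base64
import struct


def decode_embedding(b64: str) -> list[float]:
    """Decode a base64 embedding, assuming packed little-endian float32 values."""
    raw = base64.b64decode(b64)
    if len(raw) % 4 != 0:
        raise ValueError("embedding byte length is not a multiple of 4")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip a tiny synthetic vector (not real model output).
sample = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.0)).decode("ascii")
vector = decode_embedding(sample)
```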

helvia-rag-pipelines → OpenAI (POST /v1/chat/completions):

Request: {
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "Answer based on the following knowledge base..." },
    { "role": "user", "content": "How do I reset my password?" }
  ],
  "temperature": 0.3,
  "max_completion_tokens": 1000
}
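
One way the matched corpus items might be folded into this request is sketched below. The system prompt wording and the excerpt layout are illustrative assumptions, not the prompt helvia-rag-pipelines actually uses, and `build_chat_request` is a hypothetical helper.

```python
def build_chat_request(query: str, matches: list[dict], model: str = "gpt-4o",
                       temperature: float = 0.3,
                       max_completion_tokens: int = 1000) -> dict:
    """Assemble a chat completions payload from a query and retrieved items."""
    # Number each excerpt so an answer could refer back to its source.
    excerpts = "\n\n".join(
        f"[{i}] {m['title']}\n{m.get('body', '')}"
        for i, m in enumerate(matches, start=1)
    )
    # Illustrative prompt text only; the real system prompt is not shown in this page.
    system_prompt = "Answer using only the knowledge base excerpts below.\n\n" + excerpts
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": query},
        ],
        "temperature": temperature,
        "max_completion_tokens": max_completion_tokens,
    }


chat_request = build_chat_request(
    "How do I reset my password?",
    [{"title": "Password Reset Guide", "body": "Go to Settings.", "score": 0.92}],
)
```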