Flow: RAG Knowledge Query
Sequence diagram for knowledge retrieval through the RAG pipeline. Services: hbf-bot, hbf-nlp, helvia-rag-pipelines, OpenAI/Azure OpenAI, Qdrant/Milvus
Sequence Diagram
Step-by-Step
- Bot triggers NLP (hbf-bot → hbf-nlp): During workflow execution, hbf-bot sends the user's query to hbf-nlp for processing.
- Pipeline resolution (hbf-nlp → hbf-core): hbf-nlp fetches the pipeline configuration from hbf-core. If the pipeline type is helvia-rag, it routes the query to helvia-rag-pipelines using the configured serviceUrl.
- Translation (helvia-rag-pipelines): If the query language differs from the corpus language, the query is translated via Google Cloud Translate or Azure Translator before processing.
- Semantic cache (helvia-rag-pipelines → SemCache): The service checks the Helvia SemCache for a semantically similar previous query. On a cache hit, the cached response is returned immediately.
- Embedding (helvia-rag-pipelines → OpenAI): On a cache miss, the query is embedded using the configured embedding model (OpenAI, Azure OpenAI, or Google Generative AI).
- Vector search (helvia-rag-pipelines → Qdrant/Milvus): The query embedding is used for semantic search against the pipeline's vector collection. The top-k corpus items are returned with similarity scores.
- Response generation (helvia-rag-pipelines → LLM): The matched corpus items are combined with the query in a prompt, and an LLM generates a natural-language answer.
- Cache store (helvia-rag-pipelines → SemCache): The generated response is stored in the semantic cache for future similar queries.
- Return (helvia-rag-pipelines → hbf-nlp → hbf-bot): The answer, sources, and confidence are returned up the chain.
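The ordering of the steps above (translate, check the cache, then embed, search, generate, and store only on a miss) can be sketched in a few lines. This is an illustrative sketch, not the service's implementation: SemCacheStub, token_overlap, top_k, process_query, and the fake_* helpers are hypothetical names, the token-overlap measure is a crude stand-in for real semantic similarity, and the in-memory collection stands in for Qdrant/Milvus.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase tokens; a crude stand-in for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


@dataclass
class SemCacheStub:
    """Hypothetical in-memory stand-in for the semantic cache."""
    threshold: float = 0.8
    entries: list = field(default_factory=list)  # (query, response) pairs

    def lookup(self, query: str) -> Optional[dict]:
        for cached_query, response in self.entries:
            if token_overlap(cached_query, query) >= self.threshold:
                return response
        return None

    def store(self, query: str, response: dict) -> None:
        self.entries.append((query, response))


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, collection, k):
    """Rank (item, vector) pairs by cosine similarity and keep the best k."""
    scored = [{**item, "score": cosine(query_vec, vec)} for item, vec in collection]
    scored.sort(key=lambda match: match["score"], reverse=True)
    return scored[:k]


def process_query(query, query_language, corpus_language, *,
                  cache, translate, embed, search, generate, max_results=5):
    if query_language != corpus_language:      # Translation
        query = translate(query, corpus_language)
    cached = cache.lookup(query)               # Semantic cache
    if cached is not None:
        return cached                          # Cache hit: skip embedding, search, LLM
    vector = embed(query)                      # Embedding
    matches = search(vector, max_results)      # Vector search
    response = generate(query, matches)        # Response generation
    cache.store(query, response)               # Cache store
    return response


# Stubbed dependencies, for illustration only.
calls = {"embed": 0}

def fake_embed(text):
    calls["embed"] += 1
    return [1.0, 0.0] if "password" in text.lower() else [0.0, 1.0]

collection = [
    ({"title": "Password Reset Guide"}, [0.9, 0.1]),
    ({"title": "Billing FAQ"}, [0.1, 0.9]),
]

def fake_generate(query, matches):
    # Placeholder: the source does not say how confidence is derived.
    return {"answer": f"See: {matches[0]['title']}",
            "sources": matches,
            "confidence": matches[0]["score"]}

deps = dict(cache=SemCacheStub(),
            translate=lambda text, target: text,
            embed=fake_embed,
            search=lambda vec, k: top_k(vec, collection, k),
            generate=fake_generate)

first = process_query("How do I reset my password?", "en", "en", **deps)
second = process_query("how do I reset my password?", "en", "en", **deps)
```

The second call differs only in casing, so the stub cache treats it as a hit and the embedding function is not invoked again.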
Alternative: Direct RAG Search from hbf-bot
hbf-bot can also call helvia-rag-pipelines directly via POST /pipelines/{pipelineId}:search for semantic search without going through hbf-nlp. This path returns raw search results (corpus matches) rather than a generated answer.
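As a rough illustration of the direct path, the following sketch builds (but does not send) a :search request using only the Python standard library. The base URL, the pipeline ID, and the helper name build_search_request are hypothetical; authentication headers are omitted because this page does not describe them.

```python
import json
import urllib.request
from urllib.parse import quote


def build_search_request(base_url, pipeline_id, query, max_results=3,
                         include_tags=None, exclude_tags=None):
    """Build a POST /pipelines/{pipelineId}:search request without sending it."""
    url = f"{base_url.rstrip('/')}/pipelines/{quote(pipeline_id, safe='')}:search"
    body = {"query": query, "max_results": max_results}
    filters = {}
    if include_tags:
        filters["include_tags"] = list(include_tags)
    if exclude_tags:
        filters["exclude_tags"] = list(exclude_tags)
    if filters:
        body["filters"] = filters
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and pipeline ID, for illustration only.
request = build_search_request(
    "https://rag.example.internal", "pl_123", "password reset",
    include_tags=["faq"], exclude_tags=["internal"],
)
```

Passing the returned object to urllib.request.urlopen would send it; the response would then carry raw corpus matches rather than a generated answer.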
Contracts
hbf-nlp → helvia-rag-pipelines (POST /pipelines/{pipelineId}:process):
Request: {
  "query": "How do I reset my password?",
  "query_language": "en",
  "session_id": "sess_123",
  "max_results": 5,
  "previous_messages": [
    { "role": "user", "content": "I need help with my account" },
    { "role": "assistant", "content": "Sure, what do you need help with?" }
  ],
  "parameters": {}
}
Response: {
  "answer": "To reset your password, go to Settings > Security > Reset Password...",
  "sources": [{ "title": "Password Reset Guide", "score": 0.92 }],
  "confidence": 0.95
}
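A caller might map the :process response onto typed objects so that a missing field fails early. The sketch below assumes only the three fields shown in the example response; the class names Source and ProcessResponse are hypothetical, and the real payload may carry additional fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    title: str
    score: float


@dataclass(frozen=True)
class ProcessResponse:
    answer: str
    sources: tuple
    confidence: float

    @classmethod
    def from_dict(cls, payload: dict) -> "ProcessResponse":
        # A KeyError here signals a response that does not match the contract.
        sources = tuple(
            Source(title=item["title"], score=float(item["score"]))
            for item in payload.get("sources", [])
        )
        return cls(
            answer=payload["answer"],
            sources=sources,
            confidence=float(payload["confidence"]),
        )


parsed = ProcessResponse.from_dict({
    "answer": "To reset your password, go to Settings > Security > Reset Password...",
    "sources": [{"title": "Password Reset Guide", "score": 0.92}],
    "confidence": 0.95,
})
```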
hbf-bot → helvia-rag-pipelines (POST /pipelines/{pipelineId}:search):
Request: {
  "query": "password reset",
  "max_results": 3,
  "filters": { "include_tags": ["faq"], "exclude_tags": ["internal"] },
  "response_options": { "return_examined_corpus_str": true }
}
Response: {
  "results": [
    { "title": "Password Reset Guide", "body": "...", "score": 0.92, "tags": "faq" }
  ]
}
helvia-rag-pipelines → OpenAI (POST /v1/embeddings):
Request: { "model": "text-embedding-3-small", "input": ["How do I reset my password?"], "encoding_format": "base64" }
Response: { "data": [{ "embedding": "base64..." }] }
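With "encoding_format": "base64", the embedding arrives as a string rather than a JSON array of numbers, so it has to be decoded before it can be used for vector search. The sketch below assumes the payload is a packed sequence of little-endian 32-bit floats, which matches how the OpenAI client libraries decode it; decode_embedding is a hypothetical helper name.

```python
import base64
import struct


def decode_embedding(b64: str) -> list:
    """Decode a base64 string of packed little-endian float32 values."""
    raw = base64.b64decode(b64)
    if len(raw) % 4 != 0:
        raise ValueError("embedding payload is not a whole number of float32 values")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))
```

The base64 form is smaller on the wire than a JSON array of decimal floats, which is the usual reason for requesting it.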
helvia-rag-pipelines → OpenAI (POST /v1/chat/completions):
Request: {
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "Answer based on the following knowledge base..." },
    { "role": "user", "content": "How do I reset my password?" }
  ],
  "temperature": 0.3,
  "max_completion_tokens": 1000
}
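One plausible way to assemble this request from the vector-search matches is sketched below. The system prompt text in the contract is elided, so the layout used here (numbered title and body blocks appended to the instruction) is an assumption, as is the helper name build_chat_request; the model, temperature, and token limit are taken from the example request.

```python
def build_chat_request(query, corpus_items, model="gpt-4o",
                       temperature=0.3, max_completion_tokens=1000):
    """Combine matched corpus items and the user query into a chat completions body."""
    # Assumed layout: one numbered block per corpus item.
    context = "\n\n".join(
        f"[{index}] {item['title']}\n{item['body']}"
        for index, item in enumerate(corpus_items, start=1)
    )
    system_prompt = "Answer based on the following knowledge base:\n\n" + context
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": query},
        ],
        "temperature": temperature,
        "max_completion_tokens": max_completion_tokens,
    }


chat_request = build_chat_request(
    "How do I reset my password?",
    [{"title": "Password Reset Guide", "body": "Go to Settings > Security."}],
)
```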