Bielser Engineering

Software Development & AI/ML

White paper summary

MERCURI

Multi-Engine RAG & CAG Unified Resource Intelligence.

MERCURI is a multi-engine AI workspace that connects to business systems, ingests and organizes data securely, and lets teams use that data online or offline to ask questions, generate content, and run workflows inside one governed environment.

It is built around a simple premise: serious AI work is not one prompt followed by one answer. It is model selection, grounded retrieval, explainability, review, authored output, and persistent operational state held together in a single workspace.

Multi-engine AI workspace · RAG + CAG · Explainable routing · Governed knowledge · Learning workflows · Authored content suite

Updated

March 31, 2026

MERCURI is built on a Next.js frontend and a FastAPI backend, with support for local-first and hosted model operation.

Public website

MERCURI is live at mercuri-ai.com.

Real workflow chain

Select the right model for the task.

Ground outputs in trusted workspace material.

Inspect route, retrieval, and trust signals.

Move from answers into study or authored deliverables.

Preserve state across longer workflows and operations.
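The chain above can be sketched as a pipeline that passes one shared state object through every step, so context survives from model selection through to the authored deliverable. The step behavior below is an illustrative assumption, not MERCURI's actual implementation.

```python
# Minimal sketch of the workflow chain: each step transforms a shared
# state object, so context persists across the whole session.
# All step logic here is illustrative, not MERCURI's implementation.
def select_model(state):
    state["model"] = "local-default"          # hypothetical profile name
    return state

def ground(state):
    state["evidence"] = ["chunk from trusted workspace material"]
    return state

def inspect(state):
    state["trace"] = {"model": state["model"],
                      "evidence_count": len(state["evidence"])}
    return state

def author(state):
    state["deliverable"] = f"Report grounded in {len(state['evidence'])} source(s)"
    return state

def run_chain(question):
    state = {"question": question}
    for step in (select_model, ground, inspect, author):
        state = step(state)
    return state  # persistent state carries across the whole chain
```

Because every step reads and writes the same state, nothing is lost between asking, retrieving, inspecting, and authoring.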

Current model support

Local Ollama profiles

OpenAI profiles

Anthropic / Claude profiles

The problem it solves

AI work breaks down when chat, retrieval, review, and operations are separated.

Most AI products handle one impressive step at a time. MERCURI is designed around the reality that useful work usually spans trusted evidence, route visibility, follow-up creation, study, and persistent state across a longer session.

The product treats AI as a workspace problem, not just a chat problem. That is what closes the gap between a model demo and a usable intelligence environment.

Raw chat answers are often disconnected from trusted internal knowledge.

Users cannot clearly see why a route, model, or answer was chosen.

Retrieval, document review, study, and content creation get split across separate tools.

Multimodal workflows are usually bolted on instead of integrated.

Security, governance, and operational visibility sit outside the AI layer.

Knowledge becomes difficult to scope, audit, and reuse across repeated work.

Design principles

The product model is built around continuity, trust, and control.

Principle

Workflow continuity

MERCURI keeps question, retrieval, review, creation, and follow-up action inside one workspace so context is not lost between steps.

Principle

Explainability by default

Route reasoning, retrieval details, confidence surfaces, and trust metadata are part of the product model, not an afterthought.

Principle

Multi-engine flexibility

Different tasks can use different model profiles while staying inside the same governed history, workspace, and operational surface.

Principle

Secure ownership and governance

Documents, authored outputs, sessions, permissions, and sensitivity handling live inside an authenticated system with clear controls.

System overview

Four layers hold the workspace together.

Layer

Frontend experience layer

A unified Next.js interface for chat, settings, dashboard analytics, authored content, retrieval controls, quiz flows, and trust surfaces.

Layer

Backend orchestration layer

A FastAPI backend that manages chat routing, auth and sessions, ingestion, retrieval, developer APIs, and operational state.

Layer

Knowledge and retrieval layer

Document ownership, ingest processing, PDF segmentation, collection targeting, vector search, scope management, and knowledge graph behavior.

Layer

Persistence and state layer

Persistent user accounts, workspace profiles, authored assets, snapshots, quiz datasets, documents, connector state, and feedback data.

Workspace capabilities

MERCURI connects retrieval, learning, creation, and operations.

Multi-engine intelligence

MERCURI is not tied to one provider. It supports local and hosted model profiles while preserving a single workspace context.

Local Ollama profiles

OpenAI profiles

Anthropic / Claude profiles

User-selected or system-routed models
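A minimal sketch of how user-selected or system-routed profiles might be resolved, assuming a small profile registry; the profile names and the offline routing rule are illustrative assumptions, not MERCURI's actual configuration.

```python
# Hypothetical model-profile registry and selection rule.
# Profile names and routing logic are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str
    provider: str   # "ollama", "openai", or "anthropic"
    local: bool     # True for local-first operation

PROFILES = {
    "local-default": ModelProfile("local-default", "ollama", local=True),
    "hosted-gpt": ModelProfile("hosted-gpt", "openai", local=False),
    "hosted-claude": ModelProfile("hosted-claude", "anthropic", local=False),
}

def select_profile(requested, offline):
    """User-selected profile if given; otherwise system-routed by connectivity."""
    if requested is not None:
        return PROFILES[requested]
    return PROFILES["local-default" if offline else "hosted-claude"]
```

Whichever profile is chosen, the surrounding workspace context stays the same, which is the point of the multi-engine design.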

Grounding with RAG and CAG

RAG grounds answers in workspace knowledge. CAG keeps adjacent work fast and context-aware across follow-up requests.

Vector database and collection targeting

Grounded answers from ingested material

Context reuse across repeated tasks

Balanced continuity, efficiency, and evidence
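The RAG and CAG halves can be sketched together: vector search ranks ingested chunks in a targeted collection, while a context cache reuses grounded context across follow-up requests. The scoring and cache policy below are assumptions for illustration, not MERCURI's retrieval internals.

```python
# Illustrative RAG retrieval over a collection plus a CAG-style cache.
# Similarity metric and cache policy are assumptions, not the product's.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query_vec, collection, k=2):
    """Rank ingested chunks in the targeted collection by similarity."""
    ranked = sorted(collection, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:k]

class ContextCache:
    """CAG side: reuse grounded context across follow-up requests."""
    def __init__(self):
        self._store = {}

    def get_or_retrieve(self, key, query_vec, collection):
        if key not in self._store:                       # miss: run retrieval once
            self._store[key] = retrieve(query_vec, collection)
        return self._store[key]                          # hit: reuse evidence
```

The cache is what keeps adjacent follow-up work fast while the vector search keeps answers anchored to ingested material.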

Explainability and trust

Users can inspect how a response was produced rather than treating the output as a black box.

Explainable routing and route metadata

Retrieval debugger views

Confidence and verification surfaces

Audit, trace, and execution graph support
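One way to picture explainability by default is a response object that carries its own trace: which route was taken, which chunks were retrieved, and a confidence figure. The field names below are illustrative assumptions, not MERCURI's actual schema.

```python
# Sketch of attaching route and retrieval metadata to a response so it
# can be inspected rather than treated as a black box.
# Field names and the confidence rule are illustrative assumptions.
def answer_with_trace(question, model, chunks):
    return {
        "answer": f"[{model}] grounded answer to: {question}",
        "trace": {
            "route": {"model": model, "reason": "user-selected"},
            "retrieval": [{"doc": c["doc"], "score": round(c["score"], 3)}
                          for c in chunks],
            # naive confidence: weakest supporting chunk
            "confidence": min(c["score"] for c in chunks) if chunks else 0.0,
        },
    }
```

A retrieval debugger view is then just a rendering of this trace rather than a separate system.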

Knowledge and document intelligence

Documents are governed workspace assets rather than simple attachments, with clear scope, ownership, and ingest behavior.

Authenticated document ownership

Sensitivity controls and workspace visibility

Automatic PDF segmentation for large files

Knowledge hierarchy and graph relationships
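Automatic segmentation of large files can be sketched as splitting a document's pages into fixed-size ingest segments; the segment size and metadata fields here are assumptions, not MERCURI's actual chunking parameters.

```python
# Sketch of automatic segmentation for large files: split pages into
# fixed-size segments for ingestion. Segment size is an assumption.
def segment_pages(pages, max_pages_per_segment=3):
    segments = []
    for start in range(0, len(pages), max_pages_per_segment):
        chunk = pages[start:start + max_pages_per_segment]
        segments.append({"first_page": start + 1,  # 1-indexed page numbers
                         "pages": chunk})
    return segments
```

Each segment keeps its page range, so retrieval results can point back to the exact part of the source document.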

Learning, review, and authored content

MERCURI moves from answering questions into assessing understanding and producing durable deliverables.

Quiz creation, CSV upload, and persistent quiz libraries

Missed-question review and remediation workflows

Rich documents with AI-assisted drafting and PDF export

Spreadsheet-style table and CSV editing
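The quiz-from-CSV and missed-question-review steps can be sketched in a few lines; the column names and matching rule are illustrative assumptions, not MERCURI's quiz format.

```python
# Sketch of turning an uploaded CSV into a quiz dataset and flagging
# missed questions for review. Column names are assumptions.
import csv
import io

def load_quiz(csv_text):
    rows = csv.DictReader(io.StringIO(csv_text))
    return [{"question": r["question"], "answer": r["answer"]} for r in rows]

def missed_questions(quiz, responses):
    """Return items answered incorrectly, feeding remediation workflows."""
    return [q for q, given in zip(quiz, responses) if given != q["answer"]]
```

The missed-question list is what a remediation workflow would iterate over, turning a one-off quiz into a review loop.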

Operational workspace and platform

The product acts as an operating surface for longer work, not just a reasoning interface for isolated prompts.

Missions, research runs, and workspace snapshots

Dashboard metrics and guided next steps

Connector configuration and managed imports

Versioned /api/v1 routes and scoped API keys
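Scoped API keys on versioned routes can be sketched as a simple authorization check; the key value, scope names, and path rule below are hypothetical, not MERCURI's actual key format.

```python
# Illustrative sketch of scoped API keys guarding versioned /api/v1
# routes. Key values and scope names are hypothetical.
API_KEYS = {
    "mk_live_example": {"scopes": {"chat:read", "documents:read"}},
}

def authorize(path, api_key, required_scope):
    """Allow a request only on /api/v1 paths with a key granting the scope."""
    if not path.startswith("/api/v1/"):
        return False
    key = API_KEYS.get(api_key)
    return key is not None and required_scope in key["scopes"]
```

Scoping keys per capability means a connector or script gets only the slice of the API it needs.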

Governance and identity

Security and ownership are part of the product, not bolted on.

Sign up, sign in, sign out, and password reset flows

Session-based auth using HTTP-only cookies

Role-based workspace access, invitations, and TOTP MFA

Enterprise SSO entry support when configured

Secure document ownership and sensitivity enforcement

Output redaction enforcement and security event audit logging

User controls for recent chat deletion and cache clearing

Self-delete and admin delete-account flows
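Session-based auth with HTTP-only cookies can be sketched as a server-side session store plus a Set-Cookie header that scripts cannot read. The cookie attributes shown are common hardening defaults assumed for illustration, not MERCURI's exact settings.

```python
# Sketch of session auth with an HTTP-only cookie: the session lives
# server-side; the cookie carries only an opaque id scripts can't read.
# Cookie attributes are assumed hardening defaults, not the product's.
import secrets

SESSIONS = {}

def create_session(user_id):
    sid = secrets.token_urlsafe(32)          # opaque, unguessable session id
    SESSIONS[sid] = {"user_id": user_id}
    return sid

def session_cookie_header(sid):
    # HttpOnly blocks JavaScript access; Secure restricts to HTTPS.
    return f"session={sid}; HttpOnly; Secure; SameSite=Lax; Path=/"

def sign_out(sid):
    SESSIONS.pop(sid, None)                  # server-side revocation
```

Because the session record lives server-side, sign-out and account deletion can revoke access immediately rather than waiting for a token to expire.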

Deployment flexibility

Built to run locally, in containers, or in hosted production.

Local development

Docker-based deployment

Fully containerized operation with Ollama

Vercel frontend plus public backend deployment

Nginx and TLS fronting for hardened backend exposure

Representative use cases

Grounded intelligence for extended work, not isolated prompts.

Internal knowledge search and grounded Q&A

Research support over uploaded reference material

Study and assessment workflows

Document review with follow-up lesson generation

Operational workspace analysis

Multimodal reasoning across text and images

Authored report and dataset generation

Connector-driven knowledge import and reuse