Knowledge Assistant — Trust-Critical Retrieval Across Fragmented Internal Knowledge

Permission-aware internal knowledge assistant focused on trust-critical retrieval, citations, access control, and calibrated uncertainty across fragmented company knowledge.

Date:

Role: AI Engineer (Freelance)

Stack: Python, hybrid retrieval, semantic chunking, reranking, access-aware retrieval, Slack, web UI

Outcomes

  • Permission-aware internal knowledge assistant across Confluence, Google Drive, and internal communications
  • About 3x faster access to relevant internal knowledge in practice, from roughly 1 minute to 15–20 seconds
  • Citation-backed answers, confidence-gated behavior, and secure access separation for a more sensitive C-level corpus

Case study

This was not a generic chatbot project. It was a trust-critical retrieval system built to help people recover the right project context, safely and with evidence, across fragmented internal knowledge.

TL;DR

  • The core problem was context fragmentation across Confluence, Google Drive, and internal communications.
  • Retrieval quality, access correctness, and calibrated uncertainty mattered more than conversational smoothness.
  • I treated permissions, citations, and evaluation as core product behavior, not infrastructure around it.

Workflow and user problem

The system was built for a mid-sized international software-development company, primarily for managers, project leads, and C-level stakeholders who had to reconnect to project context across many parallel streams of work.

The real job to be done was simple: help a user recover the right current project context without manually searching scattered documents, asking colleagues, or rebuilding context in meetings.

Trust requirements

A fluent answer was not useful if it was grounded in the wrong project, the wrong client, the wrong timeframe, or the wrong permission boundary.

That made this a trust-critical retrieval problem rather than a generic chat problem. The design priority was trust over conversational smoothness.

Permissions and retrieval model

Permission-aware retrieval was foundational architecture, not an afterthought. Corpora were filtered before retrieval and context assembly, not after generation.
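The pre-retrieval filtering can be sketched as follows. This is an illustrative sketch, not the production implementation: the `Doc` shape, corpus names, and group model here are hypothetical stand-ins for the real auth layer, which stayed internal.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Doc:
    doc_id: str
    corpus: str               # e.g. "confluence", "drive", "clevel" (illustrative)
    allowed_groups: frozenset  # groups permitted to see this document

def permitted_corpus(docs, user_groups):
    """Filter the corpus BEFORE retrieval and context assembly.

    A document the user cannot see never enters the candidate pool,
    so it can never be retrieved, ranked, cited, or leaked into a
    generated answer -- filtering after generation would be too late.
    """
    return [d for d in docs if d.allowed_groups & user_groups]
```

The design point is ordering: the filter runs ahead of every retrieval stage, so a permission bug degrades to "document not found" rather than "document disclosed".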

The system used hybrid retrieval: keyword search for precision, embeddings for recall, and a reranking stage to improve the quality of the top context. Semantic chunking by paragraphs and meaning blocks made retrieval more useful than flat chunking.
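One common way to combine a keyword ranking with an embedding ranking is reciprocal rank fusion; the sketch below shows that fusion step in isolation (the actual fusion and reranking logic used in the project stayed internal, so treat this as an assumption about how such a merge can work, not the shipped code).

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion over several ranked lists of doc ids.

    Each list contributes 1 / (k + rank) per document, so documents
    ranked highly by either keyword or embedding search float to the
    top of the fused list without needing comparable raw scores.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Usage: fuse a keyword ranking with an embedding ranking,
# then hand the top of the fused list to a reranker.
fused = rrf_fuse([["a", "b"], ["b", "c"]])
```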

Citations and calibrated uncertainty

Every useful answer needed citations or links. I preferred narrower, grounded answers over broad speculative ones.

The answer policy was confidence-tiered:

  • below 40%: explicit “I don’t know”
  • 40–60%: uncertain answer plus top links
  • above 60%: answer with citations

That policy mattered because the right fallback was often links, suggestions, or explicit uncertainty rather than a polished but unsafe answer.
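The tiered policy is simple enough to express directly. A minimal sketch, assuming a scalar confidence estimate in [0, 1] and hypothetical field names; the real gating logic and thresholds lived in the production system:

```python
def answer_policy(confidence, answer, citations, links):
    """Map a confidence estimate (0.0-1.0) to the tiered behavior:

    below 0.40 -> explicit "I don't know" (plus links, if any)
    0.40-0.60  -> uncertain answer plus top links
    above 0.60 -> answer with citations
    """
    if confidence < 0.40:
        return {"mode": "abstain", "text": "I don't know.", "links": links[:3]}
    if confidence <= 0.60:
        return {"mode": "uncertain", "text": answer, "links": links[:3]}
    return {"mode": "confident", "text": answer, "citations": citations}
```

The value of encoding the policy as one function is that the fallback behavior (links, hedged answers, explicit abstention) is testable on its own, independent of retrieval or generation.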

Evaluation approach

I evaluated retrieval quality separately from final-answer quality. The eval lens included relevance, usefulness, citation correctness, access correctness, and behavior under weak evidence.
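Evaluating retrieval separately means scoring the retrieved lists against labeled relevant documents before any answer is generated. A minimal sketch of one such retrieval-only metric, recall@k (illustrative of the approach, not the exact internal eval harness):

```python
def recall_at_k(retrieved_lists, relevant_lists, k=5):
    """Fraction of queries for which at least one labeled-relevant
    document appears in the top-k retrieved results.

    This scores retrieval alone -- final-answer quality is judged
    separately, so a retrieval regression is visible even when the
    model writes a fluent answer over the wrong evidence.
    """
    hits = sum(
        1
        for retrieved, relevant in zip(retrieved_lists, relevant_lists)
        if set(retrieved[:k]) & set(relevant)
    )
    return hits / len(retrieved_lists)
```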

Human review was part of the system: weekly manager review and relevance scoring, plus human-in-the-loop feedback from managers and documentation/specification owners.

Failure modes

The hardest failures were not random hallucinations. They were plausible answers grounded in the wrong project phase, stale plans, the wrong client context, or inaccessible material.

Another core risk was confidence mismatch: answers that sounded clean when the evidence was weak. Retrieval mistakes across corpus boundaries were treated as architectural failures, not just model failures.

What shipped

  • A permission-aware internal knowledge assistant across Confluence, Google Drive, and internal communications
  • Slack interface plus a simple web UI
  • Hybrid retrieval, semantic chunking, reranking, and confidence-gated answers
  • Citation-backed answer policy and explicit low-confidence degradation
  • Secure access separation for a more sensitive C-level corpus

Public-safe impact

  • Company scale: ~150 people, 10+ teams, roughly 20–30 project contexts
  • Effective answer retrieval time dropped from roughly 1 minute to 15–20 seconds
  • About 3x faster access to relevant internal knowledge in practice
  • Clarification-heavy meetings were roughly halved, reducing meeting drag and coordination overhead

What stayed internal

  • The company name and deeper domain details
  • Detailed corpus design, auth implementation specifics, and internal observability
  • Private screenshots, dashboards, and deeper retrieval/debug tooling