RAG API for Internal Knowledge Base | Ragex

A Ragex knowledge base centralizes scattered company documents into a single searchable layer. Upload SOPs, HR handbooks, and engineering wikis, then let employees ask natural-language questions and get answers grounded in your actual policies rather than hallucinated responses from an ungrounded LLM.

The Scattered Knowledge Problem

Enterprise knowledge lives in too many places. HR policies sit in a Google Drive folder. Engineering runbooks live in Confluence. Onboarding checklists are pinned in Slack channels. SOPs exist as PDFs on SharePoint. Every department maintains its own wiki, its own folder structure, its own naming conventions.

The cost is measurable. Employees spend 20-30% of their working hours searching for information they know exists somewhere. A new hire asks "Where's the PTO policy?" and gets three different links — two of them outdated. An engineer troubleshooting a production incident cannot find the runbook because it is titled "Q3 Incident Response v2 FINAL (revised).docx" in a folder nobody has bookmarked.

Internal search tools make this worse. They return hundreds of results ranked by recency or keyword frequency, not by relevance to the actual question. Searching "expense reimbursement process" returns every document that mentions "expense": meeting notes, budget spreadsheets, and, buried on page three of the results, the actual policy. This is the same document search problem that plagues customer-facing applications, only compounded by the volume and format diversity of internal docs.

How It Works

RAG (Retrieval-Augmented Generation) adds a retrieval layer between the employee's question and the LLM's response. Instead of the LLM guessing from training data, it searches your actual internal documents and generates an answer grounded in what they say.

Ragex handles the entire retrieval pipeline — parsing documents across 16 file types, splitting them into searchable chunks, generating embeddings, indexing, and serving search results with reranking. You don't choose embedding models, configure vector dimensions, or manage infrastructure. When better models become available, your retrieval quality improves without a code change.

The setup takes three API calls and under 5 minutes to your first query:

curl -X POST https://api.useragex.com/api/v1/knowledge-bases \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "Internal Knowledge Base"}'

Upload your SOPs, HR policies, wikis, and spreadsheets:

curl -X POST https://api.useragex.com/api/v1/knowledge-bases/:kb_id/documents \
  -H "Authorization: Bearer $API_KEY" \
  -F "file=@hr-handbook-2026.pdf" \
  -F "metadata={\"department\": \"hr\", \"doc_type\": \"policy\"}"

Search with an employee's question:

curl -X POST https://api.useragex.com/api/v1/knowledge-bases/:kb_id/search \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How many PTO days after two years?",
    "top_k": 5,
    "rerank": true,
    "filter": {"department": "hr"}
  }'

Processing is asynchronous. Text files index in about 4 seconds. A 10-page PDF completes in under 60 seconds. You can start searching as soon as the first documents are ready — no need to wait for the entire knowledge base to finish indexing. Feed the top search results as context to your LLM (GPT-5-mini, Claude, or any model you prefer) to generate a grounded, citable answer.
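Because processing is asynchronous, client code usually waits for a document to become searchable before querying it. The following is a minimal sketch of such a polling loop, not the Ragex SDK itself: the status strings "processing", "ready", and "failed" and the `get_status` callable are assumptions made for illustration, so check the API reference for the real status field before adapting it.

```python
import time
from typing import Callable


def wait_until_indexed(
    get_status: Callable[[], str],
    timeout_s: float = 120.0,
    poll_interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_status() until it returns "ready".

    Returns True once the document is ready and False if timeout_s elapses
    first. Raises RuntimeError if the (assumed) status "failed" is reported.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status == "ready":
            return True
        if status == "failed":
            raise RuntimeError("document processing failed")
        if waited >= timeout_s:
            return False
        sleep(poll_interval_s)
        waited += poll_interval_s


# Demo with a fake status source so the sketch runs without network access.
_fake_statuses = iter(["processing", "processing", "ready"])
ready = wait_until_indexed(lambda: next(_fake_statuses), sleep=lambda s: None)
print(ready)
```

In a real integration, `get_status` would wrap whatever call returns the document's processing state. Injecting `sleep` keeps the loop easy to test.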

What You Can Build

An internal knowledge base powered by a RAG API opens up several high-value applications that go beyond simple search.

Company-wide Q&A assistant. A Slack bot connected to your knowledge base lets employees ask "What's the parental leave policy for employees in California?" and get the exact answer from your current benefits handbook — not a cached version from 2022. The same pattern works for customer support chatbots facing external users, but internal assistants benefit from metadata filtering that scopes answers by department.

Onboarding copilot for new hires. New employees spend their first weeks asking colleagues where to find onboarding checklists, equipment request forms, and engineering setup guides. A chatbot grounded in your docs answers those questions instantly, pulling from HR handbooks, IT setup guides, and team wikis. Every answer links back to the source document so new hires can bookmark it for later.

Engineering runbook search. When an incident happens at 2 AM, the on-call engineer needs the right runbook in seconds. Keyword search for "database failover" might miss the document titled "RDS Recovery Procedures." Semantic search finds it because the meaning matches, even when the words do not. The same approach applies to any document search scenario where vocabulary mismatch is a problem.
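The vocabulary mismatch in the runbook example can be made concrete with a toy comparison. The sketch below is illustrative only: the three-dimensional vectors are invented by hand to stand in for what an embedding model would produce, and they are not output from Ragex or any real model.

```python
import math


def keyword_overlap(query: str, title: str) -> int:
    """Count lowercase tokens shared by the query and the title."""
    return len(set(query.lower().split()) & set(title.lower().split()))


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


query = "database failover"
titles = ["RDS Recovery Procedures", "Office Seating Chart"]

# Hand-made toy vectors (dimensions: database, recovery, office).
# A real embedding model would produce these; the numbers are invented.
toy_vectors = {
    "database failover": [0.9, 0.8, 0.0],
    "RDS Recovery Procedures": [0.8, 0.9, 0.1],
    "Office Seating Chart": [0.0, 0.1, 0.9],
}

for title in titles:
    shared = keyword_overlap(query, title)
    similarity = cosine(toy_vectors[query], toy_vectors[title])
    print(f"{title}: shared keywords={shared}, similarity={similarity:.2f}")
```

Both titles share zero keywords with the query, so a keyword ranker cannot tell them apart. The vector comparison ranks the recovery runbook far above the unrelated document.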

Your Documents Work Out of the Box

Enterprise internal docs are rarely clean text files. The employee handbook is a 47-page PDF with nested tables. The engineering runbook is a Confluence export with embedded screenshots. The expense policy is an XLSX with conditional formatting. A RAG API for an internal knowledge base needs to handle all of these without manual preprocessing.

The API supports 16 file types to cover what companies actually store:

Parsed formats (PDF, DOCX, PPTX, XLSX, images): These are processed with layout-aware extraction. Scanned PDFs go through OCR. Multi-column layouts are reassembled. Tables — org charts, approval matrices, pricing schedules — are preserved as single chunks so they are never split across search results.

Direct formats (TXT, MD, HTML, CSV, JSON, TSV, XML): Ingested directly with structure preserved. Wiki exports, config files, and structured data are searchable without conversion.
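The idea of keeping a table in a single chunk, mentioned for parsed formats above, can be pictured with a small sketch. This is not Ragex's chunker. It assumes blocks are separated by blank lines and treats a block as a table when every line contains a tab, and both rules are simplifications chosen for illustration.

```python
def chunk_blocks(text: str, max_chars: int = 200) -> list[str]:
    """Split text into chunks of roughly max_chars without splitting a table.

    Blocks are separated by blank lines. A block counts as a table when
    every line contains a tab (an assumption made for this sketch).
    """
    chunks: list[str] = []
    for block in text.split("\n\n"):
        block = block.strip("\n")
        if not block:
            continue
        lines = block.split("\n")
        is_table = all("\t" in line for line in lines)
        if is_table or len(block) <= max_chars:
            chunks.append(block)  # tables stay whole even when oversized
            continue
        # Oversized prose: pack whole lines greedily up to max_chars.
        current = ""
        for line in lines:
            candidate = line if not current else current + "\n" + line
            if current and len(candidate) > max_chars:
                chunks.append(current)
                current = line
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks


# Made-up sample document: a heading, a small table, and a prose block.
table = "Role\tLimit\nManager\t$1,000\nDirector\t$10,000"
prose = "\n".join(["Sample prose line for the demo."] * 4)
doc = "Approval limits by role.\n\n" + table + "\n\n" + prose

chunks = chunk_blocks(doc, max_chars=40)
for chunk in chunks:
    print(repr(chunk))
```

The table is longer than `max_chars` but still comes back as one chunk, while the prose block is split line by line.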

Metadata filtering lets you scope searches by department, document type, or access level. Tag documents during upload, then filter at query time — an HR question only searches HR docs, an engineering question only searches engineering runbooks. This same API also powers healthcare knowledge bases where access control is even more critical.

Works With Your Stack

Ragex is a building block, not a walled garden. It returns search results as JSON with relevance scores, source metadata, and matched text — ready to feed into any LLM or framework.

For Python developers, the API integrates directly with LangChain for building retrieval chains, or with LlamaIndex for document-centric workflows. TypeScript teams building internal tools with Next.js can use the Vercel AI SDK integration to add streaming, grounded responses to their internal portals.

You bring your documents and your LLM. The API handles everything in between: parsing, chunking, embedding, indexing, and retrieval. Three API calls (create a knowledge base, upload documents, search) take you from zero to a working retrieval layer.

Getting Started

Install the Python SDK (a TypeScript SDK is also available), create a knowledge base, upload your documents, and run your first search query. The process from installation to a working internal Q&A feature typically takes less than an afternoon. No infrastructure to provision, no models to configure.

pip install ragex

from ragex import Ragex

client = Ragex(api_key="your-api-key")

kb = client.knowledge_bases.create(name="Internal KB")  # create a knowledge base

with open("hr-handbook-2026.pdf", "rb") as f:  # upload an HR handbook; the file is closed afterwards
    doc = client.documents.create(
        knowledge_base_id=kb.id,
        file=f,
        metadata={"department": "hr", "doc_type": "policy"},
    )

results = client.search(  # search with an employee's question
    knowledge_base_id=kb.id,
    query="How many PTO days after two years?",
    top_k=5,
    rerank=True
)

for result in results:
    print(result.content, result.score)

Pass the top results as context to your LLM to generate a grounded, citable answer.
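One way to pass results to an LLM is to number the passages and ask the model to cite them. The sketch below is a minimal example under stated assumptions: `content` and `score` mirror the fields printed in the snippet above, while the `source` field, the `Hit` class, and the prompt wording are hypothetical and not part of the Ragex SDK.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Assumed shape of one search result: matched text, score, source name."""
    content: str
    score: float
    source: str


def build_grounded_prompt(question: str, hits: list[Hit], max_hits: int = 3) -> str:
    """Number the top-scoring passages and ask the model to cite them."""
    top = sorted(hits, key=lambda h: h.score, reverse=True)[:max_hits]
    context = "\n".join(
        f"[{i}] ({h.source}) {h.content}" for i, h in enumerate(top, start=1)
    )
    return (
        "Answer using only the numbered passages below. "
        "Cite passages like [1]. If the answer is not there, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Made-up sample hits; real ones would come from the search call.
hits = [
    Hit("Made-up sample passage about expense reports.", 0.42, "hr-handbook-2026.pdf"),
    Hit("Made-up sample passage about PTO accrual.", 0.91, "hr-handbook-2026.pdf"),
]
prompt = build_grounded_prompt("How many PTO days after two years?", hits, max_hits=1)
print(prompt)
```

The resulting string can be sent as the user message to whichever LLM you use. Numbered passages make it straightforward to map a citation in the answer back to its source document.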

Pricing

Plan	Monthly	Pages	Queries	Best For
Starter	$29	500	5,000	Small teams, under 50 employees
Pro	$79	2,000	15,000	Growing teams, 50-200 employees
Business	$229	6,500	50,000	Mid-size companies, 200-500 employees
Scale	$499	15,000	120,000	Large orgs, multi-department KB

Most companies start on Pro. A mid-size company typically has 2,000-8,000 internal documents and 10,000-30,000 employee queries per month. Pricing is all-inclusive — parsing, embedding, reranking, and search are bundled. No separate vector storage bills or per-embedding charges.
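To sanity-check which tier fits, you can compare expected monthly usage against the table. The helper below is a rough sketch that copies the plan limits from the table above. It assumes the limits are hard monthly caps and that usage is measured in pages, not documents, and `cheapest_plan` is a hypothetical name.

```python
from typing import Optional

# (name, monthly price in USD, pages, queries), copied from the pricing table.
PLANS = [
    ("Starter", 29, 500, 5_000),
    ("Pro", 79, 2_000, 15_000),
    ("Business", 229, 6_500, 50_000),
    ("Scale", 499, 15_000, 120_000),
]


def cheapest_plan(pages: int, queries: int) -> Optional[str]:
    """Return the cheapest plan covering both limits, or None if none does."""
    for name, _price, max_pages, max_queries in PLANS:  # sorted by price
        if pages <= max_pages and queries <= max_queries:
            return name
    return None


print(cheapest_plan(pages=1_800, queries=12_000))
print(cheapest_plan(pages=1_800, queries=30_000))
```

Whichever limit is hit first decides the tier. In the second call, the query volume rather than the page count pushes the result up a plan.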

For teams evaluating whether to build a DIY pipeline or use a managed service, the total cost of ownership comparison with self-assembled vector database stacks favors the managed approach once you factor in developer time, ongoing maintenance, and model upgrades. Browse alternative managed RAG providers to compare options.

FAQ

How long does it take to index an entire internal knowledge base?

It depends on file types and volume. Text-based files (MD, TXT, HTML) process in about 4 seconds each. PDFs take under 60 seconds for a 10-page document. A typical mid-size company with 2,000 mixed-format documents could have its entire knowledge base indexed within a few hours. Documents process asynchronously, so you can start searching as soon as the first batch is ready.
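For a back-of-the-envelope estimate, the per-file figures above can be turned into simple arithmetic. The sketch below is only that. The 1,200 / 800 split between text files and PDFs is an invented example, 60 seconds is used as an upper bound for every PDF, and `parallel_jobs` is an assumption, since the number of documents processed concurrently is not stated here.

```python
def estimate_indexing_hours(
    text_files: int,
    pdf_docs: int,
    parallel_jobs: int = 1,
    text_seconds: float = 4.0,   # "about 4 seconds" per text file
    pdf_seconds: float = 60.0,   # upper bound for a 10-page PDF
) -> float:
    """Estimate total indexing time in hours for a mixed document set."""
    if parallel_jobs < 1:
        raise ValueError("parallel_jobs must be at least 1")
    total_seconds = text_files * text_seconds + pdf_docs * pdf_seconds
    return total_seconds / parallel_jobs / 3600


# Invented mix for 2,000 documents: 1,200 text files and 800 short PDFs.
print(round(estimate_indexing_hours(1_200, 800), 1))
print(round(estimate_indexing_hours(1_200, 800, parallel_jobs=8), 1))
```

Treat the output as an order-of-magnitude guide. Longer PDFs, OCR, and the actual degree of parallelism will move the number in either direction.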

Can employees search across departments they don't have access to?

Only if your application allows it. The API supports metadata filtering on every search request. Tag documents with department, access level, or team during upload, then pass a filter parameter to restrict results. For example, filter by department = "engineering" to ensure a developer only sees engineering runbooks. Your application layer decides which filter each user gets; the API applies that filter at query time with no performance penalty.
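Since access decisions live in your application, a thin layer typically checks what a user may see before it builds the search request. The sketch below assumes a hypothetical role-to-department mapping. The payload fields (`query`, `top_k`, `rerank`, `filter`) follow the curl search example earlier on this page, but `build_search_request` and `ALLOWED_DEPARTMENTS` are illustrative names, not part of the API.

```python
# Hypothetical mapping from a user's role to the departments they may search.
ALLOWED_DEPARTMENTS = {
    "engineer": {"engineering"},
    "hr_partner": {"hr"},
    "manager": {"hr", "engineering"},
}


def build_search_request(role: str, department: str, query: str, top_k: int = 5) -> dict:
    """Build a search payload, refusing departments the role may not access."""
    allowed = ALLOWED_DEPARTMENTS.get(role, set())
    if department not in allowed:
        raise PermissionError(f"role {role!r} may not search {department!r} documents")
    return {
        "query": query,
        "top_k": top_k,
        "rerank": True,
        "filter": {"department": department},
    }


payload = build_search_request("engineer", "engineering", "How do I fail over the database?")
print(payload["filter"])
```

The check runs before any request leaves your server, so an employee cannot widen the filter by editing a client-side request.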

How does the API handle outdated documents when policies change?

Replace any document via the API without downtime. The system re-processes the new version asynchronously while search continues against the current version. When processing completes, the old version is swapped out automatically. This means employees always get answers based on the latest HR policy or SOP — not a stale version that was superseded three months ago.
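The replace-without-downtime behavior can be pictured as a two-slot swap: searches read the live version while the new one is prepared, and the switch happens in a single step. The class below is a toy in-memory model of that idea, with made-up sample text. It is not how Ragex is implemented, and `VersionedDocument` is a hypothetical name.

```python
from typing import Optional


class VersionedDocument:
    """Toy model: search reads the live text until a pending version is ready."""

    def __init__(self, text: str) -> None:
        self._live = text
        self._pending: Optional[str] = None

    def start_replace(self, new_text: str) -> None:
        """Begin processing a new version; the live version is untouched."""
        self._pending = new_text

    def finish_processing(self) -> None:
        """Swap the processed version in as one assignment."""
        if self._pending is not None:
            self._live, self._pending = self._pending, None

    def search(self, term: str) -> bool:
        """Return True if the live version contains the term."""
        return term.lower() in self._live.lower()


doc = VersionedDocument("Sample policy, version A")       # made-up text
doc.start_replace("Sample policy, version B")
during = doc.search("version A")   # still served while B is processing
doc.finish_processing()
after = doc.search("version B")
print(during, after, doc.search("version A"))
```

Readers never see a half-processed document in this model: every query is answered entirely from the old version or entirely from the new one.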

What file formats work for enterprise internal docs?

The API supports 16 file types covering the formats enterprises actually use. Parsed formats include PDF, DOCX, PPTX, XLSX, and images — these are processed with layout-aware extraction that preserves tables and handles scanned documents via OCR. Direct formats include TXT, MD, HTML, CSV, JSON, TSV, and XML. Between these, virtually every internal document format is covered without manual conversion.

Last updated: February 20, 2026