Simple API for document retrieval and search

Ragex gives you document retrieval in three endpoints — create a knowledge base, upload files, and search with natural language. No vector database, no embedding pipeline, no infrastructure to manage.

TL;DR: Ragex provides document retrieval through three REST endpoints: create a knowledge base, upload documents, and search. The API handles parsing, chunking, embedding, indexing, and reranking for 16 file types. SDKs are available for Python and TypeScript. Plans start at $29/mo.

What makes a document retrieval API "simple"?

Simplicity means fewer moving parts. A traditional document retrieval stack requires a document parser, a chunking library, an embedding model, a vector database, and a reranker — five components with five APIs. Ragex replaces all of them with a single API key and three endpoints.

The API surface looks like this:

Endpoint | What It Does
POST /v1/knowledge-bases | Create a named collection for your documents
POST /v1/knowledge-bases/:id/documents | Upload a file or raw text (async processing begins)
POST /v1/knowledge-bases/:id/search | Semantic search with reranking; returns ranked chunks

No configuration files, no model selection, no index tuning. Upload a PDF and search it with a natural language question.

How do you get started?

import { RagexClient } from 'ragex';

const client = new RagexClient({ apiKey: 'YOUR_API_KEY' });

// Create a knowledge base
const kb = await client.createKnowledgeBase({ name: 'Product Docs' });

// Upload a document
const doc = await client.uploadDocument(kb.id, pdfFile);

// Processing is asynchronous: search only after the document has finished processing
const results = await client.search(kb.id, {
  query: 'How do I configure webhooks?',
  top_k: 5,
});

console.log(results.results[0].text);

Install the SDK (npm install ragex or pip install ragex), create a client with your API key, and you are ready to upload and search. The SDK handles authentication, request formatting, and response parsing.
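Because upload processing is asynchronous, a search issued immediately after an upload may not see the new document yet. The following is a minimal polling sketch, not the SDK's own mechanism: the `getStatus` callback stands in for whatever status lookup Ragex actually provides, and the status values `processing`, `ready`, and `failed` are assumptions made for illustration.

```typescript
// Hypothetical status values; check the Ragex docs for the real ones.
type DocStatus = 'processing' | 'ready' | 'failed';

interface WaitOptions {
  intervalMs?: number;  // delay between status checks
  maxAttempts?: number; // give up after this many checks
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls getStatus repeatedly until the document is ready, has failed,
// or the attempt budget is exhausted.
async function waitUntilReady(
  getStatus: () => Promise<DocStatus>,
  options: WaitOptions = {},
): Promise<void> {
  const intervalMs = options.intervalMs ?? 1000;
  const maxAttempts = options.maxAttempts ?? 60;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === 'ready') return;
    if (status === 'failed') throw new Error('Document processing failed');
    await sleep(intervalMs);
  }
  throw new Error(`Document not ready after ${maxAttempts} attempts`);
}
```

You would pass a function that looks up the uploaded document's status and await `waitUntilReady` before calling `client.search`.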

What does the search response include?

Each search result contains:

  • text — the matched chunk content
  • score — relevance score from reranking (0 to 1)
  • document_id and document_name — which source file the chunk came from
  • metadata — chunk-level info like page number, section heading, and chunk index
  • document_metadata — any custom metadata you attached at upload time

This gives you everything needed to show results to users or pass them as context to an LLM. You do not need a separate call to get source information.
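As an illustration of passing results to an LLM, here is a small sketch that turns a result list into a numbered, source-labeled context string. The field names follow the list above; the concrete TypeScript types, the `buildContext` helper, and the default `minScore` cutoff of 0.5 are assumptions for this example, not part of the Ragex SDK.

```typescript
// Field names come from the response description; the types are assumed.
interface SearchResult {
  text: string;
  score: number; // relevance score from reranking, 0 to 1
  document_id: string;
  document_name: string;
  metadata: Record<string, unknown>;
  document_metadata: Record<string, unknown>;
}

// Keeps results scoring at or above minScore and formats each one as
// "[n] <document name>" followed by the chunk text.
function buildContext(results: SearchResult[], minScore = 0.5): string {
  return results
    .filter((r) => r.score >= minScore)
    .map((r, i) => `[${i + 1}] ${r.document_name}\n${r.text}`)
    .join('\n\n');
}
```

The numbered labels make it easy to ask the model to cite which source each part of its answer came from.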

How does filtering work?

Attach metadata to documents at upload time and filter on it during search. This is useful for scoping queries to specific categories, versions, or access levels:

// Upload with metadata
await client.uploadDocument(kb.id, file, {
  metadata: { category: 'billing', version: 3 }
});

// Search with a filter
const results = await client.search(kb.id, {
  query: 'refund policy',
  filter: { category: { $eq: 'billing' } },
});

The API supports the $eq, $ne, $gt, $gte, $lt, $lte, $in, and $nin operators. Filters are applied before vector search, so the search runs only over matching chunks and filtering does not slow down queries.
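To make the operator semantics concrete, the sketch below evaluates a filter against a metadata object locally. It assumes the operators behave like their MongoDB-style namesakes, that multiple conditions are combined with AND, and that a missing field passes $ne and $nin; the actual server-side behavior may differ. `matchesFilter` and the type names are hypothetical helpers, not SDK exports.

```typescript
type Scalar = string | number | boolean;

// One condition object per field; every operator present must hold.
type Condition = Partial<{
  $eq: Scalar;
  $ne: Scalar;
  $gt: number;
  $gte: number;
  $lt: number;
  $lte: number;
  $in: Scalar[];
  $nin: Scalar[];
}>;

type Filter = Record<string, Condition>;

// Returns true when the metadata satisfies every condition on every field.
function matchesFilter(
  metadata: Record<string, Scalar | undefined>,
  filter: Filter,
): boolean {
  return Object.entries(filter).every(([field, cond]) => {
    const value = metadata[field];
    // Range operators only make sense for numbers; NaN fails every comparison.
    const num = typeof value === 'number' ? value : NaN;

    if (cond.$eq !== undefined && value !== cond.$eq) return false;
    if (cond.$ne !== undefined && value === cond.$ne) return false;
    if (cond.$gt !== undefined && !(num > cond.$gt)) return false;
    if (cond.$gte !== undefined && !(num >= cond.$gte)) return false;
    if (cond.$lt !== undefined && !(num < cond.$lt)) return false;
    if (cond.$lte !== undefined && !(num <= cond.$lte)) return false;
    if (cond.$in !== undefined && !cond.$in.some((v) => v === value)) return false;
    if (cond.$nin !== undefined && cond.$nin.some((v) => v === value)) return false;
    return true;
  });
}
```

A helper like this can be handy in unit tests, for checking that the filters you build select the documents you expect before sending them to the API.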

FAQ

Is this a REST API or does it require a specific SDK?

It is a standard REST API with JSON request and response bodies. SDKs for Python and TypeScript are provided for convenience, but you can call the endpoints directly with curl, fetch, or any HTTP client in any language. Authentication is a Bearer token in the Authorization header.
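For readers who prefer to skip the SDK, here is a hedged sketch of calling the search endpoint with `fetch`. The endpoint path and the Bearer header come from the description above. The base URL is a placeholder, and the assumption that the JSON body uses the same `query` and `top_k` fields as the SDK options has not been verified against the REST reference.

```typescript
// Placeholder base URL; substitute the real one from the Ragex docs.
const BASE_URL = 'https://api.example.com';

interface HttpRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the search request. Body field names mirror the SDK options
// shown earlier, which is an assumption about the REST payload.
function buildSearchRequest(
  apiKey: string,
  kbId: string,
  query: string,
  topK = 5,
): HttpRequest {
  return {
    url: `${BASE_URL}/v1/knowledge-bases/${encodeURIComponent(kbId)}/search`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ query, top_k: topK }),
    },
  };
}

// Sends the request and returns the parsed JSON response body.
async function searchOverHttp(
  apiKey: string,
  kbId: string,
  query: string,
): Promise<unknown> {
  const { url, init } = buildSearchRequest(apiKey, kbId, query);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Search failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping request construction separate from the network call makes the request shape easy to inspect or unit test without hitting the API.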

How does this compare to Elasticsearch or Algolia?

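Elasticsearch and Algolia are primarily keyword search engines: by default they match terms, although both now offer vector or semantic search features that you can configure. Ragex does semantic search out of the box, matching on meaning rather than exact words. A query like "how to fix login issues" matches chunks about "authentication errors" and "password reset" even though those words do not appear in the query. The API also handles document parsing, chunking, and reranking, which you would otherwise have to set up yourself around a search engine.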

What is the latency for search queries?

Search returns results in milliseconds, even with reranking enabled. Reranking adds a cross-encoder re-scoring step that improves relevance. The latency overhead is minimal and the quality improvement is significant for most use cases.


Last updated: 2026-03-09