Knowledge Bases

Knowledge Bases provide Retrieval-Augmented Generation (RAG) capabilities to Prospector Studio agents. Upload documents, and the platform automatically processes them for semantic search.

Creating a Knowledge Base

1. Navigate to the Knowledge Bases section in Studio.
2. Create a new knowledge base with a name and description.
3. Upload documents to the knowledge base.
4. Wait for processing to complete. Documents are automatically chunked, embedded, and indexed.

Supported Formats

Prospector Studio accepts the following document formats:

Format	Extensions
PDF	`.pdf`
Word	`.docx`
PowerPoint	`.pptx`
Excel	`.xlsx`
OpenDocument	`.odt`, `.odp`, `.ods`
CSV	`.csv`
JSON	`.json`
XML	`.xml`
HTML	`.html`
Plain text	`.txt`, `.md`
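
As an illustration, a client-side check before upload could look like the sketch below. The extension set is copied from the table above; the `is_supported` helper is hypothetical, and the case-insensitive matching is an assumption rather than documented platform behavior.

```python
from pathlib import Path

# Extensions copied from the supported-formats table.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx", ".xlsx",
    ".odt", ".odp", ".ods",
    ".csv", ".json", ".xml", ".html",
    ".txt", ".md",
}


def is_supported(filename: str) -> bool:
    """Return True if the file extension appears in the supported-formats table.

    Hypothetical helper: matching is case-insensitive by assumption.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

Checking the extension locally only avoids an obviously unsupported upload; the platform's own validation remains the source of truth.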

How It Works

When you upload a document, Prospector Studio:

1. Parses the document using format-specific extractors
2. Chunks the content into semantically meaningful segments
3. Generates embeddings via LiteLLM for each chunk
4. Indexes the vectors in PostgreSQL using pgvector
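
The chunk, embed, and index steps above can be sketched in a few lines. This is a simplified stand-in, not the platform's implementation: the word-window chunker replaces semantic chunking, `toy_embed` replaces the LiteLLM embedding call, and an in-memory list replaces the pgvector table. All function names and parameters here are hypothetical.

```python
import hashlib
import math


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (a stand-in for semantic chunking)."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


def toy_embed(chunk: str, dims: int = 8) -> list[float]:
    """Hash words into a small unit-length vector.

    Placeholder for a real embedding model; it carries no semantic meaning.
    """
    vec = [0.0] * dims
    for word in chunk.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def ingest(text: str) -> list[dict]:
    """Chunk and embed a document, returning rows that mimic an index table."""
    return [
        {"chunk_index": i, "content": chunk, "embedding": toy_embed(chunk)}
        for i, chunk in enumerate(chunk_text(text))
    ]
```

The overlap between consecutive windows is a common way to avoid splitting a thought across a chunk boundary; whether Prospector Studio uses overlap, and what chunk size it uses, is not stated here.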

When an agent with an attached knowledge base receives a query, it:

1. Generates an embedding for the user's question
2. Performs a vector similarity search against the knowledge base
3. Includes the most relevant chunks as context in the LLM prompt
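
The retrieval steps above can be illustrated with a minimal in-memory sketch. It assumes cosine similarity as the ranking metric and a simple "context then question" prompt layout; neither is confirmed by this page, and the function names are hypothetical. In pgvector, a comparable ranking is typically expressed in SQL with a distance operator such as `<=>` (cosine distance).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], index: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k chunk texts whose embeddings are most similar to the query."""
    ranked = sorted(
        index,
        key=lambda row: cosine_similarity(query_vec, row[1]),
        reverse=True,
    )
    return [content for content, _ in ranked[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Place retrieved chunks ahead of the question (assumed prompt layout)."""
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

Here `index` is a list of `(chunk_text, embedding)` pairs, and `query_vec` must come from the same embedding model that produced the stored vectors; otherwise the similarity scores are meaningless.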

Attaching to Agents

Knowledge bases are connected to agents through the agent configuration. A single agent can reference multiple knowledge bases, and a single knowledge base can be shared across multiple agents.
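
The many-to-many relationship can be pictured with the sketch below. The `AgentConfig` class and its `knowledge_base_ids` field are hypothetical names chosen for illustration; the actual agent configuration schema is not shown on this page.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    """Hypothetical agent configuration that references knowledge bases by ID."""
    name: str
    knowledge_base_ids: list[str] = field(default_factory=list)


def agents_using(kb_id: str, agents: list[AgentConfig]) -> list[str]:
    """List the names of agents that reference the given knowledge base."""
    return [agent.name for agent in agents if kb_id in agent.knowledge_base_ids]


# One agent with two knowledge bases, and one knowledge base shared by two agents.
support = AgentConfig("support", ["kb-faq", "kb-policies"])
sales = AgentConfig("sales", ["kb-faq"])
```

With this data, `agents_using("kb-faq", [support, sales])` lists both agents, while `"kb-policies"` is referenced only by `support`.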

Document Processing Status

Documents go through a processing pipeline after upload. You can monitor the status of each document in real time through the Studio UI. States include:

Pending — Queued for processing
Processing — Being parsed, chunked, and embedded
Ready — Fully indexed and available for search
Failed — An error occurred during processing
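
The four states above can be modeled as a small state machine. The sketch below is an assumption about how the states relate: the string values and the allowed transitions are inferred from the descriptions, and this page does not say whether a failed document can be retried.

```python
from enum import Enum


class DocumentStatus(Enum):
    """Processing states listed above; the string values are assumed."""
    PENDING = "pending"
    PROCESSING = "processing"
    READY = "ready"
    FAILED = "failed"


# Assumed transitions: Pending -> Processing -> Ready or Failed.
ALLOWED_TRANSITIONS = {
    DocumentStatus.PENDING: {DocumentStatus.PROCESSING},
    DocumentStatus.PROCESSING: {DocumentStatus.READY, DocumentStatus.FAILED},
    DocumentStatus.READY: set(),
    DocumentStatus.FAILED: set(),
}


def can_transition(current: DocumentStatus, target: DocumentStatus) -> bool:
    """Return True if moving from current to target is allowed in this sketch."""
    return target in ALLOWED_TRANSITIONS[current]


def is_searchable(status: DocumentStatus) -> bool:
    """Only fully indexed documents are available for search."""
    return status is DocumentStatus.READY
```

Treating Ready and Failed as terminal keeps the model simple; a retry path from Failed back to Pending would be a one-line change if the platform supports it.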