Chunking
Also known as: Text chunking, Document segmentation
Definition
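In retrieval-augmented generation (RAG) systems, chunking refers to splitting a document into smaller units, typically between 200 and 1,000 tokens per chunk. Strategies range from fixed-size splitting to sentence- or paragraph-based splitting to semantic segmentation. Good chunk boundaries respect the underlying content structure (sections, headings) so that retrieved passages stand on their own. Chunk size and overlap affect retrieval recall, precision, and cost: smaller chunks tend to match a query more precisely but can lose surrounding context, larger chunks keep more context but consume more tokens per retrieved passage, and overlap lowers the risk of cutting a relevant passage at a boundary at the price of storing and embedding duplicated text.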
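As an illustration of the simplest strategy named above, the following is a minimal sketch of fixed-size chunking with overlap. It makes two assumptions that are not part of the definition: whitespace-separated words stand in for model tokens (a real pipeline would count tokens with the tokenizer of its embedding model), and the function name chunk_fixed and its parameters are hypothetical.

```python
def chunk_fixed(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into windows of chunk_size tokens that share `overlap` tokens.

    Assumption: whitespace-separated words approximate tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    tokens = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk consists only of overlapped tokens.
        if start + chunk_size >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for chunk in chunk_fixed(sample, chunk_size=4, overlap=1):
        print(chunk)
```

With chunk_size=4 and overlap=1, each chunk starts with the last token of the previous one. Fixed-size windows ignore sentence and section boundaries, which is why structure-aware strategies are usually preferred when the document layout is available.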
How Swiss Knowledge Hub uses this term
Swiss Knowledge Hub automatically chunks uploaded documents along page and paragraph boundaries and stores the page number and source reference for each chunk. This allows every answer to be backed by a direct jump into the originating PDF.
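The idea of page- and paragraph-based chunks that carry their origin can be sketched as follows. This is not Swiss Knowledge Hub's actual implementation: it assumes the PDF text has already been extracted into one string per page, that paragraphs are separated by blank lines, and that the names Chunk and chunk_pages are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str    # paragraph text
    page: int    # 1-based page number the paragraph came from
    source: str  # reference to the originating document, e.g. a file name


def chunk_pages(pages: list[str], source: str) -> list[Chunk]:
    """Split each page into paragraphs and keep page number and source per chunk.

    Assumption: `pages` holds already-extracted page text, in page order,
    and blank lines mark paragraph boundaries.
    """
    chunks: list[Chunk] = []
    for page_number, page_text in enumerate(pages, start=1):
        for paragraph in re.split(r"\n\s*\n", page_text):
            paragraph = paragraph.strip()
            if paragraph:  # skip empty pages and stray blank blocks
                chunks.append(Chunk(paragraph, page_number, source))
    return chunks


if __name__ == "__main__":
    demo_pages = ["First paragraph.\n\nSecond paragraph.", "Paragraph on page two."]
    for chunk in chunk_pages(demo_pages, source="handbook.pdf"):
        print(chunk.page, chunk.source, chunk.text)
```

Because each chunk keeps its page number and source, an answer built from retrieved chunks can point back to the exact page it came from. Paragraphs that continue across a page break would be split in this sketch; handling them would need an extra merge step.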
Sources
- Pinecone — Chunking strategies for LLM applications — https://www.pinecone.io/learn/chunking-strategies/
Last updated: April 22, 2026