Chunking
ELI5 — The Vibe Check
Cutting up big documents into smaller pieces so an AI can actually understand them. LLMs have limited context windows, so you can't just shove an entire codebase into one prompt. You slice it into chunks, store them in a vector database, and retrieve the relevant pieces when needed. The art is knowing WHERE to cut.
Real Talk
Chunking is the process of splitting large documents into smaller segments for embedding and retrieval in RAG systems. Strategies include fixed-size chunks, sentence-based splitting, semantic chunking, and recursive character splitting. Chunk size, overlap, and splitting strategy significantly impact retrieval quality.
Show Me The Code
// Simple fixed-size chunking with overlap
function chunk(text, size = 500, overlap = 50) {
  const chunks = []
  // Step by (size - overlap) so consecutive chunks share `overlap` characters,
  // which keeps sentences split at a boundary partially visible in both chunks
  for (let i = 0; i < text.length; i += size - overlap) {
    chunks.push(text.slice(i, i + size))
  }
  return chunks
}
// Each chunk gets embedded → stored in vector DB
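Fixed-size chunking can cut mid-sentence, which hurts retrieval. A sentence-based splitter, one of the strategies mentioned above, packs whole sentences into each chunk instead. A minimal sketch, assuming plain prose where sentences end in `.`, `!`, or `?` (a naive rule; real splitters also handle abbreviations and quotes):

```javascript
// Sentence-based chunking: accumulate whole sentences until the size budget is hit
function chunkBySentence(text, maxSize = 500) {
  // Naive sentence split: a run of non-terminators followed by .!? and whitespace/end
  const sentences = text.match(/[^.!?]+[.!?]+(\s+|$)/g) || [text]
  const chunks = []
  let current = ''
  for (const s of sentences) {
    // Flush the current chunk before it would exceed the budget
    if (current && current.length + s.length > maxSize) {
      chunks.push(current.trim())
      current = ''
    }
    current += s
  }
  if (current.trim()) chunks.push(current.trim())
  return chunks
}
```

Trade-off: chunk sizes vary, but no chunk ever starts or ends mid-sentence, so each embedding covers complete thoughts.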
When You'll Hear This
"Try smaller chunks — the retrieval quality is bad." / "Semantic chunking beats fixed-size for code."
Related Terms
Context Window
A context window is how much text an AI can 'see' at once — its working memory.
Embedding
An embedding is turning words, sentences, or entire documents into lists of numbers (vectors) that capture their meaning.
RAG (Retrieval Augmented Generation)
RAG is how you give an AI access to your private documents without retraining it.
Vector Database
A vector database is a special database built to store and search embeddings.