Tutorial
Building a simple RAG app with open-source models
RAG (Retrieval-Augmented Generation) lets an LLM answer questions over your own documents by retrieving the most relevant passages and handing them to the model at query time. Here’s a minimal path to a working app.
Stack

Use sentence-transformers or OpenAI embeddings to embed the chunks. Store the vectors in Chroma or FAISS. Use a small local LLM (e.g. Llama or Mistral) or an API for the final answer.
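A minimal sketch of that stack, assuming sentence-transformers with the all-MiniLM-L6-v2 model and an in-memory Chroma collection (both are arbitrary example choices, not requirements):

```python
# Sketch: set up the embedding model and the vector store for the RAG app.
# Assumes `pip install sentence-transformers chromadb`; the model name is
# just a small, CPU-friendly default.
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.Client()                      # in-memory; PersistentClient("./db") persists to disk
collection = client.create_collection("docs")
```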
Steps

1. Chunk documents (500–1000 tokens, with 50–100 tokens of overlap).
2. Embed the chunks and store them in a vector DB.
3. On a query, embed the query and retrieve the top-k chunks.
4. Pass the chunks plus the query to the LLM and return the answer.
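Continuing from the setup above, here is a sketch of the four steps. The word-based chunker, the top-k default, and the prompt template are all illustrative, and the final generation call is left to whichever local or hosted LLM you picked.

```python
def chunk_text(text: str, size: int = 800, overlap: int = 80) -> list[str]:
    # Step 1: naive fixed-size chunking, counting words as a rough proxy
    # for tokens; in practice prefer splitting on sections or paragraphs.
    words = text.split()
    chunks, start = [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + size]))
        start += size - overlap
    return chunks

def index_document(doc_id: str, text: str) -> None:
    # Step 2: embed each chunk and store it under a unique id.
    chunks = chunk_text(text)
    collection.add(
        ids=[f"{doc_id}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embedder.encode(chunks).tolist(),
    )

def retrieve(query: str, k: int = 4) -> list[str]:
    # Step 3: embed the query and fetch the k most similar chunks.
    result = collection.query(
        query_embeddings=embedder.encode([query]).tolist(),
        n_results=k,
    )
    return result["documents"][0]

def build_prompt(query: str, chunks: list[str]) -> str:
    # Step 4: put the retrieved chunks and the question into one prompt,
    # then send it to the LLM of your choice for the final answer.
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

At query time this reduces to retrieve(question) followed by passing build_prompt(question, chunks) to the model.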
Tips

Keep chunks meaningful (e.g. by section). Add metadata (source, page) for citations. Tune top-k and chunk size on a small test set.
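For the citation tip, a small extension of the indexing and retrieval sketches above: attach source and page metadata when adding chunks, and read it back from the query result (the file name, page number, and query string below are made up for illustration):

```python
# Attach source metadata alongside each chunk when indexing (Step 2).
chunk = "Quarterly revenue grew 12% year over year."
collection.add(
    ids=["report-0"],
    documents=[chunk],
    embeddings=embedder.encode([chunk]).tolist(),
    metadatas=[{"source": "report.pdf", "page": 3}],
)

# At query time Chroma returns metadata aligned with the documents,
# so each retrieved chunk can be cited as (source, page).
result = collection.query(
    query_embeddings=embedder.encode(["How did revenue change?"]).tolist(),
    n_results=4,
)
for doc, meta in zip(result["documents"][0], result["metadatas"][0]):
    print(f'{meta["source"]} p.{meta["page"]}: {doc[:60]}')
```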
You can ship a first version in a weekend; iterate on chunking and model choice for quality.