# Retrieval-Augmented Generation

- Search for _relevant_ info based on _similarity_ (e.g. Jaccard similarity)

## Lexical Retrieval

- Jaccard similarity
  - Stop-word removal and _stemming_
  - Downside: common words and rarer words with more specific meaning are given the same emphasis
  - Upside: easy to implement and runs fast
- Term frequency-inverse document frequency ([[tfidf|TF*IDF]]) and BM25 take word importance into consideration, so rare, specific words count for more than common ones.
- Lexical retrieval still powers most online search
  - Algolia
  - Elasticsearch

## Neural Retrieval

- Snippetize documents (by paragraphs, or by running windows)
- Run snippets through [[embedding]] models, and store them in a [[vector-db]].
- Perform vector lookup to find relevant snippets.

## Pipeline

- Obtain dynamic and static context, along with user instructions.
- Search further for relevant context (preferably with neural retrieval, through a [[vector-db]])
- Assemble the prompt.
- Feed the prompt to the LLM.
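
A minimal sketch of the retrieval and prompt-assembly steps. To stay self-contained it swaps in Jaccard similarity for the neural retrieval the pipeline prefers, so there is no embedding model or vector database here. The stop-word list, the suffix-stripping `naive_stem`, the sample snippets, and the prompt layout are all hypothetical placeholders, not part of any library. The final step, sending the prompt to an LLM, is left out because it depends on whichever client is in use.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real systems use larger ones.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "on",
              "and", "for", "with", "how", "do", "i"}


def naive_stem(word: str) -> str:
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text: str) -> set[str]:
    """Lowercase, split into words, drop stop words, stem the rest."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {naive_stem(w) for w in words if w not in STOP_WORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B| (0.0 when both sets are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve(query: str, snippets: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k snippets ranked by Jaccard similarity to the query."""
    q = tokenize(query)
    ranked = sorted(snippets, key=lambda s: jaccard(q, tokenize(s)), reverse=True)
    return ranked[:top_k]


def assemble_prompt(instructions: str, context: list[str], query: str) -> str:
    """Join instructions, retrieved context, and the question into one prompt."""
    context_block = "\n".join(f"- {c}" for c in context)
    return f"{instructions}\n\nContext:\n{context_block}\n\nQuestion: {query}"


# Example data (made up for illustration).
SNIPPETS = [
    "Password resets expire after one hour.",
    "Invoices are emailed on the first day of each month.",
    "Passwords must contain at least twelve characters.",
]
QUERY = "How do I reset my password?"

context = retrieve(QUERY, SNIPPETS, top_k=2)
prompt = assemble_prompt("Answer using only the context below.", context, QUERY)
print(prompt)
```

Every token that survives stop-word removal counts equally in `jaccard`, which is the downside noted under Lexical Retrieval. Replacing `retrieve` with a TF-IDF, BM25, or vector lookup would leave `assemble_prompt` unchanged.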