Friday viewing: "Advanced Retrieval" Webinar
I had a great time discussing:
– Importance of preprocessing @mrobinson0623 of @UnstructuredIO
– Different retrieval methods to power RAG applications
– What lies beyond simple RAG? @atroyn of @trychroma
@hwchase17
-
Advanced Retrieval Methods for RAG Applications
-
Evaluating Question-Answering Applications: A Practical Cookbook
Everyone's building question-answering applications, but evaluating them is pretty tricky. We're trying to make this easier. This cookbook shows how to evaluate the final answers (end-to-end app). We'll add one more focused on just retrieval shortly. What else would be helpful?
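The cookbook's own evaluation method isn't reproduced in this post, so here is only a minimal sketch of what end-to-end answer evaluation can look like. It assumes a token-overlap F1 score as the metric (a common baseline for question answering, swapped in here for illustration); `token_f1`, `evaluate`, and `toy_app` are hypothetical names, not part of any library.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(app, examples):
    """Run the whole app on each question and average the answer scores."""
    scores = [token_f1(app(question), reference) for question, reference in examples]
    return sum(scores) / len(scores)


# Hypothetical stand-in for a real question-answering app.
canned_answers = {"What is the capital of France?": "Paris", "What is 2 + 2?": "5"}


def toy_app(question):
    return canned_answers[question]


examples = [("What is the capital of France?", "paris"), ("What is 2 + 2?", "4")]
mean_score = evaluate(toy_app, examples)
```

Scoring the final answer treats retrieval and generation as one black box, which is why a separate retrieval-only evaluation is still useful for locating where an error came from.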
-
Balancing Chunk Size for Semantic Embeddings in RAG Systems
There has always been a tension between keeping chunks small enough for the embeddings to capture their semantic meaning and making them long enough to carry the full context. This helps strike that balance. Docs: https://python.langchain.com/docs/modules/data_connection/retrievers/parent_document_retriever…
-
Production Data Ingestion Webinar with Airbyte and Sweep
"Production Ingestion" Webinar
In order to have good retrieval, you need to ingest data properly and at scale. Excited to announce that this will be the focus of our next webinar. We'll be joined by @AirbyteHQ and @sweepai. Register below:
-
Vercel Template for AI Agents with Intermediate Steps Display
New Vercel template and cookbook for agents – show the intermediate steps in a clear way.
-
LangSmith: Debugging, Testing, and Monitoring Tools for LangChain
Lots of fun stuff in LangSmith – debugging, testing, evaluation, monitoring. And you can use it all even if you aren’t using LangChain! Great new guide.
-
Data Frameworks for Production Vectorstore Management
It's nice to see more data frameworks emerging for production considerations (keeping context up-to-date and in-sync in a vectorstore)
-
Parent Document Retriever: Semantic Chunk Embeddings Strategy
Parent Document Retriever
A new retrieval algorithm that:
– Creates small chunks (to allow embeddings to have semantic meaning)
– Fetches the PARENT documents those chunks came from (to capture full context)
Parent documents can be either raw documents or larger chunks.
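The idea can be sketched in plain Python. This is not the LangChain implementation: it assumes a toy bag-of-words vector in place of a real embedding model, and `embed`, `build_index`, and `retrieve_parents` are hypothetical helper names chosen for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def split_into_chunks(text, chunk_words=8):
    """Split a document into small fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + chunk_words]) for i in range(0, len(words), chunk_words)]


def build_index(parents, chunk_words=8):
    """Embed the small chunks, remembering which parent each one came from."""
    index = []
    for parent_id, text in parents.items():
        for chunk in split_into_chunks(text, chunk_words):
            index.append((embed(chunk), parent_id))
    return index


def retrieve_parents(query, index, parents, k=1):
    """Rank small chunks by similarity, then return their de-duplicated parents."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    parent_ids = []
    for _, parent_id in ranked:
        if parent_id not in parent_ids:
            parent_ids.append(parent_id)
        if len(parent_ids) == k:
            break
    return [parents[pid] for pid in parent_ids]


parents = {
    "doc-a": "The parent document retriever embeds small chunks for precise matching. "
             "It then returns the larger document so the model sees the full context.",
    "doc-b": "Vercel templates help you deploy web front ends quickly. "
             "This one renders the intermediate steps of an agent in the browser.",
}
index = build_index(parents)
results = retrieve_parents("How are agent intermediate steps rendered?", index, parents)
```

Matching happens against the small chunks, but what gets returned is the whole parent text, so the model receives the surrounding context as well as the sentence that matched.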
-
Expanding Context Windows Improves Semantic Chunk Retrieval
This can also be thought of as similar to fetching the chunks before and after the most semantically similar chunk. It takes advantage of the fact that context windows are getting longer and longer! This would have been less plausible a few months ago.
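A minimal sketch of that neighbor-expansion view, assuming the chunks are kept in document order and using simple word overlap in place of real embedding similarity; `best_chunk_index` and `expand_with_neighbors` are hypothetical names used only for this example.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_chunk_index(query, chunks):
    """Index of the chunk sharing the most words with the query (toy similarity)."""
    q = tokens(query)
    return max(range(len(chunks)), key=lambda i: len(q & tokens(chunks[i])))


def expand_with_neighbors(chunks, hit_index, window=1):
    """Return the matching chunk plus `window` chunks on either side, in order."""
    start = max(0, hit_index - window)
    end = min(len(chunks), hit_index + window + 1)
    return " ".join(chunks[start:end])


chunks = [
    "Install the package first.",
    "Then set the API key.",
    "Finally run the evaluation script.",
    "Results are written to disk.",
]
hit = best_chunk_index("API key", chunks)
context = expand_with_neighbors(chunks, hit, window=1)
```

A larger `window` passes more surrounding text to the model, which is only practical when the context window is long enough to hold it.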