2023: Oh yeah, just use LlamaIndex to chunk up your text, store embeddings on Pinecone, use LangChain for chain of thought, use GPT-4 to fine-tune GPT-3.5-turbo for cost, and set up a failsafe to switch to another model for when the API isn't available…
LLM Optimization: Chunking, Embeddings, and Model Failover Strategies
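The failover idea above can be sketched as a small wrapper that tries each model in order and falls back when a call fails. This is a minimal illustration, not a production implementation: the client functions below are hypothetical stand-ins for real API calls (an actual setup would call OpenAI or another provider and catch narrower, provider-specific errors).

```python
# Minimal model-failover sketch: try each client in order, return the
# first successful response, and raise only if every model fails.

def with_failover(clients, prompt):
    """clients: list of (name, callable) pairs tried in order."""
    errors = []
    for name, client in clients:
        try:
            return name, client(prompt)
        except Exception as exc:  # real code should catch specific API errors
            errors.append((name, str(exc)))
    raise RuntimeError(f"All models failed: {errors}")

# Hypothetical stand-in clients for illustration only:
def flaky_primary(prompt):
    raise TimeoutError("API unavailable")

def backup_model(prompt):
    return f"backup answer to: {prompt}"

name, answer = with_failover(
    [("gpt-4", flaky_primary), ("gpt-3.5-turbo", backup_model)],
    "What is chunking?",
)
```

The same pattern extends naturally to retries with backoff before falling over, or to routing cheap queries to the smaller model by default and escalating only on failure.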