Building a Personal RAG Assistant
I’ve been fascinated by the advancements in LLMs and GenAI over the last year. Building a system that can understand and respond to human language in a meaningful way has been a long-standing goal of mine, and a good way to better understand the inner workings of LLMs.
The Challenge: Building a Document-Based Question Answering System
My goal was to create a Retrieval-Augmented Generation (RAG) system that could answer questions based on a collection of documents, with a web app interface capable of providing summaries, extracting key information, or even generating creative text formats based on my document library.
Technical Stack and Challenges
- FastAPI: For building the REST API that interacts with the LLM backend.
- LangChain: A framework for developing LLM applications, providing the components for document loading, splitting, and vectorization.
- FAISS: A library for efficient similarity search, crucial for retrieving relevant documents based on user queries.
- Llama3 (or a similar open-source LLM): The core language model for generating text.
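To make the retrieve-then-generate flow concrete, here is a minimal sketch of the retrieval and prompt-assembly steps in plain Python. It is not the code from the project: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the brute-force ranking stands in for the similarity search that FAISS performs efficiently at scale, and all function names and sample chunks are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (brute-force search)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved chunks and the question into one prompt for the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical document chunks, for illustration only.
chunks = [
    "FAISS performs similarity search over vectors",
    "FastAPI serves the REST API",
    "The invoice is due in March",
]
question = "which library does similarity search"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

In a real setup, the prompt string would be sent to the language model, and the retrieved chunks would come from a vector index built over the whole document library rather than from a list scanned on every query.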
While building this system, I encountered several challenges:
- Document Preprocessing: Effectively splitting documents into manageable chunks while preserving context was a complex task.
- Vector Embedding Quality: The choice of embedding model significantly impacted the accuracy of document retrieval.
- LLM Limitations: Getting the LLM to generate accurate and relevant answers required careful prompt engineering and experimentation.
- Expanding Document Formats: Incorporating support for various document types beyond PDF, TXT, and CSV.
- System Performance: Balancing the trade-off between response time and model complexity was essential.
- User Interface Development: Creating a user-friendly interface to simplify interaction with the system.
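The document preprocessing challenge is easier to see with an example. One common way to preserve context across chunk boundaries is to let consecutive chunks overlap, so a sentence cut at the end of one chunk reappears at the start of the next. The sketch below is a simplified, character-based version of that idea. The function name and default sizes are my own assumptions for illustration, and LangChain's text splitters implement a more sophisticated variant that also tries to break on natural separators.

```python
def split_with_overlap(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Tiny example: 10 characters, windows of 4 that overlap by 2.
example = split_with_overlap("abcdefghij", chunk_size=4, overlap=2)
# example == ["abcd", "cdef", "efgh", "ghij"]
```

Larger overlaps keep more context but increase the number of chunks to embed and search, which feeds directly into the response-time trade-off mentioned above.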
This project is still very much a work in progress. I'm actively exploring the best approaches to scaling this system and ensuring optimal performance. The journey to building a robust and efficient RAG platform is ongoing, and I'm excited to share updates and lessons learned along the way.
GitHub
https://github.com/harshitsinghai77/doc-talk
Conclusion
That's it for this post.