Chat with Documents lets you upload a PDF document and ask questions about its content. The application uses Llama-3.2 as the LLM and HuggingFace embeddings for document indexing and querying, and returns real-time responses as you chat with the document.
- Upload a PDF document and index it.
- Ask questions related to the content of the uploaded document.
- Clear the chat history.
- Download chat history as a text file.
- Display document analytics like word frequency (powered by Plotly).
- Powered by Llama-3.2 and HuggingFace embeddings.
Before running the application, make sure you have the following software and libraries installed:
- Python 3.8+
- Streamlit (for web interface)
- Plotly (for analytics dashboard)
- Ollama with Llama-3.2 (for the language model)
- HuggingFace embeddings (for document embedding and querying)
- Qdrant (for vector storage)
llama_index
streamlit
plotly
qdrant-client
huggingface_hub
Pillow
Clone this repository to your local machine using:
git clone https://github.com/buriihenry/Chat-with-PDF.git
It is highly recommended to use a virtual environment to manage dependencies.

On macOS/Linux:
python3 -m venv venv
source venv/bin/activate

On Windows:
python -m venv venv
.\venv\Scripts\activate
Install the required libraries using pip:
pip install -r requirements.txt
If you don't have a requirements.txt file yet, you can create one containing the following dependencies:
llama_index
streamlit
plotly
qdrant-client
huggingface_hub
Pillow
(Note: uuid and base64 are part of the Python standard library and should not be listed as pip dependencies.)
You can also install individual packages with the following command:
pip install streamlit plotly llama_index qdrant-client huggingface_hub Pillow
You will need the Ollama model (llama3.2:1b) for querying, so make sure Ollama is installed. Installation instructions are available on the Ollama website.
For local testing, you can run a Qdrant instance using Docker:
docker run -p 6333:6333 qdrant/qdrant
Alternatively, you can install the Python client for Qdrant using pip:
pip install qdrant-client
Note that qdrant-client is the client library, not the server itself; for this setup you still need a running Qdrant server, such as the Docker container above. Ensure that your environment can reach Qdrant on localhost:6333 and has any HuggingFace API keys if required.
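Before launching the app, you can sanity-check that Qdrant is reachable. The helper below is an optional, illustrative snippet (not part of the repository) using only the standard library; the host and port match the Docker command above:

```python
import socket

def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures
        return False

if __name__ == "__main__":
    status = "reachable" if is_port_open("localhost", 6333) else "NOT reachable"
    print(f"Qdrant on localhost:6333 is {status}")
```

If the check reports NOT reachable, start (or restart) the Qdrant container before running the app.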
Once you have installed all dependencies and set up the environment, you can run the application locally with Streamlit:
- Make sure you are in the project directory.
- Run the following command to start the Streamlit app:
streamlit run app.py
This will launch the application in your web browser at http://localhost:8501.
Part of the code looks like this:
llm = Ollama(model="llama3.2:1b", request_timeout=120.0)
- This indicates that the application uses an Ollama model to generate responses to queries. Without Ollama installed, this part of the code will not function, and the app will raise errors when you run it.
- If you want to run Ollama using Docker, use the following command, which pulls the latest Ollama image and starts it in a container:
docker run -d --name ollama -p 11434:11434 -v ollama_volume:/root/.ollama ollama/ollama:latest
- After installing Ollama, pull llama3.2:1b by running the command below:
ollama pull llama3.2:1b
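To see what the app's Ollama call does under the hood, here is a minimal standard-library sketch that talks to Ollama's HTTP API directly, without llama_index. The endpoint and payload fields follow Ollama's documented /api/generate API; the build_payload and generate helpers are illustrative, not part of the repository:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port

def build_payload(prompt, model="llama3.2:1b"):
    # /api/generate expects a JSON body with the model name and prompt;
    # stream=False requests a single JSON response instead of chunks.
    return json.dumps({"model": model, "prompt": prompt, "stream": False})

def generate(prompt):
    req = urllib.request.Request(
        OLLAMA_URL + "/api/generate",
        data=build_payload(prompt).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Calling generate("Summarize this document") requires the Ollama container from the step above to be running and the llama3.2:1b model to have been pulled.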
Once the app is running, you can:
- Upload a PDF document via the file upload option in the sidebar.
- Chat with the document: Type a question in the input box, and the application will respond based on the content of the uploaded document.
- View Analytics: The app will show word frequency analysis for the document you uploaded.
- Clear chat history: Press the “Clear Chat” button to reset the conversation.
- Download chat history: Download the entire chat history in .txt format for reference.
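The word-frequency analytics can be approximated in a few lines of standard-library Python. The app's actual implementation may differ, but the core idea is a simple token count that Plotly then renders as a chart:

```python
import re
from collections import Counter

def word_frequencies(text, top_n=10):
    # Lowercase, split on non-letter characters, and drop very short tokens
    words = [w for w in re.split(r"[^a-z]+", text.lower()) if len(w) > 2]
    return Counter(words).most_common(top_n)

sample = "The model reads the document and the model answers questions."
print(word_frequencies(sample, top_n=3))
```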
- Upload a document, such as a research paper or technical manual.
- Ask specific questions like:
- "What is the main topic of this document?"
- "Can you explain the key findings?"
- "How does the author define machine learning?"
The model will fetch the relevant context and provide answers.
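Under the hood, answering such questions is a retrieval step (find the document chunks most relevant to the query) followed by generation. The app uses HuggingFace embeddings for retrieval; the toy sketch below substitutes simple keyword overlap to illustrate the idea, and is not the app's actual method:

```python
def keyword_overlap_retrieve(query, chunks, top_k=2):
    """Rank document chunks by how many query words they share."""
    q_words = set(query.lower().split())
    # Score each chunk by the size of its word overlap with the query
    scored = [(len(q_words & set(c.lower().split())), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:top_k] if score > 0]

chunks = [
    "Machine learning is defined by the author as learning from data.",
    "The appendix lists hardware requirements.",
    "Key findings: the model improves accuracy by ten percent.",
]
print(keyword_overlap_retrieve("how does the author define machine learning", chunks))
```

In the real app, embedding similarity replaces keyword overlap, so paraphrased questions still match the right passages.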
Feel free to open issues or submit pull requests if you would like to contribute to the project. All contributions are welcome!
- Improve document handling and indexing performance.
- Add support for multiple document types (Word, Text, etc.).
- Add advanced NLP features like summarization and entity recognition.
This project is licensed under the MIT License.
Thank you for checking out Chat with Documents! We hope you enjoy using it. 😊