PDF Interaction using LangChain and Ollama

This project combines PDF document analysis with AI-powered question answering. Pulling specific information out of PDF documents matters to professionals in many industries, but manually skimming through lengthy documents to find it is time-consuming and tedious.

In this blog post, we introduce a solution that leverages AI-powered question-answering techniques to streamline the process of analyzing PDF documents.

This project utilizes various open-source libraries and models to create a PDF-based question answering system. The application allows users to upload a PDF file and then ask questions about the contents of that PDF. 

The system will attempt to provide an answer to the user’s query based on the information contained within the uploaded PDF.

The Project Should Perform Several Tasks

Apart from the main function, which serves as the entry point for the application, the code performs several tasks: setting up the Ollama model, loading a PDF file, extracting the text from the PDF, splitting the text into chunks, creating embeddings, and finally using all of the above to generate answers to the user’s questions.

Dependencies and How to Get them

To get started with this project, install the required dependencies by following the instructions below:

PyPDF2

PyPDF2 is a popular Python library for reading and writing PDF files. To install it, run pip install PyPDF2.

Ollama or LLaMA

LLaMA is an open-source large language model (LLM) developed by Meta AI that generates text based on a given prompt or input. Specifically, we use the llama3 model, which was the latest model released by Meta at the time of writing.

The LLaMA model itself cannot be installed with pip. One option is to load the model weights through the Hugging Face `transformers` library, but this tutorial takes a different route and runs the model through Ollama.

For this tutorial we use the Ollama application, downloaded and installed on the local computer. You can download it from the official website: https://ollama.com/

After installing it, open a terminal and download the specific model you want. For this project, run ollama run llama3, which downloads the model the first time it is used.

FAISS

Facebook AI Similarity Search (FAISS) is a library, usable from Python, for efficient similarity search and clustering of dense vectors.

Note: FAISS is not bundled with LangChain, so you need to install it separately. Use pip install faiss-cpu, or pip install faiss-gpu if you have a CUDA-capable GPU.

LangChain

LangChain is an amazing open-source framework for building applications on top of language models and other natural language processing (NLP) tools. It does not train models itself; it connects existing models to your data and code. This is one of the game changers in AI because it lets us use AI models in many ways.

Remember: LangChain itself does not include the FAISS library, Ollama model, or other specific dependencies like Ollama embeddings directly. Instead, it provides interfaces, abstractions, and utilities to work with these external libraries and models seamlessly within the LangChain ecosystem.
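Since each of these pieces is installed separately, a quick check that they can all be imported may save some debugging later. The sketch below is only an illustration: the helper name check_modules is made up for this post, and the import names (PyPDF2, faiss, langchain) are assumed to match the packages installed above.

```python
from importlib.util import find_spec

# Assumed import names for the packages installed above:
# PyPDF2 -> "PyPDF2", faiss-cpu / faiss-gpu -> "faiss", langchain -> "langchain".
REQUIRED_MODULES = ["PyPDF2", "faiss", "langchain"]


def check_modules(names):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

Any module reported as missing can then be installed with the matching pip command from the sections above.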

Here is what I mean when I say LangChain is amazing. This overview shows what LangChain does and which dependencies it helps you work with:

  1. Text Splitting:
    • LangChain offers utilities for splitting text into smaller, manageable chunks. This is helpful when dealing with large documents or datasets.
  2. Embeddings:
    • Embeddings are representations of text, such as words, phrases, or whole chunks, as vectors in a continuous vector space. LangChain provides support for using embeddings in NLP tasks. This project uses Ollama embeddings, which are produced by a model served through Ollama.
  3. Vectorization:
    • Vectorization involves converting text data into numerical vectors that machine learning models can understand. In this project the embedding model produces the vectors, and LangChain’s FAISS vector store indexes them so they can be searched by similarity.
  4. Question Answering:
    • LangChain includes functionality for building question answering systems. It provides a convenient way to load pre-trained models and apply them to a given input to generate responses.
  5. Language Models (LLMs):
    • LangChain allows the integration of various language models (LLMs) into NLP workflows. This project uses a model served by Ollama, which is one of the LLM backends supported by LangChain.
  6. Callbacks:
    • Callbacks let your code react to events in an NLP pipeline, such as each new token an LLM produces. LangChain provides a callback manager and specific callback handlers, such as StreamingStdOutCallbackHandler, which prints tokens to standard output as they are generated.

Because it provides these functionalities, LangChain simplifies the development of NLP applications and workflows, making it easier for developers to leverage advanced NLP techniques and models in their projects.
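To make the callback idea a bit more concrete, here is a toy sketch of token streaming in plain Python. It does not use LangChain at all: CollectingHandler and fake_llm are hypothetical stand-ins, and only the method name on_llm_new_token is borrowed from LangChain's callback handlers.

```python
class CollectingHandler:
    """Toy handler: stores each token and echoes it as soon as it arrives."""

    def __init__(self):
        self.tokens = []

    def on_llm_new_token(self, token):
        self.tokens.append(token)
        print(token, end="", flush=True)


def fake_llm(prompt, handlers):
    """Stand-in for a model: emits a canned answer one piece at a time."""
    for token in ["The ", "answer ", "is ", "here."]:
        for handler in handlers:
            handler.on_llm_new_token(token)


handler = CollectingHandler()
fake_llm("What does the PDF say?", [handler])
print()  # finish the streamed line
```

The point of the pattern is that the caller sees output while the model is still generating, instead of waiting for the full answer.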

Code segmented and explained in Steps

Step 1: Import Required Libraries

This step imports the necessary libraries and modules required for the PDF processing and natural language processing tasks.

Make sure you import all the required libraries, following the code below:

Example: import all the required libraries or dependencies


from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.ollama import OllamaEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

Step 2: Define the Main Function

This function serves as the entry point of the program.

Example: Defining the Main Function


def main():
    print("Ask your PDF 💬")

Step 3: Set up the Ollama Model

I LOVE this step because it gives me a free, open-source model to use. Thanks to the developers who made it possible.

This step initializes the Ollama language model with the specified model (“llama3”).

Example: Set up the Ollama model with the LLaMA3 model


    # Set up the Ollama model with the LLaMA3 model
    ollama_llm = Ollama(
        model="llama3",  # Specify LLaMA3 model
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])
    )

Step 4: Upload the PDF File

This is where you get the freedom to ask the model about your own data. Update the pdf_path variable with the path to the PDF file you want to process.

Example: uploading the pdf file


    # upload file
    pdf_path = "/path/to/your/pdf/file.pdf"  # Update with the path to your PDF file
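Before opening the file, it can help to confirm that the path really points at a PDF. The helper below is a hypothetical addition, not part of the original program. It relies on the fact that PDF files start with the bytes %PDF-, and it is meant as a cheap sanity check rather than a full validation.

```python
import tempfile
from pathlib import Path


def looks_like_pdf(path):
    """Return True if path is an existing file that starts with the PDF header."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(5) == b"%PDF-"


# Demo with throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    real = Path(tmp) / "real.pdf"
    real.write_bytes(b"%PDF-1.7\n")
    fake = Path(tmp) / "fake.pdf"
    fake.write_bytes(b"hello")
    missing = Path(tmp) / "missing.pdf"
    results = (looks_like_pdf(real), looks_like_pdf(fake), looks_like_pdf(missing))

print(results)
```

In the main program, a check like this could run right after pdf_path is set, with a friendly message when it fails.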

Step 5: Extract Text from PDF

This is also an important step: it reads the PDF file and extracts the text content from each page.

Example: extracting the text


    # extract the text
    with open(pdf_path, "rb") as file:
        pdf_reader = PdfReader(file)
        text = ""
        for page in pdf_reader.pages:
            # fall back to an empty string if a page yields no text
            text += page.extract_text() or ""

Step 6: Split Text into Chunks

In this step we split the extracted text into smaller chunks, so that each piece can be embedded and searched on its own.

Example: splitting into chunks


    # split into chunks
    text_splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )
    chunks = text_splitter.split_text(text)

text_splitter is responsible for breaking the text down into manageable chunks. It is an instance of the `CharacterTextSplitter` class, configured here to split on newlines into chunks of roughly 1000 characters, with about 200 characters of overlap between neighbouring chunks so that text near a chunk boundary keeps its context. Chunking improves the question answering process by reducing the amount of text that needs to be processed at once.
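To see what chunk_size and chunk_overlap mean in practice, here is a simplified sketch in plain Python. It is not how `CharacterTextSplitter` works internally (that class splits on the separator first and then merges pieces); split_with_overlap is a hypothetical helper that just slides a fixed-size window over the characters.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters
chunks = split_with_overlap(sample, chunk_size=12, chunk_overlap=4)
for chunk in chunks:
    print(repr(chunk))
```

Each chunk begins with the last few characters of the previous one, which is the effect the overlap setting is after.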

Step 7: Create Embeddings and Knowledge Base

Next, embeddings are generated for the text chunks using Ollama. These embeddings are stored in a FAISS index, which serves as a knowledge base for efficient search.

Example: creating the embeddings


    # create embeddings
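    # pass the model name explicitly so the embeddings use the llama3 model pulled earlier
    ollama_embeddings = OllamaEmbeddings(model="llama3")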
    knowledge_base = FAISS.from_texts(chunks, ollama_embeddings)

The `OllamaEmbeddings` class is used to create an embedding for each chunk of text. FAISS stores these embeddings, and a similarity search over them returns the chunks most relevant to the user’s question (text chunks, not whole PDF pages).
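The idea behind the knowledge base can be sketched without any model at all. In the toy example below, word counts stand in for real embeddings and cosine similarity stands in for the vector search. The functions embed, cosine, and similarity_search are illustrative names for this post, and a real vector store may use a different distance measure.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real embedding model returns dense vectors."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(chunks, question, k=2):
    """Return the k chunks whose vectors are closest to the question's vector."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), query), reverse=True)
    return ranked[:k]


chunks = [
    "invoices are due within thirty days",
    "the warranty covers parts and labour",
    "late invoices incur a fee",
]
top = similarity_search(chunks, "when are invoices due", k=2)
print(top)
```

Real embeddings capture meaning rather than exact word overlap, so they can match a question to a chunk even when the two share few words.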

Step 8: Ask User for Question

In this step we prompt the user to enter a question about the PDF content.

Example: show user input


    # show user input
    user_question = input("Ask a question about your PDF: ")

Step 9: Perform Question Answering

If the user provides a question, the program searches the knowledge base for the chunks most similar to it and performs question answering on them using the loaded QA chain. The response is then printed to the console.

Example: Perform Question Answering Task


    if user_question:
        docs = knowledge_base.similarity_search(user_question)

        # load QA chain
        chain = load_qa_chain(ollama_llm, chain_type="stuff")

        # run the chain
        response = chain.run(input_documents=docs, question=user_question)

        print(response)

Note: `load_qa_chain` does not load a separate pre-trained QA model. With chain_type="stuff", it builds a chain that places all of the retrieved chunks into a single prompt together with the user’s question, and sends that prompt to the Ollama model to produce an answer.
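Roughly speaking, the "stuff" approach boils down to building one big prompt. The sketch below shows the idea in plain Python. The function build_stuff_prompt and the prompt wording are made up for illustration and are not LangChain's actual template.

```python
def build_stuff_prompt(chunks, question):
    """Put every retrieved chunk plus the question into one prompt string."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_stuff_prompt(
    ["Chunk one text.", "Chunk two text."],
    "What does chunk two say?",
)
print(prompt)
```

Because everything goes into a single prompt, this approach suits a handful of small chunks. With many or very large chunks the prompt could outgrow the model's context window.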

Step 10: Main Execution

This condition ensures that the main() function is executed when the script is run as the main program.

Example: call the main() function for Main Execution


if __name__ == '__main__':
    main()

Full Source Code

Example: Copy the Source Code and use it for free


from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.ollama import OllamaEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

def main():
    print("Ask your PDF 💬")

    # Set up the Ollama model with the LLaMA3 model
    ollama_llm = Ollama(
        model="llama3",  # Specify LLaMA3 model
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])
    )

    # upload file
    pdf_path = "/path/to/your/pdf/file.pdf"  # Update with the path to your PDF file

    # extract the text
    with open(pdf_path, "rb") as file:
        pdf_reader = PdfReader(file)
        text = ""
        for page in pdf_reader.pages:
            # fall back to an empty string if a page yields no text
            text += page.extract_text() or ""

    # split into chunks
    text_splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )
    chunks = text_splitter.split_text(text)

    # create embeddings
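    # pass the model name explicitly so the embeddings use the llama3 model pulled earlier
    ollama_embeddings = OllamaEmbeddings(model="llama3")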
    knowledge_base = FAISS.from_texts(chunks, ollama_embeddings)

    # show user input
    user_question = input("Ask a question about your PDF: ")
    if user_question:
        docs = knowledge_base.similarity_search(user_question)

        # load QA chain
        chain = load_qa_chain(ollama_llm, chain_type="stuff")

        # run the chain
        response = chain.run(input_documents=docs, question=user_question)

        print(response)


if __name__ == '__main__':
    main()

Benefits

Open-Source Frameworks and Libraries: The project uses open-source libraries and models, which means that it is free and that users are able to modify or extend the code as needed.

Easy to Use: The application needs only the path to a PDF file and a question typed at the console, which makes it easy to get started with the system.

Customizable: The code can be extended and customized to suit specific use cases and requirements.

Conclusion

In conclusion, our AI-powered question-answering system offers a seamless solution for extracting information from PDF documents quickly and accurately. By leveraging state-of-the-art language models and NLP techniques, users can efficiently access relevant information from large volumes of text, improving productivity and decision-making processes.

The project works as follows:

  1. The program begins by setting up the language model through LangChain’s Ollama wrapper. Ollama is a tool for running large language models such as llama3 locally, and the model it serves is what understands and generates human-like text.
  2. The user supplies the path to a PDF document, and the program extracts the text content from the PDF file.
  3. The extracted text is split into smaller, manageable chunks using the CharacterTextSplitter.
  4. Ollama embeddings are generated from the text chunks, creating a knowledge base for text similarity search.
  5. The user inputs a question about the PDF document.
  6. The program searches for similar text chunks in the knowledge base and retrieves relevant documents.
  7. A question-answering chain is loaded, and the model generates a response to the user’s question based on the retrieved documents.
  8. The response is displayed to the user.