
Are you interested in leveraging the power of AI to create an interactive chat website? In this step-by-step guide, I will walk you through the entire process of using LangChain and Ollama, two powerful tools in the world of natural language processing (NLP), to build a sophisticated chat application (a chat website).
By the end of this tutorial, even beginners will be able to integrate advanced language models and generative AI into their web applications seamlessly.
Let’s dive in and explore how to set up these frameworks, understand their functionalities, and utilize them to enhance your website’s interactivity.
With this technology, you can research the content of a specific website and then ask your LLM application questions about it or request a summary.
Introduction to Langchain and Ollama
The first thing you need is to understand what LangChain and Ollama are; they are the core building blocks of this application.
Check out my other step-by-step guides on building LLM-based applications.
Langchain
LangChain is an open-source framework designed for building applications around language models and other natural language processing (NLP) tools. It is a game changer in AI, allowing developers to integrate advanced AI models into their applications seamlessly.
Ollama
Ollama is an application for running AI models locally, including powerful language models like LLaMA 3. It offers advanced capabilities for generating text from a given prompt or input, making it a valuable tool for NLP tasks.
Setting Up Your Environment
Before we dive into the code, let’s set up the necessary dependencies.
Step 1: Install Required Libraries
First, install the required libraries using pip:
pip install requests langchain faiss-cpu
What is FAISS: FAISS stands for Facebook AI Similarity Search; it is an open-source library developed by Facebook AI Research. It is designed to efficiently search for similar vectors in large datasets, making it a powerful tool for tasks such as nearest-neighbor search and clustering. FAISS is highly optimized for speed and memory usage, allowing it to handle large-scale data efficiently. It supports both exact and approximate nearest-neighbor search, with a variety of algorithms and configurations to suit different needs. FAISS is used in applications such as recommendation systems, image retrieval, and natural language processing, where finding similar items quickly is crucial.
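To make this concrete, here is a minimal, self-contained sketch of raw FAISS with random toy vectors. It is for illustration only and is not part of the tutorial code (the tutorial only uses FAISS through LangChain):
import numpy as np
import faiss

dimension = 8  # toy embedding size, chosen purely for illustration
vectors = np.random.random((100, dimension)).astype("float32")
index = faiss.IndexFlatL2(dimension)  # exact L2 nearest-neighbor index
index.add(vectors)  # store all 100 vectors
distances, indices = index.search(vectors[:1], 3)  # 3 nearest neighbors of the first vector
print(indices)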
Step 2: Download and Install Ollama
- Download Ollama: Visit the Ollama official website and download the application.
- Install Ollama: Follow the installation instructions provided on the website.
- Download the LLaMA3 Model: Open your terminal and run the following command (the first run downloads the model and then starts an interactive session; ollama pull llama3 downloads it without starting a session):
ollama run llama3
Writing the Code
Now that we have the necessary libraries installed, let’s write the code to create our chat website.
Step 1: Import Required Libraries
Begin by importing the required libraries:
import requests
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.ollama import OllamaEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
Let me explain the functionality and purpose of each library imported in our code:
requests
The requests library is a popular Python library for making HTTP requests. It simplifies the process of sending HTTP requests and handling responses. In this code, it is used to fetch the content of a webpage.
Purpose: To send a GET request to a specified URL and retrieve the webpage content.
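For example, a minimal standalone fetch with a basic error check might look like this (the timeout and the raise_for_status() call are defensive additions of mine, not part of the tutorial code below):
import requests

response = requests.get("https://python.langchain.com/docs/get_started/introduction", timeout=30)
response.raise_for_status()  # raises an exception on 4xx/5xx status codes
print(response.status_code, len(response.text))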
CharacterTextSplitter from langchain.text_splitter
The CharacterTextSplitter class from the langchain.text_splitter module is used to split large chunks of text into smaller, manageable pieces. It splits the text based on a specified separator, chunk size, and chunk overlap.
Purpose: To divide the website content into smaller chunks for easier processing and embedding.
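As a toy illustration of how the splitter behaves, here is a sketch with a deliberately small chunk_size so the effect is visible (the values are illustrative only; the tutorial uses larger ones):
from langchain.text_splitter import CharacterTextSplitter

splitter = CharacterTextSplitter(separator="\n", chunk_size=40, chunk_overlap=10, length_function=len)
for chunk in splitter.split_text("first line\nsecond line\nthird line\nfourth line"):
    print(repr(chunk))  # note how consecutive chunks share overlapping text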
OllamaEmbeddings from langchain.embeddings.ollama
The OllamaEmbeddings class from the langchain.embeddings.ollama module is used to convert text chunks into embeddings (numerical representations). These embeddings can then be used for similarity searches.
Purpose: To create numerical representations of the text chunks, making them suitable for similarity searches.
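As a quick illustration, assuming a local Ollama server is running and the model has been pulled, you can embed a single string like this:
from langchain.embeddings.ollama import OllamaEmbeddings

emb = OllamaEmbeddings(model="llama3")  # assumes llama3 has been pulled locally
vector = emb.embed_query("What is LangChain?")
print(len(vector))  # dimensionality of the embedding vector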
FAISS from langchain.vectorstores
The FAISS library, integrated through langchain.vectorstores, is used to efficiently store and search for similar vectors. FAISS is known for its high performance in handling large-scale similarity searches.
Purpose: To store the text embeddings and perform similarity searches to find the most relevant text chunks based on a user’s query.
load_qa_chain from langchain.chains.question_answering
The load_qa_chain function from the langchain.chains.question_answering module loads a question-answering chain. A QA chain is a sequence of processes that takes input data and a question to produce an answer.
Purpose: To load and run the question-answering process using the text chunks and user queries.
Ollama from langchain.llms
The Ollama class from the langchain.llms module represents the language model used for generating responses. In this code, it is set up to use the LLaMA 3 model.
Purpose: To generate responses based on the user’s question and the relevant text chunks.
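To see it in isolation, here is a minimal sketch of calling the model directly. It assumes the Ollama application is running locally and the model has been pulled; the exact calling style may vary slightly by LangChain version:
from langchain.llms import Ollama

llm = Ollama(model="llama3")
print(llm("Reply with one short sentence: what is an LLM?"))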
CallbackManager from langchain.callbacks.manager
The CallbackManager class from the langchain.callbacks.manager module manages the callbacks during the model's execution. Callbacks can be used to handle streaming outputs or log intermediate results.
Purpose: To manage and handle callbacks during the execution of the LLaMA model.
StreamingStdOutCallbackHandler from langchain.callbacks.streaming_stdout
The StreamingStdOutCallbackHandler class from the langchain.callbacks.streaming_stdout module is a specific type of callback handler that streams the output directly to the standard output (e.g., the console).
Purpose: To stream the generated responses from the LLaMA model directly to the console in real-time.
Also check out my guide on how to Build Dynamic Web Pages with CrewAI and Advanced Language Models.
Step 2: Set Up the Ollama Model
Set up the Ollama model with the LLaMA3 model:
ollama_llm = Ollama(
    model="llama3",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])
)
Step 3: Fetch Website Content
Fetch the content from the website you want to use as the knowledge base:
url = "https://python.langchain.com/docs/get_started/introduction"
response = requests.get(url)
website_content = response.text
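Note that response.text contains the raw HTML of the page, including tags and scripts. Optionally, you can strip the markup before chunking; a minimal sketch using BeautifulSoup (an extra dependency, pip install beautifulsoup4, not part of the original setup) could look like this:
from bs4 import BeautifulSoup

# Keep only the human-readable text of the page
website_content = BeautifulSoup(response.text, "html.parser").get_text(separator="\n")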
Step 4: Split the Website Content
Use the CharacterTextSplitter to split the content into manageable chunks:
text_splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len
)
chunks = text_splitter.split_text(website_content)
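As an optional sanity check, you can print how many chunks were produced:
print(f"Created {len(chunks)} chunks")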
Step 5: Create Embeddings and Knowledge Base
Create embeddings for the text chunks and set up a FAISS vector store:
ollama_embeddings = OllamaEmbeddings()
knowledge_base = FAISS.from_texts(chunks, ollama_embeddings)
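One caveat: depending on your LangChain version, OllamaEmbeddings() without arguments may default to a model other than the one you pulled. If so, you can pin it explicitly (a defensive tweak, not part of the original code):
ollama_embeddings = OllamaEmbeddings(model="llama3")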
Step 6: Get User Input and Search for Relevant Information
Prompt the user for a question and search for relevant information:
user_question = input("Your question: ")
if user_question:
    docs = knowledge_base.similarity_search(user_question)
    # Load the QA chain
    chain = load_qa_chain(ollama_llm, chain_type="stuff")
    # Run the chain
    response = chain.run(input_documents=docs, question=user_question)
    # Naive relevance heuristic: only answer if the source URL shows up in the response
    if "python.langchain.com" not in response:
        print("I don't know.")
    else:
        print(response)
Step 7: Wrap Everything in a Main Function and Run It
Wrap everything in a main function:
import requests
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.ollama import OllamaEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

def main():
    print("Ask your question 💬")
    # Set up the Ollama model with the LLaMA3 model
    ollama_llm = Ollama(
        model="llama3",
        callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])
    )
    # Fetch content from the website
    url = "https://python.langchain.com/docs/get_started/introduction"
    response = requests.get(url)
    website_content = response.text
    # Extract relevant text from the website content.
    # You may need web scraping techniques (e.g., BeautifulSoup) to extract
    # the specific text; here we assume the raw page text is good enough.
    # Split into chunks
    text_splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )
    chunks = text_splitter.split_text(website_content)
    # Create embeddings and the FAISS knowledge base
    ollama_embeddings = OllamaEmbeddings()
    knowledge_base = FAISS.from_texts(chunks, ollama_embeddings)
    # Get user input
    user_question = input("Your question: ")
    if user_question:
        docs = knowledge_base.similarity_search(user_question)
        # Load the QA chain
        chain = load_qa_chain(ollama_llm, chain_type="stuff")
        # Run the chain
        response = chain.run(input_documents=docs, question=user_question)
        # Naive relevance heuristic: only answer if the source URL shows up in the response
        if "python.langchain.com" not in response:
            print("I don't know.")
        else:
            print(response)

if __name__ == '__main__':
    main()
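Save the script (for example as chat_website.py; the filename is just an illustration) and run it with python chat_website.py. Make sure the Ollama application is running in the background so that both the embeddings and the chat model can reach the local server.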
Summary
Congratulations! In this tutorial, we demonstrated how to set up and use the LangChain library with the Ollama model to create a chat website. By following these steps, you can fetch and process content from a website, split it into manageable chunks, create embeddings, and interact with users through an AI-powered chat interface. This approach showcases the powerful capabilities of combining advanced language models with effective text processing and search tools.