{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## RAG Day 3\n",
    "\n",
    "### Expert Question Answerer for InsureLLM\n",
    "\n",
    "LangChain 1.0 implementation of a RAG pipeline.\n",
    "\n",
    "Using the VectorStore we created last time (with HuggingFace `all-MiniLM-L6-v2`)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from dotenv import load_dotenv\n",
    "from langchain_openai import ChatOpenAI\n",
    "\n",
    "from langchain_chroma import Chroma\n",
    "from langchain_core.messages import SystemMessage, HumanMessage\n",
    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "import gradio as gr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "MODEL = \"gpt-4.1-nano\"\n",
    "DB_NAME = \"vector_db\"\n",
    "load_dotenv(override=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Connect to Chroma; use Hugging Face all-MiniLM-L6-v2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
    "vectorstore = Chroma(persist_directory=DB_NAME, embedding_function=embeddings)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set up the 2 key LangChain objects: retriever and llm\n",
    "\n",
    "#### A sidebar on \"temperature\":\n",
    "- Controls how diverse the output is\n",
    "- A temperature of 0 means the output should be predictable\n",
    "- A higher temperature gives more variety in the answers\n",
    "\n",
    "Some people describe temperature as being like 'creativity', but that's not quite right\n",
    "- It actually controls which tokens get selected during inference\n",
    "- temperature=0 means: always select the token with the highest probability\n",
    "- temperature=1 usually means: a token with 10% probability should be picked 10% of the time\n",
    "\n",
    "(There's a small illustrative sketch of this in the next cell.)\n",
    "\n",
    "Note: a temperature of 0 doesn't mean outputs will always be reproducible. You also need to set a random seed. We will do that in weeks 6-8. (Even then, it's not always reproducible.)\n",
    "\n",
    "Note 2: if you want creativity, use the System Prompt!"
   ]
  },
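  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next cell is a small, self-contained sketch of temperature scaling, just to illustrate the bullet points above. It is not part of the RAG pipeline and is not how `ChatOpenAI` samples internally (that happens on OpenAI's servers); the tokens and logits are made up for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative only: a toy version of temperature scaling during token sampling\n",
    "import math, random\n",
    "\n",
    "def sample_token(logits: dict[str, float], temperature: float) -> str:\n",
    "    \"\"\"Pick one token from {token: logit}, using the given temperature.\"\"\"\n",
    "    if temperature == 0:\n",
    "        # Greedy decoding: always take the most likely token\n",
    "        return max(logits, key=logits.get)\n",
    "    # Softmax over the logits divided by the temperature\n",
    "    scaled = {tok: logit / temperature for tok, logit in logits.items()}\n",
    "    biggest = max(scaled.values())\n",
    "    weights = {tok: math.exp(v - biggest) for tok, v in scaled.items()}\n",
    "    total = sum(weights.values())\n",
    "    probs = {tok: w / total for tok, w in weights.items()}\n",
    "    return random.choices(list(probs), weights=list(probs.values()))[0]\n",
    "\n",
    "# Made-up next-token logits; higher temperature spreads the picks around\n",
    "example_logits = {\"insurance\": 2.0, \"software\": 1.0, \"pizza\": -1.0}\n",
    "for t in [0, 0.5, 1.0, 2.0]:\n",
    "    picks = [sample_token(example_logits, t) for _ in range(1000)]\n",
    "    print(t, {tok: picks.count(tok) for tok in example_logits})"
   ]
  },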
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "retriever = vectorstore.as_retriever()\n",
    "llm = ChatOpenAI(temperature=0, model_name=MODEL)"
   ]
  },
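  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An optional aside, not part of the main flow: `as_retriever()` can also take search parameters, such as how many chunks to return per query. The next cell is just a sketch; `wider_retriever` and the value of `k` are arbitrary examples, not something tuned for this dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional sketch: a retriever that returns more chunks per query\n",
    "# (LangChain's similarity search typically defaults to k=4; 10 is just an example)\n",
    "wider_retriever = vectorstore.as_retriever(search_kwargs={\"k\": 10})\n",
    "len(wider_retriever.invoke(\"Who is Avery?\"))"
   ]
  },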
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### These LangChain objects implement the method `invoke()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "retriever.invoke(\"Who is Avery?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "llm.invoke(\"Who is Avery?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Time to put this together!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "SYSTEM_PROMPT_TEMPLATE = \"\"\"\n",
    "You are a knowledgeable, friendly assistant representing the company Insurellm.\n",
    "You are chatting with a user about Insurellm.\n",
    "If relevant, use the given context to answer any question.\n",
    "If you don't know the answer, say so.\n",
    "Context:\n",
    "{context}\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def answer_question(question: str, history):\n",
    "    docs = retriever.invoke(question)\n",
    "    context = \"\\n\\n\".join(doc.page_content for doc in docs)\n",
    "    system_prompt = SYSTEM_PROMPT_TEMPLATE.format(context=context)\n",
    "    response = llm.invoke([SystemMessage(content=system_prompt), HumanMessage(content=question)])\n",
    "    return response.content"
   ]
  },
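  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that `answer_question` ignores the `history` argument Gradio passes in, so follow-up questions lose the earlier conversation. The next cell is an optional sketch of one way to fold the history into the LLM call; it assumes `gr.ChatInterface` is created with `type=\"messages\"`, so that `history` arrives as a list of role/content dictionaries. `answer_question_with_history` is a hypothetical name, not used elsewhere in this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional sketch (not used below): include the chat history in the LLM call.\n",
    "# Assumes gr.ChatInterface(..., type=\"messages\"), so history is a list of\n",
    "# {\"role\": \"user\"/\"assistant\", \"content\": \"...\"} dicts.\n",
    "from langchain_core.messages import AIMessage\n",
    "\n",
    "def answer_question_with_history(question: str, history):\n",
    "    docs = retriever.invoke(question)\n",
    "    context = \"\\n\\n\".join(doc.page_content for doc in docs)\n",
    "    system_prompt = SYSTEM_PROMPT_TEMPLATE.format(context=context)\n",
    "    messages = [SystemMessage(content=system_prompt)]\n",
    "    for turn in history:\n",
    "        cls = HumanMessage if turn[\"role\"] == \"user\" else AIMessage\n",
    "        messages.append(cls(content=turn[\"content\"]))\n",
    "    messages.append(HumanMessage(content=question))\n",
    "    response = llm.invoke(messages)\n",
    "    return response.content"
   ]
  },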
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "answer_question(\"Who is Avery Lancaster?\", [])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What could possibly come next? 😂"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "gr.ChatInterface(answer_question).launch()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Admit it - you thought RAG would be more complicated than that!!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}