{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## RAG Day 3\n",
"\n",
"### Expert Question Answerer for InsureLLM\n",
"\n",
"LangChain 1.0 implementation of a RAG pipeline.\n",
"\n",
"Using the VectorStore we created last time (with HuggingFace `all-MiniLM-L6-v2`)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from dotenv import load_dotenv\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.messages import SystemMessage, HumanMessage\n",
"from langchain_huggingface import HuggingFaceEmbeddings\n",
"import gradio as gr"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"MODEL = \"gpt-4.1-nano\"\n",
"DB_NAME = \"vector_db\"\n",
"load_dotenv(override=True)"
]
},
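{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick optional check (a minimal sketch, not part of the original flow): confirm that `load_dotenv` actually picked up `OPENAI_API_KEY` from your `.env`, without printing the key itself."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# Report only whether the key exists, never its value\n",
"print(\"OPENAI_API_KEY set:\", bool(os.getenv(\"OPENAI_API_KEY\")))"
]
},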
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect to Chroma; use Hugging Face all-MiniLM-L6-v2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
"vectorstore = Chroma(persist_directory=DB_NAME, embedding_function=embeddings)"
]
},
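{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another optional sanity check (a sketch; it assumes the store was populated in the previous notebook). `_collection` is Chroma's underlying private collection object, so this detail may change between `langchain_chroma` versions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Count the stored chunks via the underlying Chroma collection\n",
"print(f\"Chunks in store: {vectorstore._collection.count()}\")\n",
"\n",
"# Spot-check a similarity search directly against the store\n",
"vectorstore.similarity_search(\"insurance products\", k=1)"
]
},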
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up the 2 key LangChain objects: retriever and llm\n",
"\n",
"#### A sidebar on \"temperature\":\n",
"- Controls how diverse the output is\n",
"- A temperature of 0 means that the output should be predictable\n",
"- Higher temperature for more variety in answers\n",
"\n",
"Some people describe temperature as being like 'creativity' but that's not quite right\n",
"- It actually controls which tokens get selected during inference\n",
"- temperature=0 means: always select the token with highest probability\n",
"- temperature=1 usually means: a token with 10% probability should be picked 10% of the time\n",
"\n",
"Note: a temperature of 0 doesn't mean outputs will always be reproducible. You also need to set a random seed. We will do that in weeks 6-8. (Even then, it's not always reproducible.)\n",
"\n",
"Note 2: if you want creativity, use the System Prompt!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"retriever = vectorstore.as_retriever()\n",
"llm = ChatOpenAI(temperature=0, model_name=MODEL)"
]
},
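{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's the quick temperature demonstration promised above (a sketch: `creative_llm`, the prompt, and the value 1.2 are all illustrative). Run it a few times; the temperature=0 answer should barely change between runs, while the high-temperature one should vary."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Compare near-deterministic sampling (temperature=0) with high-temperature sampling\n",
"creative_llm = ChatOpenAI(temperature=1.2, model=MODEL)\n",
"prompt = \"Suggest one name for an insurance chatbot\"\n",
"\n",
"for label, variant in [(\"temperature=0\", llm), (\"temperature=1.2\", creative_llm)]:\n",
"    print(f\"{label}: {variant.invoke(prompt).content}\")"
]
},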
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### These LangChain objects implement the method `invoke()`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"retriever.invoke(\"Who is Avery?\")"
]
},
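{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each result above is a LangChain `Document`. A quick way to see what came back (a sketch; the 100-character preview is arbitrary):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Peek at the retrieved chunks: metadata plus a short preview of each\n",
"for doc in retriever.invoke(\"Who is Avery?\"):\n",
"    print(doc.metadata, \"->\", doc.page_content[:100], \"\\n\")"
]
},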
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm.invoke(\"Who is Avery?\")"
]
},
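{
"cell_type": "markdown",
"metadata": {},
"source": [
"Same call, but inspecting the return value's structure rather than the answer: the text lives on the `AIMessage`'s `content` attribute, which is what `answer_question` below will return."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = llm.invoke(\"Who is Avery?\")\n",
"print(type(response).__name__)  # AIMessage\n",
"print(response.content[:200])"
]
},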
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Time to put this together!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"SYSTEM_PROMPT_TEMPLATE = \"\"\"\n",
"You are a knowledgeable, friendly assistant representing the company Insurellm.\n",
"You are chatting with a user about Insurellm.\n",
"If relevant, use the given context to answer any question.\n",
"If you don't know the answer, say so.\n",
"Context:\n",
"{context}\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def answer_question(question: str, history):\n",
" docs = retriever.invoke(question)\n",
" context = \"\\n\\n\".join(doc.page_content for doc in docs)\n",
" system_prompt = SYSTEM_PROMPT_TEMPLATE.format(context=context)\n",
" response = llm.invoke([SystemMessage(content=system_prompt), HumanMessage(content=question)])\n",
" return response.content"
]
},
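{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before wiring this into a UI, it can help to see the prompt the function actually assembles. A debugging sketch (the sample question and 500-character truncation are arbitrary):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Reproduce the function's first two steps to inspect the assembled system prompt\n",
"docs = retriever.invoke(\"Who is Avery?\")\n",
"context = \"\\n\\n\".join(doc.page_content for doc in docs)\n",
"print(SYSTEM_PROMPT_TEMPLATE.format(context=context)[:500])"
]
},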
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"answer_question(\"Who is Averi Lancaster?\", [])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What could possibly come next? 😂"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"gr.ChatInterface(answer_question).launch()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Admit it - you thought RAG would be more complicated than that!!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}