{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## RAG Day 3\n", "\n", "### Expert Question Answerer for InsureLLM\n", "\n", "LangChain 1.0 implementation of a RAG pipeline.\n", "\n", "Using the VectorStore we created last time (with HuggingFace `all-MiniLM-L6-v2`)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from dotenv import load_dotenv\n", "from langchain_openai import ChatOpenAI\n", "\n", "from langchain_chroma import Chroma\n", "from langchain_core.messages import SystemMessage, HumanMessage\n", "from langchain_huggingface import HuggingFaceEmbeddings\n", "import gradio as gr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "MODEL = \"gpt-4.1-nano\"\n", "DB_NAME = \"vector_db\"\n", "load_dotenv(override=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Connect to Chroma; use Hugging Face all-MiniLM-L6-v2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n", "vectorstore = Chroma(persist_directory=DB_NAME, embedding_function=embeddings)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up the 2 key LangChain objects: retriever and llm\n", "\n", "#### A sidebar on \"temperature\":\n", "- Controls how diverse the output is\n", "- A temperature of 0 means that the output should be predictable\n", "- Higher temperature for more variety in answers\n", "\n", "Some people describe temperature as being like 'creativity' but that's not quite right\n", "- It actually controls which tokens get selected during inference\n", "- temperature=0 means: always select the token with highest probability\n", "- temperature=1 usually means: a token with 10% probability should be picked 10% of the time\n", "\n", "Note: a temperature of 0 doesn't mean outputs will always be reproducible. 
You also need to set a random seed. We will do that in weeks 6-8. (Even then, it's not always reproducible.)\n", "\n", "Note 2: if you want creativity, use the System Prompt!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "retriever = vectorstore.as_retriever()\n", "llm = ChatOpenAI(temperature=0, model=MODEL)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### These LangChain objects implement the method `invoke()`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "retriever.invoke(\"Who is Avery?\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "llm.invoke(\"Who is Avery?\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Time to put this together!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SYSTEM_PROMPT_TEMPLATE = \"\"\"\n", "You are a knowledgeable, friendly assistant representing the company Insurellm.\n", "You are chatting with a user about Insurellm.\n", "Where relevant, use the given context to answer the question.\n", "If you don't know the answer, say so.\n", "Context:\n", "{context}\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def answer_question(question: str, history):\n", "    # history is supplied by gr.ChatInterface but isn't needed here\n", "    docs = retriever.invoke(question)  # fetch the most relevant chunks\n", "    context = \"\\n\\n\".join(doc.page_content for doc in docs)\n", "    system_prompt = SYSTEM_PROMPT_TEMPLATE.format(context=context)\n", "    response = llm.invoke([SystemMessage(content=system_prompt), HumanMessage(content=question)])\n", "    return response.content" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "answer_question(\"Who is Avery Lancaster?\", [])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What could possibly come next? 
😂" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gr.ChatInterface(answer_question, type=\"messages\").launch()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Admit it - you thought RAG would be more complicated than that!!" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" } }, "nbformat": 4, "nbformat_minor": 4 }