Added DeepSeek to weeks 1, 2 and 8
@@ -147,6 +147,7 @@ If you have other keys, you can add them too, or come back to this in future wee
```
GOOGLE_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
DEEPSEEK_API_KEY=xxxx
HF_TOKEN=xxxx
```
@@ -157,6 +157,7 @@ If you have other keys, you can add them too, or come back to this in future wee
```
GOOGLE_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
DEEPSEEK_API_KEY=xxxx
HF_TOKEN=xxxx
```
@@ -146,6 +146,7 @@ If you have other keys, you can add them too, or come back to this in future wee
```
GOOGLE_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
DEEPSEEK_API_KEY=xxxx
HF_TOKEN=xxxx
```
@@ -203,6 +203,46 @@
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"id": "bc7d1de3-e2ac-46ff-a302-3b4ba38c4c90",
"metadata": {},
"source": [
"## Also trying the amazing reasoning model DeepSeek\n",
"\n",
"Here we use the version of DeepSeek-reasoner that's been distilled to 1.5B. \n",
"This is actually a 1.5B variant of Qwen that has been fine-tuned using synthetic data generated by DeepSeek R1.\n",
"\n",
"Other sizes of DeepSeek are [here](https://ollama.com/library/deepseek-r1) all the way up to the full 671B parameter version, which would use up 404GB of your drive and is far too large for most!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cf9eb44e-fe5b-47aa-b719-0bb63669ab3d",
"metadata": {},
"outputs": [],
"source": [
"!ollama pull deepseek-r1:1.5b"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1d3d554b-e00d-4c08-9300-45e073950a76",
"metadata": {},
"outputs": [],
"source": [
"# This may take a few minutes to run! You should then see a fascinating \"thinking\" trace inside <think> tags, followed by some decent definitions\n",
"\n",
"response = ollama_via_openai.chat.completions.create(\n",
"    model=\"deepseek-r1:1.5b\",\n",
"    messages=[{\"role\": \"user\", \"content\": \"Please give definitions of some core concepts behind LLMs: a neural network, attention and the transformer\"}]\n",
")\n",
"\n",
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"id": "1622d9bb-5c68-4d4e-9ca4-b492c751f898",
@@ -27,7 +27,15 @@
"\n",
"Click in the cell below and press Shift+Return to run it. \n",
"If this gives you problems, then please try working through these instructions to address: \n",
"https://chatgpt.com/share/676e6e3b-db44-8012-abaa-b3cf62c83eb3"
"https://chatgpt.com/share/676e6e3b-db44-8012-abaa-b3cf62c83eb3\n",
"\n",
"I've also heard that you might have problems if you are using a work computer that's running the security software Zscaler.\n",
"\n",
"Some advice from students in this situation with Zscaler:\n",
"\n",
"> In the anaconda prompt, this helped sometimes, although still got failures occasionally running code in Jupyter:\n",
"`conda config --set ssl_verify false` \n",
"Another thing that helped was to add `verify=False` wherever there is `requests.get(..)`, so `requests.get(url, headers=headers)` becomes `requests.get(url, headers=headers, verify=False)`"
]
},
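To see what the students' `verify=False` workaround actually disables, here is a standard-library sketch of the equivalent setting; this is the editor's illustration, not part of the course code, and skipping certificate checks should only be done on a network you trust.

```python
import ssl

# An SSL context with certificate verification switched off - the stdlib
# counterpart of passing verify=False to requests.get(...)
context = ssl.create_default_context()
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE

# It can then be passed to urllib:
#   urllib.request.urlopen(url, context=context)
```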
{
week2/day1.ipynb
@@ -69,12 +69,19 @@
"For Anthropic, visit https://console.anthropic.com/ \n",
"For Google, visit https://ai.google.dev/gemini-api \n",
"\n",
"### Also - adding DeepSeek if you wish\n",
"\n",
"Optionally, if you'd like to also use DeepSeek, create an account [here](https://platform.deepseek.com/), create a key [here](https://platform.deepseek.com/api_keys) and top up with at least the minimum $2 [here](https://platform.deepseek.com/top_up).\n",
"\n",
"### Adding API keys to your .env file\n",
"\n",
"When you get your API keys, you need to set them as environment variables by adding them to your `.env` file.\n",
"\n",
"```\n",
"OPENAI_API_KEY=xxxx\n",
"ANTHROPIC_API_KEY=xxxx\n",
"GOOGLE_API_KEY=xxxx\n",
"DEEPSEEK_API_KEY=xxxx\n",
"```\n",
"\n",
"Afterwards, you may need to restart the Jupyter Lab Kernel (the Python process that sits behind this notebook) via the Kernel menu, and then rerun the cells from the top."
@@ -120,7 +127,7 @@
"# Load environment variables in a file called .env\n",
"# Print the key prefixes to help with any debugging\n",
"\n",
"load_dotenv()\n",
"load_dotenv(override=True)\n",
"openai_api_key = os.getenv('OPENAI_API_KEY')\n",
"anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')\n",
"google_api_key = os.getenv('GOOGLE_API_KEY')\n",
@@ -350,6 +357,123 @@
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"id": "33f70c88-7ca9-470b-ad55-d93a57dcc0ab",
"metadata": {},
"source": [
"## (Optional) Trying out the DeepSeek model\n",
"\n",
"### Let's ask DeepSeek a really hard question - both the Chat and the Reasoner model"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3d0019fb-f6a8-45cb-962b-ef8bf7070d4d",
"metadata": {},
"outputs": [],
"source": [
"# Optionally, if you wish to try DeepSeek, you can also use the OpenAI client library\n",
"\n",
"deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')\n",
"\n",
"if deepseek_api_key:\n",
"    print(f\"DeepSeek API Key exists and begins {deepseek_api_key[:3]}\")\n",
"else:\n",
"    print(\"DeepSeek API Key not set - please skip to the next section if you don't wish to try the DeepSeek API\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c72c871e-68d6-4668-9c27-96d52b77b867",
"metadata": {},
"outputs": [],
"source": [
"# Using DeepSeek Chat\n",
"\n",
"deepseek_via_openai_client = OpenAI(\n",
"    api_key=deepseek_api_key, \n",
"    base_url=\"https://api.deepseek.com\"\n",
")\n",
"\n",
"response = deepseek_via_openai_client.chat.completions.create(\n",
"    model=\"deepseek-chat\",\n",
"    messages=prompts,\n",
")\n",
"\n",
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "50b6e70f-700a-46cf-942f-659101ffeceb",
"metadata": {},
"outputs": [],
"source": [
"challenge = [{\"role\": \"system\", \"content\": \"You are a helpful assistant\"},\n",
"             {\"role\": \"user\", \"content\": \"How many words are there in your answer to this prompt\"}]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "66d1151c-2015-4e37-80c8-16bc16367cfe",
"metadata": {},
"outputs": [],
"source": [
"# Using DeepSeek Chat with a harder question! And streaming results\n",
"\n",
"stream = deepseek_via_openai_client.chat.completions.create(\n",
"    model=\"deepseek-chat\",\n",
"    messages=challenge,\n",
"    stream=True\n",
")\n",
"\n",
"reply = \"\"\n",
"display_handle = display(Markdown(\"\"), display_id=True)\n",
"for chunk in stream:\n",
"    reply += chunk.choices[0].delta.content or ''\n",
"    reply = reply.replace(\"```\",\"\").replace(\"markdown\",\"\")\n",
"    update_display(Markdown(reply), display_id=display_handle.display_id)\n",
"\n",
"print(\"Number of words:\", len(reply.split(\" \")))"
]
},
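One caveat on the word count used above: splitting on a single space counts the empty strings left by repeated spaces or newlines, whereas `str.split()` with no argument splits on any run of whitespace. A small illustration (editor's sketch, not from the notebook):

```python
# split(" ") keeps empty-string artifacts; split() collapses whitespace runs
reply = "one  two three"           # note the double space
words_naive = len(reply.split(" "))  # ['one', '', 'two', 'three'] -> 4
words_clean = len(reply.split())     # ['one', 'two', 'three']     -> 3
print(words_naive, words_clean)
```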
{
"cell_type": "code",
"execution_count": null,
"id": "43a93f7d-9300-48cc-8c1a-ee67380db495",
"metadata": {},
"outputs": [],
"source": [
"# Using DeepSeek Reasoner - this may hit an error if DeepSeek is busy\n",
"# It's over-subscribed (as of 28-Jan-2025) but should come back online soon!\n",
"# If this fails, come back to this in a few days..\n",
"\n",
"response = deepseek_via_openai_client.chat.completions.create(\n",
"    model=\"deepseek-reasoner\",\n",
"    messages=challenge\n",
")\n",
"\n",
"reasoning_content = response.choices[0].message.reasoning_content\n",
"content = response.choices[0].message.content\n",
"\n",
"print(reasoning_content)\n",
"print(content)\n",
"print(\"Number of words:\", len(content.split(\" \")))"
]
},
{
"cell_type": "markdown",
"id": "c09e6b5c-6816-4cd3-a5cd-a20e4171b1a0",
"metadata": {},
"source": [
"## Back to OpenAI with a serious question"
]
},
{
"cell_type": "code",
"execution_count": null,

@@ -23,11 +23,19 @@ class FrontierAgent(Agent):

    def __init__(self, collection):
        """
        Set up this instance by connecting to OpenAI, to the Chroma Datastore,
        Set up this instance by connecting to OpenAI or DeepSeek, to the Chroma Datastore,
        And setting up the vector encoding model
        """
        self.log("Initializing Frontier Agent")
        self.openai = OpenAI()
        deepseek_api_key = os.getenv("DEEPSEEK_API_KEY")
        if deepseek_api_key:
            self.client = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com")
            self.MODEL = "deepseek-chat"
            self.log("Frontier Agent is set up with DeepSeek")
        else:
            self.client = OpenAI()
            self.MODEL = "gpt-4o-mini"
            self.log("Frontier Agent is setting up with OpenAI")
        self.collection = collection
        self.model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
        self.log("Frontier Agent is ready")
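The selection logic in `__init__` above boils down to: prefer DeepSeek when a key is present, otherwise fall back to gpt-4o-mini. A minimal sketch of that decision as a standalone helper (the function name is hypothetical, not part of the repo):

```python
import os

def choose_backend():
    """Return (base_url, model); a base_url of None means the default OpenAI endpoint."""
    if os.getenv("DEEPSEEK_API_KEY"):
        # Key present: route the OpenAI client library at DeepSeek's endpoint
        return "https://api.deepseek.com", "deepseek-chat"
    # No key: use OpenAI directly
    return None, "gpt-4o-mini"
```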
@@ -85,14 +93,14 @@ class FrontierAgent(Agent):

    def price(self, description: str) -> float:
        """
        Make a call to OpenAI to estimate the price of the described product,
        Make a call to OpenAI or DeepSeek to estimate the price of the described product,
        by looking up 5 similar products and including them in the prompt to give context
        :param description: a description of the product
        :return: an estimate of the price
        """
        documents, prices = self.find_similars(description)
        self.log("Frontier Agent is about to call OpenAI with context including 5 similar products")
        response = self.openai.chat.completions.create(
        self.log(f"Frontier Agent is about to call {self.MODEL} with context including 5 similar products")
        response = self.client.chat.completions.create(
            model=self.MODEL,
            messages=self.messages_for(description, documents, prices),
            seed=42,
@@ -209,7 +209,7 @@
"metadata": {},
"outputs": [],
"source": [
"test[1].prompt"
"print(test[1].prompt)"
]
},
{
@@ -255,6 +255,16 @@
"    return float(match.group()) if match else 0"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "06743833-c362-47f8-b02a-139be2cd52ab",
"metadata": {},
"outputs": [],
"source": [
"get_price(\"The price for this is $99.99\")"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -306,6 +316,86 @@
"Tester.test(gpt_4o_mini_rag, test)"
]
},
{
"cell_type": "markdown",
"id": "d793c6d0-ce3f-4680-b37d-4643f0cd1d8e",
"metadata": {},
"source": [
"## Optional Extra: Trying a DeepSeek API call instead of OpenAI\n",
"\n",
"If you have a DeepSeek API key, we will use it here as an alternative implementation; otherwise skip to the next section.."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "21b6a22f-0195-47b6-8f6d-cab6ebe05742",
"metadata": {},
"outputs": [],
"source": [
"# Connect to DeepSeek using the OpenAI client python library\n",
"\n",
"deepseek_api_key = os.getenv(\"DEEPSEEK_API_KEY\")\n",
"deepseek_via_openai_client = OpenAI(api_key=deepseek_api_key, base_url=\"https://api.deepseek.com\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ea7267d6-9489-4dac-a6e0-aec108e788c2",
"metadata": {},
"outputs": [],
"source": [
"# Added some retry logic here because DeepSeek is very oversubscribed and sometimes fails..\n",
"\n",
"def deepseek_api_rag(item):\n",
"    documents, prices = find_similars(item)\n",
"    retries = 8\n",
"    done = False\n",
"    reply = \"\"\n",
"    while not done and retries > 0:\n",
"        try:\n",
"            response = deepseek_via_openai_client.chat.completions.create(\n",
"                model=\"deepseek-chat\", \n",
"                messages=messages_for(item, documents, prices),\n",
"                seed=42,\n",
"                max_tokens=8\n",
"            )\n",
"            reply = response.choices[0].message.content\n",
"            done = True\n",
"        except Exception as e:\n",
"            print(f\"Error: {e}\")\n",
"            retries -= 1\n",
"    return get_price(reply)"
]
},
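The retry loop in the cell above can be factored into a small reusable helper; a sketch under the editor's naming (`with_retries` is hypothetical, not part of the repo):

```python
import time

def with_retries(fn, retries=8, delay=0):
    """Call a zero-argument callable up to `retries` times; re-raise the last error if all fail."""
    last_error = None
    for _ in range(retries):
        try:
            return fn()
        except Exception as e:
            last_error = e
            if delay:
                time.sleep(delay)  # optional back-off between attempts
    raise last_error
```

It would be used like `with_retries(lambda: deepseek_via_openai_client.chat.completions.create(...))`, keeping the retry policy out of the RAG function itself.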
{
"cell_type": "code",
"execution_count": null,
"id": "6560faf2-4dec-41e5-95e2-b2c46cdb3ba8",
"metadata": {},
"outputs": [],
"source": [
"deepseek_api_rag(test[1])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0578b116-869f-429d-8382-701f1c0882f3",
"metadata": {},
"outputs": [],
"source": [
"Tester.test(deepseek_api_rag, test)"
]
},
{
"cell_type": "markdown",
"id": "6739870f-1eec-4547-965d-4b594e685697",
"metadata": {},
"source": [
"## And now to wrap this in an \"Agent\" class"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -316,6 +406,20 @@
"from agents.frontier_agent import FrontierAgent"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2efa7ba9-c2d7-4f95-8bb5-c4295bbeb01f",
"metadata": {},
"outputs": [],
"source": [
"# Let's print the logs so we can see what's going on\n",
"\n",
"import logging\n",
"root = logging.getLogger()\n",
"root.setLevel(logging.INFO)"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -141,7 +141,9 @@
"source": [
"# Running the final product\n",
"\n",
"## Just hit shift + enter in the next cell, and let the deals flow in!!"
"## Just hit shift + enter in the next cell, and let the deals flow in!!\n",
"\n",
"Note that the Frontier Agent will use DeepSeek if there's a DEEPSEEK_API_KEY in your .env file, otherwise gpt-4o-mini."
]
},
{