Updated models, added vector tests

This commit is contained in:
Edward Donner
2025-05-05 08:35:47 -04:00
parent e5816c514f
commit 1fb53d70de
7 changed files with 132 additions and 195 deletions

@@ -226,9 +226,9 @@
"metadata": {},
"outputs": [],
"source": [
"# GPT-3.5-Turbo\n",
"# GPT-4o-mini\n",
"\n",
"completion = openai.chat.completions.create(model='gpt-3.5-turbo', messages=prompts)\n",
"completion = openai.chat.completions.create(model='gpt-4o-mini', messages=prompts)\n",
"print(completion.choices[0].message.content)"
]
},
@@ -239,17 +239,33 @@
"metadata": {},
"outputs": [],
"source": [
"# GPT-4o-mini\n",
"# GPT-4.1-mini\n",
"# Temperature setting controls creativity\n",
"\n",
"completion = openai.chat.completions.create(\n",
" model='gpt-4o-mini',\n",
" model='gpt-4.1-mini',\n",
" messages=prompts,\n",
" temperature=0.7\n",
")\n",
"print(completion.choices[0].message.content)"
]
},
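The `temperature` comment above ("controls creativity") can be made concrete: sampling temperature divides the model's next-token logits before the softmax, so low values sharpen the distribution toward the top token and high values flatten it toward uniform. A minimal, self-contained sketch of that arithmetic (the logit values are made up for illustration):

```python
import math

def softmax(logits, temperature):
    # Divide logits by temperature before normalizing:
    # T < 1 sharpens the distribution, T > 1 flattens it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]      # hypothetical next-token scores
cold = softmax(logits, 0.2)   # near-deterministic sampling
warm = softmax(logits, 1.5)   # more varied, "creative" sampling
print(cold[0], warm[0])
```

At temperature 0.2 the top token takes almost all the probability mass; at 1.5 the alternatives become much more likely, which is why higher temperatures read as more creative.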
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "12d2a549-9d6e-4ea0-9c3e-b96a39e9959e",
+"metadata": {},
+"outputs": [],
+"source": [
+"# GPT-4.1-nano - extremely fast and cheap\n",
+"\n",
+"completion = openai.chat.completions.create(\n",
+" model='gpt-4.1-nano',\n",
+" messages=prompts\n",
+")\n",
+"print(completion.choices[0].message.content)"
+]
+},
{
"cell_type": "code",
"execution_count": null,
@@ -257,16 +273,34 @@
"metadata": {},
"outputs": [],
"source": [
"# GPT-4o\n",
"# GPT-4.1\n",
"\n",
"completion = openai.chat.completions.create(\n",
" model='gpt-4o',\n",
" model='gpt-4.1',\n",
" messages=prompts,\n",
" temperature=0.4\n",
")\n",
"print(completion.choices[0].message.content)"
]
},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "96232ef4-dc9e-430b-a9df-f516685e7c9a",
+"metadata": {},
+"outputs": [],
+"source": [
+"# If you have access to this, here is the reasoning model o3-mini\n",
+"# This is trained to think through its response before replying\n",
+"# So it will take longer, but the answer should be more reasoned - not that it always helps...\n",
+"\n",
+"completion = openai.chat.completions.create(\n",
+" model='o3-mini',\n",
+" messages=prompts\n",
+")\n",
+"print(completion.choices[0].message.content)"
+]
+},
{
"cell_type": "code",
"execution_count": null,
@@ -365,7 +399,8 @@
"outputs": [],
"source": [
"# As an alternative way to use Gemini that bypasses Google's python API library,\n",
"# Google has recently released new endpoints that means you can use Gemini via the client libraries for OpenAI!\n",
"# Google released endpoints that means you can use Gemini via the client libraries for OpenAI!\n",
"# We're also trying Gemini's latest reasoning/thinking model\n",
"\n",
"gemini_via_openai_client = OpenAI(\n",
" api_key=google_api_key, \n",
@@ -373,7 +408,7 @@
")\n",
"\n",
"response = gemini_via_openai_client.chat.completions.create(\n",
" model=\"gemini-2.0-flash\",\n",
" model=\"gemini-2.5-flash-preview-04-17\",\n",
" messages=prompts\n",
")\n",
"print(response.choices[0].message.content)"
@@ -488,6 +523,33 @@
"print(\"Number of words:\", len(content.split(\" \")))"
]
},
+{
+"cell_type": "markdown",
+"id": "cbf0d5dd-7f20-4090-a46d-da56ceec218f",
+"metadata": {},
+"source": [
+"## Additional exercise to build your experience with the models\n",
+"\n",
+"This is optional, but if you have time, it's so great to get first-hand experience with the capabilities of these different models.\n",
+"\n",
+"You could go back and ask the same question via the APIs above to get your own personal experience with the pros & cons of the models.\n",
+"\n",
+"Later in the course we'll look at benchmarks and compare LLMs on many dimensions. But nothing beats personal experience!\n",
+"\n",
+"Here are some questions to try:\n",
+"1. The question above: \"How many words are there in your answer to this prompt?\"\n",
+"2. A creative question: \"In 3 sentences, describe the color Blue to someone who's never been able to see\"\n",
+"3. A student (thank you, Roman) sent me this wonderful riddle, which apparently children can usually answer but adults struggle with: \"On a bookshelf, two volumes of Pushkin stand side by side: the first and the second. The pages of each volume together have a thickness of 2 cm, and each cover is 2 mm thick. A worm gnawed (perpendicular to the pages) from the first page of the first volume to the last page of the second volume. What distance did it gnaw through?\"\n",
+"\n",
+"The answer may not be what you expect, and even though I'm quite good at puzzles, I'm embarrassed to admit that I got this one wrong.\n",
+"\n",
+"### What to look out for as you experiment with models\n",
+"\n",
+"1. How the Chat models differ from the Reasoning models (also known as Thinking models)\n",
+"2. The ability to solve problems and the ability to be creative\n",
+"3. Speed of generation\n"
+]
+},
{
"cell_type": "markdown",
"id": "c09e6b5c-6816-4cd3-a5cd-a20e4171b1a0",
@@ -762,7 +824,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
"version": "3.11.12"
}
},
"nbformat": 4,