Minor updates including pinning datasets version

This commit is contained in:
Edward Donner
2025-07-12 15:26:07 -04:00
parent 8a8493e387
commit a10872469d
17 changed files with 104 additions and 46 deletions

View File

@@ -17,16 +17,13 @@ dependencies:
- scikit-learn
- chromadb
- jupyter-dash
- sentencepiece
- pyarrow
- pip:
- beautifulsoup4
- plotly
- bitsandbytes
- transformers
- sentence-transformers
- datasets
- accelerate
- datasets==3.6.0
- openai
- anthropic
- google-generativeai

View File

@@ -346,7 +346,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -14,18 +14,15 @@ tqdm
openai
gradio
langchain
tiktoken
langchain-core
langchain-text-splitters
langchain-openai
langchain_experimental
langchain_chroma
langchain[docarray]
datasets
sentencepiece
langchain-chroma
langchain-community
datasets==3.6.0
matplotlib
google-generativeai
anthropic
scikit-learn
unstructured
chromadb
plotly
jupyter-dash
@@ -33,9 +30,6 @@ beautifulsoup4
pydub
modal
ollama
accelerate
sentencepiece
bitsandbytes
psutil
setuptools
speedtest-cli

View File

@@ -290,12 +290,12 @@
"metadata": {},
"outputs": [],
"source": [
"# If you have access to this, here is the reasoning model o3-mini\n",
"# If you have access to this, here is the reasoning model o4-mini\n",
"# This is trained to think through its response before replying\n",
"# So it will take longer but the answer should be more reasoned - not that this helps...\n",
"\n",
"completion = openai.chat.completions.create(\n",
" model='o3-mini',\n",
" model='o4-mini',\n",
" messages=prompts\n",
")\n",
"print(completion.choices[0].message.content)"
@@ -308,12 +308,12 @@
"metadata": {},
"outputs": [],
"source": [
"# Claude 3.7 Sonnet\n",
"# Claude 4.0 Sonnet\n",
"# API needs system message provided separately from user prompt\n",
"# Also adding max_tokens\n",
"\n",
"message = claude.messages.create(\n",
" model=\"claude-3-7-sonnet-latest\",\n",
" model=\"claude-sonnet-4-20250514\",\n",
" max_tokens=200,\n",
" temperature=0.7,\n",
" system=system_message,\n",
@@ -332,12 +332,12 @@
"metadata": {},
"outputs": [],
"source": [
"# Claude 3.7 Sonnet again\n",
"# Claude 4.0 Sonnet again\n",
"# Now let's add in streaming back results\n",
"# If the streaming looks strange, then please see the note below this cell!\n",
"\n",
"result = claude.messages.stream(\n",
" model=\"claude-3-7-sonnet-latest\",\n",
" model=\"claude-sonnet-4-20250514\",\n",
" max_tokens=200,\n",
" temperature=0.7,\n",
" system=system_message,\n",
@@ -408,12 +408,28 @@
")\n",
"\n",
"response = gemini_via_openai_client.chat.completions.create(\n",
" model=\"gemini-2.5-flash-preview-04-17\",\n",
" model=\"gemini-2.5-flash\",\n",
" messages=prompts\n",
")\n",
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"id": "492f0ff2-8581-4836-bf00-37fddbe120eb",
"metadata": {},
"source": [
"# Sidenote:\n",
"\n",
"This alternative approach of using the client library from OpenAI to connect with other models has become extremely popular in recent months.\n",
"\n",
"So much so, that all the models now support this approach - including Anthropic.\n",
"\n",
"You can read more about this approach, with 4 examples, in the first section of this guide:\n",
"\n",
"https://github.com/ed-donner/agents/blob/main/guides/09_ai_apis_and_ollama.ipynb"
]
},
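The "one client, many providers" pattern in the sidenote above can be sketched as follows: only the `base_url` and `api_key` change between providers, while the message format and the `chat.completions.create` call stay identical. The endpoint URLs below are assumptions current at the time of writing; verify them against each provider's documentation.

```python
# Minimal sketch of using the OpenAI python client against
# OpenAI-compatible endpoints from other providers.
# (Endpoint URLs are assumptions -- check each provider's docs.)

OPENAI_COMPATIBLE_ENDPOINTS = {
    "gemini": "https://generativelanguage.googleapis.com/v1beta/openai/",
    "anthropic": "https://api.anthropic.com/v1/",
    "ollama": "http://localhost:11434/v1",
}

def build_prompts(system_message, user_prompt):
    """Assemble the OpenAI-style message list shared by every provider."""
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_prompt},
    ]

prompts = build_prompts("You are a helpful assistant", "Say hello")

# The actual call (requires the openai package and a valid key):
# from openai import OpenAI
# client = OpenAI(api_key=your_key,
#                 base_url=OPENAI_COMPATIBLE_ENDPOINTS["gemini"])
# response = client.chat.completions.create(
#     model="gemini-2.5-flash", messages=prompts)
```

The same `prompts` list works unchanged for each entry in the dict; that interchangeability is the whole appeal of the approach.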
{
"cell_type": "markdown",
"id": "33f70c88-7ca9-470b-ad55-d93a57dcc0ab",
@@ -583,7 +599,7 @@
"# Have it stream back results in markdown\n",
"\n",
"stream = openai.chat.completions.create(\n",
" model='gpt-4o-mini',\n",
" model='gpt-4.1-mini',\n",
" messages=prompts,\n",
" temperature=0.7,\n",
" stream=True\n",
@@ -634,11 +650,11 @@
"metadata": {},
"outputs": [],
"source": [
"# Let's make a conversation between GPT-4o-mini and Claude-3-haiku\n",
"# Let's make a conversation between GPT-4.1-mini and Claude-3.5-haiku\n",
"# We're using cheap versions of models so the costs will be minimal\n",
"\n",
"gpt_model = \"gpt-4o-mini\"\n",
"claude_model = \"claude-3-haiku-20240307\"\n",
"gpt_model = \"gpt-4.1-mini\"\n",
"claude_model = \"claude-3-5-haiku-latest\"\n",
"\n",
"gpt_system = \"You are a chatbot who is very argumentative; \\\n",
"you disagree with anything in the conversation and you challenge everything, in a snarky way.\"\n",
@@ -774,6 +790,19 @@
"\n",
"Try creating a 3-way, perhaps bringing Gemini into the conversation! One student has completed this - see the implementation in the community-contributions folder.\n",
"\n",
"The most reliable way to do this involves thinking a bit differently about your prompts: just 1 system prompt and 1 user prompt each time, and in the user prompt list the full conversation so far.\n",
"\n",
"Something like:\n",
"\n",
"```python\n",
"user_prompt = f\"\"\"\n",
" You are Alex, in conversation with Blake and Charlie.\n",
" The conversation so far is as follows:\n",
" {conversation}\n",
" Now with this, respond with what you would like to say next, as Alex.\n",
" \"\"\"\n",
"```\n",
"\n",
"Try doing this yourself before you look at the solutions. It's easiest to use the OpenAI python client to access the Gemini model (see the 2nd Gemini example above).\n",
"\n",
"## Additional exercise\n",
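The "1 system prompt, 1 user prompt" approach above can be sketched as a round-robin loop. Here `call_model` is a hypothetical stand-in for whichever client you use (OpenAI, Anthropic, or Gemini via the OpenAI client); the helper names are illustrative, not part of the course code.

```python
# Sketch of a 3-way conversation loop using the prompt style above.
# `call_model` is a hypothetical stand-in for a real API call.

def build_user_prompt(speaker, others, conversation):
    """One user prompt containing the full conversation so far."""
    return f"""
You are {speaker}, in conversation with {" and ".join(others)}.
The conversation so far is as follows:
{conversation}
Now with this, respond with what you would like to say next, as {speaker}.
"""

def run_round(speakers, conversation, call_model):
    """Give each speaker one turn, appending replies to the transcript."""
    for speaker in speakers:
        others = [s for s in speakers if s != speaker]
        reply = call_model(build_user_prompt(speaker, others, conversation))
        conversation += f"{speaker}: {reply}\n"
    return conversation

# Example with a canned stand-in for a real model call:
transcript = run_round(["Alex", "Blake", "Charlie"], "", lambda p: "Hi there!")
```

Because each turn re-sends the whole transcript, every model sees the same context without any per-model message bookkeeping.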
@@ -824,7 +853,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -568,7 +568,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -301,7 +301,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -386,7 +386,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -92,10 +92,24 @@
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python"
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -256,7 +256,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -27,6 +27,20 @@
"import gradio as gr"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "94a564ed-5cda-42d9-aada-2a5e85d02d15",
"metadata": {},
"outputs": [],
"source": [
"# install faiss-cpu!\n",
"# Mac users - this may fail if you don't have a recent version of macOS\n",
"# In which case I recommend you skip this lab -- FAISS is not essential! (Or upgrade macOS if you wish.)\n",
"\n",
"!pip install faiss-cpu"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -400,7 +414,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -102,6 +102,18 @@
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"id": "cd6d801e-d195-45fe-898e-495dbcb19d7d",
"metadata": {},
"source": [
"## Load our dataset\n",
"\n",
"In the next cell, we load in the dataset from huggingface.\n",
"\n",
"If this gives you an error like \"trust_remote_code is no longer supported\", then please run this command in a new cell: `!pip install datasets==3.6.0` and then restart the Kernel, and try again."
]
},
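The pin described above can be checked before reinstalling. A minimal sketch, assuming the `3.6.0` pin from this commit; the helper name is illustrative:

```python
# Hedged sketch: detect whether the installed datasets library
# matches the pinned version before re-running load_dataset.
import importlib.metadata

def needs_pin(required="3.6.0"):
    """True if datasets is missing or not at the pinned version."""
    try:
        return importlib.metadata.version("datasets") != required
    except importlib.metadata.PackageNotFoundError:
        return True

# In a notebook cell you might then run:
# if needs_pin():
#     %pip install datasets==3.6.0
#     # ...then restart the kernel and try load_dataset again
```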
{
"cell_type": "code",
"execution_count": null,
@@ -109,8 +121,6 @@
"metadata": {},
"outputs": [],
"source": [
"# Load in our dataset\n",
"\n",
"dataset = load_dataset(\"McAuley-Lab/Amazon-Reviews-2023\", f\"raw_meta_Appliances\", split=\"full\", trust_remote_code=True)"
]
},
@@ -429,7 +439,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -119,7 +119,7 @@
"source": [
"# Load in the same dataset as last time\n",
"\n",
"items = ItemLoader(\"Appliances\").load()"
"items = ItemLoader(\"Home_and_Kitchen\").load()"
]
},
{
@@ -624,7 +624,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -918,7 +918,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -398,7 +398,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -427,7 +427,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -227,7 +227,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
"version": "3.11.13"
}
},
"nbformat": 4,

View File

@@ -171,7 +171,7 @@
" <span style=\"color:#900;\">If you're not fed up with product prices yet 😂 I've built this out some more!<br/>\n",
" If you look in my repo <a href=\"https://github.com/ed-donner/tech2ai\">tech2ai</a>, in segment3/lab1 is a neural network implementation of the pricer in pure PyTorch. It does pretty well..<br/>\n",
" And if you look in my repo <a href=\"https://github.com/ed-donner/agentic\">Agentic</a>, the workshop folder contains the same Agent project taken further. There's a new version of the PlanningAgent called AutonomousPlanningAgent that uses multiple Tools, and a MessagingAgent that uses claude-3.7 to write texts. The AutonomousPlanningAgent uses the fantastic OpenAI Agents SDK and the mighty MCP protocol from Anthropic.<br/>\n",
" If you're intrigued by Agents and MCP, and would like to learn more, then I also have a <a href=\"https://www.udemy.com/course/the-complete-agentic-ai-engineering-course/?referralCode=1B6986CDBD91FFC3651A\">companion course called the Complete Agentic AI Engineering Course</a> that might interest you (if you haven't had enough of me by now!!)\n",
" If you're intrigued by Agents and MCP, and would like to learn more, then I also have a <a href=\"https://edwarddonner.com/2025/05/28/connecting-my-courses-become-an-llm-expert-and-leader/\">companion course called the Complete Agentic AI Engineering Course</a> that might interest you (if you haven't had enough of me by now!!), and also another course for leaders and founders looking to build a valuable business with LLMs.\n",
" </span>\n",
" </td>\n",
" </tr>\n",
@@ -223,7 +223,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
"version": "3.11.13"
}
},
"nbformat": 4,