{
"cells": [
{
"cell_type": "markdown",
"id": "06cf3063-9f3e-4551-a0d5-f08d9cabb927",
"metadata": {},
"source": [
"# Welcome to Week 2!\n",
"\n",
"## Frontier Model APIs\n",
"\n",
"In Week 1, we used multiple Frontier LLMs through their Chat UIs, and we connected with OpenAI's API.\n",
"\n",
"Today we'll connect with them through their APIs."
]
},
{
"cell_type": "markdown",
"id": "2b268b6e-0ba4-461e-af86-74a41f4d681f",
"metadata": {},
"source": [
"<table style=\"margin: 0; text-align: left;\">\n",
" <tr>\n",
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
" <img src=\"../assets/important.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
" </td>\n",
" <td>\n",
" <h2 style=\"color:#900;\">Important Note - Please read me</h2>\n",
" <span style=\"color:#900;\">I'm continually improving these labs, adding more examples and exercises.\n",
" At the start of each week, it's worth checking you have the latest code.<br/>\n",
" First do a git pull and merge your changes as needed. Check out the GitHub guide for instructions. Any problems? Try asking ChatGPT to clarify how to merge - or contact me!<br/>\n",
" </span>\n",
" </td>\n",
" </tr>\n",
"</table>\n",
"<table style=\"margin: 0; text-align: left;\">\n",
" <tr>\n",
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
" <img src=\"../assets/resources.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
" </td>\n",
" <td>\n",
" <h2 style=\"color:#f71;\">Reminder about the resources page</h2>\n",
" <span style=\"color:#f71;\">Here's a link to resources for the course. This includes links to all the slides.<br/>\n",
" <a href=\"https://edwarddonner.com/2024/11/13/llm-engineering-resources/\">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>\n",
" Please keep this bookmarked, and I'll continue to add more useful links there over time.\n",
" </span>\n",
" </td>\n",
" </tr>\n",
"</table>"
]
},
{
"cell_type": "markdown",
"id": "85cfe275-4705-4d30-abea-643fbddf1db0",
"metadata": {},
"source": [
"## Setting up your keys - OPTIONAL!\n",
"\n",
"We're now going to try asking a bunch of models some questions!\n",
"\n",
"This is totally optional. If you have keys to Anthropic, Gemini or others, then you can add them in.\n",
"\n",
"If you'd rather not spend the extra, then just watch me do it!\n",
"\n",
"For OpenAI, visit https://openai.com/api/ \n",
"For Anthropic, visit https://console.anthropic.com/ \n",
"For Google, visit https://ai.google.dev/gemini-api \n",
"For DeepSeek, visit https://platform.deepseek.com/ \n",
"For Groq, visit https://console.groq.com/ \n",
"For Grok, visit https://console.x.ai/ \n",
"\n",
"\n",
"You can also use OpenRouter as your one-stop-shop for many of these! OpenRouter is \"the unified interface for LLMs\":\n",
"\n",
"For OpenRouter, visit https://openrouter.ai/ \n",
"\n",
"\n",
"With each of the above, you typically have to navigate to:\n",
"1. Their billing page to add the minimum top-up (though Google Gemini, Groq and OpenRouter may have free tiers)\n",
"2. Their API key page to collect your API key\n",
"\n",
"### Adding API keys to your .env file\n",
"\n",
"When you get your API keys, you need to set them as environment variables by adding them to your `.env` file.\n",
"\n",
"```\n",
"OPENAI_API_KEY=xxxx\n",
"ANTHROPIC_API_KEY=xxxx\n",
"GOOGLE_API_KEY=xxxx\n",
"DEEPSEEK_API_KEY=xxxx\n",
"GROQ_API_KEY=xxxx\n",
"GROK_API_KEY=xxxx\n",
"OPENROUTER_API_KEY=xxxx\n",
"```\n",
"\n",
"<table style=\"margin: 0; text-align: left;\">\n",
" <tr>\n",
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
" <img src=\"../assets/important.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
" </td>\n",
" <td>\n",
" <h2 style=\"color:#900;\">Any time you change your .env file</h2>\n",
" <span style=\"color:#900;\">Remember to Save it! And also rerun load_dotenv(override=True)<br/>\n",
" </span>\n",
" </td>\n",
" </tr>\n",
"</table>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "de23bb9e-37c5-4377-9a82-d7b6c648eeb6",
"metadata": {},
"outputs": [],
"source": [
"# imports\n",
"\n",
"import os\n",
"import requests\n",
"from dotenv import load_dotenv\n",
"from openai import OpenAI\n",
"from IPython.display import Markdown, display"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b0abffac",
"metadata": {},
"outputs": [],
"source": [
"load_dotenv(override=True)\n",
"openai_api_key = os.getenv('OPENAI_API_KEY')\n",
"anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')\n",
"google_api_key = os.getenv('GOOGLE_API_KEY')\n",
"deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')\n",
"groq_api_key = os.getenv('GROQ_API_KEY')\n",
"grok_api_key = os.getenv('GROK_API_KEY')\n",
"openrouter_api_key = os.getenv('OPENROUTER_API_KEY')\n",
"\n",
"if openai_api_key:\n",
"    print(f\"OpenAI API Key exists and begins {openai_api_key[:8]}\")\n",
"else:\n",
"    print(\"OpenAI API Key not set\")\n",
"\n",
"if anthropic_api_key:\n",
"    print(f\"Anthropic API Key exists and begins {anthropic_api_key[:7]}\")\n",
"else:\n",
"    print(\"Anthropic API Key not set (and this is optional)\")\n",
"\n",
"if google_api_key:\n",
"    print(f\"Google API Key exists and begins {google_api_key[:2]}\")\n",
"else:\n",
"    print(\"Google API Key not set (and this is optional)\")\n",
"\n",
"if deepseek_api_key:\n",
"    print(f\"DeepSeek API Key exists and begins {deepseek_api_key[:3]}\")\n",
"else:\n",
"    print(\"DeepSeek API Key not set (and this is optional)\")\n",
"\n",
"if groq_api_key:\n",
"    print(f\"Groq API Key exists and begins {groq_api_key[:4]}\")\n",
"else:\n",
"    print(\"Groq API Key not set (and this is optional)\")\n",
"\n",
"if grok_api_key:\n",
"    print(f\"Grok API Key exists and begins {grok_api_key[:4]}\")\n",
"else:\n",
"    print(\"Grok API Key not set (and this is optional)\")\n",
"\n",
"if openrouter_api_key:\n",
"    print(f\"OpenRouter API Key exists and begins {openrouter_api_key[:3]}\")\n",
"else:\n",
"    print(\"OpenRouter API Key not set (and this is optional)\")\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "985a859a",
"metadata": {},
"outputs": [],
"source": [
"# Connect to the OpenAI client library\n",
"# A thin wrapper around calls to HTTP endpoints\n",
"\n",
"openai = OpenAI()\n",
"\n",
"# For the other providers, we can also use the OpenAI python client\n",
"# Because they all offer endpoints compatible with OpenAI's\n",
"# And the OpenAI client allows you to change the base_url\n",
"\n",
"anthropic_url = \"https://api.anthropic.com/v1/\"\n",
"gemini_url = \"https://generativelanguage.googleapis.com/v1beta/openai/\"\n",
"deepseek_url = \"https://api.deepseek.com\"\n",
"groq_url = \"https://api.groq.com/openai/v1\"\n",
"grok_url = \"https://api.x.ai/v1\"\n",
"openrouter_url = \"https://openrouter.ai/api/v1\"\n",
"ollama_url = \"http://localhost:11434/v1\"\n",
"\n",
"anthropic = OpenAI(api_key=anthropic_api_key, base_url=anthropic_url)\n",
"gemini = OpenAI(api_key=google_api_key, base_url=gemini_url)\n",
"deepseek = OpenAI(api_key=deepseek_api_key, base_url=deepseek_url)\n",
"groq = OpenAI(api_key=groq_api_key, base_url=groq_url)\n",
"grok = OpenAI(api_key=grok_api_key, base_url=grok_url)\n",
"openrouter = OpenAI(api_key=openrouter_api_key, base_url=openrouter_url)\n",
"ollama = OpenAI(api_key=\"ollama\", base_url=ollama_url)"
]
},
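{
"cell_type": "markdown",
"id": "7c1a9b2e",
"metadata": {},
"source": [
"Since every client above speaks the same chat-completions dialect, the call-and-display pattern we'll repeat below can be wrapped in one small helper. This is just a convenience sketch of my own (the `ask` name isn't part of any library) - the cells below spell each call out in full so you can see exactly what's happening."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3e8f4a21",
"metadata": {},
"outputs": [],
"source": [
"# Optional convenience helper - wraps the create-then-display pattern used below\n",
"\n",
"def ask(client, model, messages, **kwargs):\n",
"    response = client.chat.completions.create(model=model, messages=messages, **kwargs)\n",
"    display(Markdown(response.choices[0].message.content))\n",
"    return response\n",
"\n",
"# Example usage:\n",
"# ask(openai, \"gpt-4.1-mini\", [{\"role\": \"user\", \"content\": \"Say hi!\"}])"
]
},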
{
"cell_type": "code",
"execution_count": null,
"id": "16813180",
"metadata": {},
"outputs": [],
"source": [
"tell_a_joke = [\n",
"    {\"role\": \"user\", \"content\": \"Tell a joke for a student on the journey to becoming an expert in LLM Engineering\"},\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "23e92304",
"metadata": {},
"outputs": [],
"source": [
"response = openai.chat.completions.create(model=\"gpt-4.1-mini\", messages=tell_a_joke)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e03c11b9",
"metadata": {},
"outputs": [],
"source": [
"response = anthropic.chat.completions.create(model=\"claude-sonnet-4-5-20250929\", messages=tell_a_joke)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "markdown",
"id": "ab6ea76a",
"metadata": {},
"source": [
"## Training-time vs inference-time scaling"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "afe9e11c",
"metadata": {},
"outputs": [],
"source": [
"easy_puzzle = [\n",
"    {\"role\": \"user\", \"content\":\n",
"     \"You toss 2 coins. One of them is heads. What's the probability the other is tails? Answer with the probability only.\"},\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4a887eb3",
"metadata": {},
"outputs": [],
"source": [
"response = openai.chat.completions.create(model=\"gpt-5-nano\", messages=easy_puzzle, reasoning_effort=\"minimal\")\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f854d01",
"metadata": {},
"outputs": [],
"source": [
"response = openai.chat.completions.create(model=\"gpt-5-nano\", messages=easy_puzzle, reasoning_effort=\"low\")\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f45fc55b",
"metadata": {},
"outputs": [],
"source": [
"response = openai.chat.completions.create(model=\"gpt-5-mini\", messages=easy_puzzle, reasoning_effort=\"minimal\")\n",
"display(Markdown(response.choices[0].message.content))"
]
},
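{
"cell_type": "markdown",
"id": "b2c4d6e8",
"metadata": {},
"source": [
"To see inference-time scaling in numbers rather than vibes, here's a rough sketch of my own that asks the same puzzle at increasing `reasoning_effort` levels and prints how many reasoning tokens each answer consumed (assuming your responses include `completion_tokens_details`, which OpenAI reasoning models return)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c9d1e3f5",
"metadata": {},
"outputs": [],
"source": [
"# Same puzzle, increasing inference-time compute\n",
"# reasoning_tokens counts the hidden thinking tokens the model spent\n",
"\n",
"for effort in [\"minimal\", \"low\", \"medium\"]:\n",
"    response = openai.chat.completions.create(model=\"gpt-5-nano\", messages=easy_puzzle, reasoning_effort=effort)\n",
"    details = response.usage.completion_tokens_details\n",
"    print(f\"{effort}: {details.reasoning_tokens} reasoning tokens -> {response.choices[0].message.content}\")"
]
},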
{
"cell_type": "markdown",
"id": "ca713a5c",
"metadata": {},
"source": [
"## Testing out the best models on the planet"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df1e825b",
"metadata": {},
"outputs": [],
"source": [
"hard = \"\"\"\n",
"On a bookshelf, two volumes of Pushkin stand side by side: the first and the second.\n",
"The pages of each volume together have a thickness of 2 cm, and each cover is 2 mm thick.\n",
"A worm gnawed (perpendicular to the pages) from the first page of the first volume to the last page of the second volume.\n",
"What distance did it gnaw through?\n",
"\"\"\"\n",
"hard_puzzle = [\n",
"    {\"role\": \"user\", \"content\": hard}\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8f6a7827",
"metadata": {},
"outputs": [],
"source": [
"response = openai.chat.completions.create(model=\"gpt-5-nano\", messages=hard_puzzle, reasoning_effort=\"minimal\")\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d693ac0d",
"metadata": {},
"outputs": [],
"source": [
"response = anthropic.chat.completions.create(model=\"claude-sonnet-4-5-20250929\", messages=hard_puzzle)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7de7818f",
"metadata": {},
"outputs": [],
"source": [
"response = openai.chat.completions.create(model=\"gpt-5\", messages=hard_puzzle)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "de1dc5fa",
"metadata": {},
"outputs": [],
"source": [
"response = gemini.chat.completions.create(model=\"gemini-2.5-pro\", messages=hard_puzzle)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "markdown",
"id": "9a9faf98",
"metadata": {},
"source": [
"## A spicy challenge to test the competitive spirit"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fc1824ad",
"metadata": {},
"outputs": [],
"source": [
"dilemma_prompt = \"\"\"\n",
"You and a partner are contestants on a game show. You're each taken to separate rooms and given a choice:\n",
"Cooperate: Choose \"Share\" — if both of you choose this, you each win $1,000.\n",
"Defect: Choose \"Steal\" — if one steals and the other shares, the stealer gets $2,000 and the sharer gets nothing.\n",
"If both steal, you both get nothing.\n",
"Do you choose to Steal or Share? Pick one.\n",
"\"\"\"\n",
"\n",
"dilemma = [\n",
"    {\"role\": \"user\", \"content\": dilemma_prompt},\n",
"]\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "09807f1a",
"metadata": {},
"outputs": [],
"source": [
"response = anthropic.chat.completions.create(model=\"claude-sonnet-4-5-20250929\", messages=dilemma)\n",
"display(Markdown(response.choices[0].message.content))\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "230f49d6",
"metadata": {},
"outputs": [],
"source": [
"response = groq.chat.completions.create(model=\"openai/gpt-oss-120b\", messages=dilemma)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "421f08df",
"metadata": {},
"outputs": [],
"source": [
"response = deepseek.chat.completions.create(model=\"deepseek-reasoner\", messages=dilemma)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2599fc6e",
"metadata": {},
"outputs": [],
"source": [
"response = grok.chat.completions.create(model=\"grok-4\", messages=dilemma)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
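{
"cell_type": "markdown",
"id": "d4e6f8a0",
"metadata": {},
"source": [
"Because all of these clients share the same interface, you can line the contestants up in one loop and compare their answers side by side - a small sketch of my own, using only the clients and models already defined above."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5f7a9b1",
"metadata": {},
"outputs": [],
"source": [
"# Ask each contestant the same dilemma and show the answers together\n",
"\n",
"contestants = [\n",
"    (anthropic, \"claude-sonnet-4-5-20250929\"),\n",
"    (groq, \"openai/gpt-oss-120b\"),\n",
"    (deepseek, \"deepseek-reasoner\"),\n",
"    (grok, \"grok-4\"),\n",
"]\n",
"\n",
"for client, model in contestants:\n",
"    response = client.chat.completions.create(model=model, messages=dilemma)\n",
"    display(Markdown(f\"### {model}\\n{response.choices[0].message.content}\"))"
]
},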
{
"cell_type": "markdown",
"id": "162752e9",
"metadata": {},
"source": [
"## Going local\n",
"\n",
"Just use the OpenAI client library pointed at http://localhost:11434/v1"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ba03ee29",
"metadata": {},
"outputs": [],
"source": [
"requests.get(\"http://localhost:11434/\").content\n",
"\n",
"# If not running, run ollama serve at a command line"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f363cd6b",
"metadata": {},
"outputs": [],
"source": [
"!ollama pull llama3.2"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "96e97263",
"metadata": {},
"outputs": [],
"source": [
"# Only do this if you have a large machine - at least 16GB RAM\n",
"\n",
"!ollama pull gpt-oss:20b"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a3bfc78a",
"metadata": {},
"outputs": [],
"source": [
"response = ollama.chat.completions.create(model=\"llama3.2\", messages=easy_puzzle)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9a5527a3",
"metadata": {},
"outputs": [],
"source": [
"response = ollama.chat.completions.create(model=\"gpt-oss:20b\", messages=easy_puzzle)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
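{
"cell_type": "markdown",
"id": "f6a8b0c2",
"metadata": {},
"source": [
"Ollama's OpenAI-compatible endpoint also implements the models list, so you can check what's installed locally before calling anything - a quick sketch:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7b9c1d3",
"metadata": {},
"outputs": [],
"source": [
"# List the models your local Ollama server is serving\n",
"# (same OpenAI-compatible endpoint at localhost:11434/v1)\n",
"\n",
"for model in ollama.models.list().data:\n",
"    print(model.id)"
]
},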
{
"cell_type": "markdown",
"id": "a0628309",
"metadata": {},
"source": [
"## Gemini and Anthropic Client Libraries\n",
"\n",
"We've been going via the OpenAI Python client library, but the other providers have their own client libraries too"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0a8ab2b-6134-4104-a1bc-c3cd7ea4cd36",
"metadata": {},
"outputs": [],
"source": [
"from google import genai\n",
"\n",
"client = genai.Client()\n",
"\n",
"response = client.models.generate_content(\n",
"    model=\"gemini-2.5-flash-lite\", contents=\"Describe the color Blue to someone who's never been able to see in 1 sentence\"\n",
")\n",
"print(response.text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df7b6c63",
"metadata": {},
"outputs": [],
"source": [
"from anthropic import Anthropic\n",
"\n",
"client = Anthropic()\n",
"\n",
"response = client.messages.create(\n",
"    model=\"claude-sonnet-4-5-20250929\",\n",
"    messages=[{\"role\": \"user\", \"content\": \"Describe the color Blue to someone who's never been able to see in 1 sentence\"}],\n",
"    max_tokens=100\n",
")\n",
"print(response.content[0].text)"
]
},
{
"cell_type": "markdown",
"id": "45a9d0eb",
"metadata": {},
"source": [
"## Routers and Abstraction Layers\n",
"\n",
"Starting with the wonderful OpenRouter.ai - it can connect to all the models above!\n",
"\n",
"Visit openrouter.ai and browse the models.\n",
"\n",
"Here's one we haven't seen yet: GLM 4.5 from the Chinese startup z.ai"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9fac59dc",
"metadata": {},
"outputs": [],
"source": [
"response = openrouter.chat.completions.create(model=\"z-ai/glm-4.5\", messages=tell_a_joke)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "markdown",
"id": "b58908e6",
"metadata": {},
"source": [
"## And now a first look at the powerful, mighty (and quite heavyweight) LangChain"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "02e145ad",
"metadata": {},
"outputs": [],
"source": [
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-5-mini\")\n",
"response = llm.invoke(tell_a_joke)\n",
"\n",
"display(Markdown(response.content))"
]
},
{
"cell_type": "markdown",
"id": "92d49785",
"metadata": {},
"source": [
"## Finally - my personal fave - the wonderfully lightweight LiteLLM"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "63e42515",
"metadata": {},
"outputs": [],
"source": [
"from litellm import completion\n",
"\n",
"response = completion(model=\"openai/gpt-4.1\", messages=tell_a_joke)\n",
"reply = response.choices[0].message.content\n",
"display(Markdown(reply))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "36f787f5",
"metadata": {},
"outputs": [],
"source": [
"print(f\"Input tokens: {response.usage.prompt_tokens}\")\n",
"print(f\"Output tokens: {response.usage.completion_tokens}\")\n",
"print(f\"Total tokens: {response.usage.total_tokens}\")\n",
"print(f\"Total cost: {response._hidden_params['response_cost']*100:.4f} cents\")"
]
},
{
"cell_type": "markdown",
"id": "28126494",
"metadata": {},
"source": [
"## Now - let's use LiteLLM to illustrate a Pro-feature: prompt caching"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f8a91ef4",
"metadata": {},
"outputs": [],
"source": [
"with open(\"hamlet.txt\", \"r\", encoding=\"utf-8\") as f:\n",
"    hamlet = f.read()\n",
"\n",
"loc = hamlet.find(\"Speak, man\")\n",
"print(hamlet[loc:loc+100])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7f34f670",
"metadata": {},
"outputs": [],
"source": [
"question = [{\"role\": \"user\", \"content\": \"In Hamlet, when Laertes asks 'Where is my father?' what is the reply?\"}]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9db6c82b",
"metadata": {},
"outputs": [],
"source": [
"response = completion(model=\"gemini/gemini-2.5-flash-lite\", messages=question)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "228b7e7c",
"metadata": {},
"outputs": [],
"source": [
"print(f\"Input tokens: {response.usage.prompt_tokens}\")\n",
"print(f\"Output tokens: {response.usage.completion_tokens}\")\n",
"print(f\"Total tokens: {response.usage.total_tokens}\")\n",
"print(f\"Total cost: {response._hidden_params['response_cost']*100:.4f} cents\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "11e37e43",
"metadata": {},
"outputs": [],
"source": [
"question[0][\"content\"] += \"\\n\\nFor context, here is the entire text of Hamlet:\\n\\n\" + hamlet"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "37afb28b",
"metadata": {},
"outputs": [],
"source": [
"response = completion(model=\"gemini/gemini-2.5-flash-lite\", messages=question)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d84edecf",
"metadata": {},
"outputs": [],
"source": [
"print(f\"Input tokens: {response.usage.prompt_tokens}\")\n",
"print(f\"Output tokens: {response.usage.completion_tokens}\")\n",
"print(f\"Cached tokens: {response.usage.prompt_tokens_details.cached_tokens}\")\n",
"print(f\"Total cost: {response._hidden_params['response_cost']*100:.4f} cents\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "515d1a94",
"metadata": {},
"outputs": [],
"source": [
"response = completion(model=\"gemini/gemini-2.5-flash-lite\", messages=question)\n",
"display(Markdown(response.choices[0].message.content))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "eb5dd403",
"metadata": {},
"outputs": [],
"source": [
"print(f\"Input tokens: {response.usage.prompt_tokens}\")\n",
"print(f\"Output tokens: {response.usage.completion_tokens}\")\n",
"print(f\"Cached tokens: {response.usage.prompt_tokens_details.cached_tokens}\")\n",
"print(f\"Total cost: {response._hidden_params['response_cost']*100:.4f} cents\")"
]
},
{
"cell_type": "markdown",
"id": "00f5a3b7",
"metadata": {},
"source": [
"## Prompt Caching with OpenAI\n",
"\n",
"For OpenAI:\n",
"\n",
"https://platform.openai.com/docs/guides/prompt-caching\n",
"\n",
"> Cache hits are only possible for exact prefix matches within a prompt. To realize caching benefits, place static content like instructions and examples at the beginning of your prompt, and put variable content, such as user-specific information, at the end. This also applies to images and tools, which must be identical between requests.\n",
"\n",
"Cached input is 4X cheaper\n",
"\n",
"https://openai.com/api/pricing/"
]
},
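{
"cell_type": "markdown",
"id": "b3c5d7e9",
"metadata": {},
"source": [
"OpenAI caching is automatic for long, repeated prompt prefixes - there's nothing to switch on (OpenAI documents a minimum prompt length for caching; see the guide above). Here's a sketch of how you might verify a cache hit, reusing the big Hamlet question from above. Note this sends the whole play to OpenAI twice, so it will cost a few cents:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c4d6e8f0",
"metadata": {},
"outputs": [],
"source": [
"# Run the same long prompt twice; the second call should report cached tokens\n",
"\n",
"for attempt in range(2):\n",
"    response = openai.chat.completions.create(model=\"gpt-4.1-mini\", messages=question)\n",
"    print(f\"Attempt {attempt+1} - input: {response.usage.prompt_tokens}, cached: {response.usage.prompt_tokens_details.cached_tokens}\")"
]
},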
{
"cell_type": "markdown",
"id": "b98964f9",
"metadata": {},
"source": [
"## Prompt Caching with Anthropic\n",
"\n",
"https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching\n",
"\n",
"You have to tell Claude what you are caching\n",
"\n",
"You pay 25% MORE to \"prime\" the cache\n",
"\n",
"Then you pay 10X less for input tokens that are read from the cache.\n",
"\n",
"https://www.anthropic.com/pricing#api"
]
},
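{
"cell_type": "markdown",
"id": "c6d8e0f2",
"metadata": {},
"source": [
"A minimal sketch of what that looks like with the Anthropic client library, reusing the Hamlet text from above: the `cache_control` marker goes on the block you want cached, and the usage object reports cache writes and reads. Check the docs page above for minimum cacheable lengths and TTLs."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d7e9f1a3",
"metadata": {},
"outputs": [],
"source": [
"from anthropic import Anthropic\n",
"\n",
"claude_client = Anthropic()\n",
"\n",
"response = claude_client.messages.create(\n",
"    model=\"claude-sonnet-4-5-20250929\",\n",
"    max_tokens=200,\n",
"    system=[\n",
"        # Mark the big static block as cacheable; the first call pays the 25% write surcharge\n",
"        {\"type\": \"text\", \"text\": \"You answer questions about this play:\\n\\n\" + hamlet,\n",
"         \"cache_control\": {\"type\": \"ephemeral\"}}\n",
"    ],\n",
"    messages=[{\"role\": \"user\", \"content\": \"When Laertes asks 'Where is my father?' what is the reply?\"}],\n",
")\n",
"print(f\"Cache write: {response.usage.cache_creation_input_tokens}, cache read: {response.usage.cache_read_input_tokens}\")"
]
},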
{
"cell_type": "markdown",
"id": "67d960dd",
"metadata": {},
"source": [
"## Gemini supports both 'implicit' and 'explicit' prompt caching\n",
"\n",
"https://ai.google.dev/gemini-api/docs/caching?lang=python"
]
},
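{
"cell_type": "markdown",
"id": "e8f0a2b4",
"metadata": {},
"source": [
"Implicit caching is what we just saw via LiteLLM: repeat a long prefix and Gemini discounts it automatically. Explicit caching means creating a named cache yourself. The sketch below is adapted loosely from the docs page above - treat the exact config fields and supported models as assumptions and check that page before relying on it."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9a1b3c5",
"metadata": {},
"outputs": [],
"source": [
"from google import genai\n",
"from google.genai import types\n",
"\n",
"gemini_client = genai.Client()\n",
"\n",
"# Create an explicit cache holding the play, with a short time-to-live (sketch - see docs)\n",
"cache = gemini_client.caches.create(\n",
"    model=\"gemini-2.5-flash\",\n",
"    config=types.CreateCachedContentConfig(contents=[hamlet], ttl=\"300s\"),\n",
")\n",
"\n",
"# Then reference the cache by name in subsequent calls\n",
"response = gemini_client.models.generate_content(\n",
"    model=\"gemini-2.5-flash\",\n",
"    contents=\"When Laertes asks 'Where is my father?' what is the reply?\",\n",
"    config=types.GenerateContentConfig(cached_content=cache.name),\n",
")\n",
"print(response.text)"
]
},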
{
"cell_type": "markdown",
"id": "f6e09351-1fbe-422f-8b25-f50826ab4c5f",
"metadata": {},
"source": [
"## And now for some fun - an adversarial conversation between chatbots...\n",
"\n",
"You're already familiar with prompts being organized into lists like:\n",
"\n",
"```\n",
"[\n",
"    {\"role\": \"system\", \"content\": \"system message here\"},\n",
"    {\"role\": \"user\", \"content\": \"user prompt here\"}\n",
"]\n",
"```\n",
"\n",
"In fact this structure can be used to reflect a longer conversation history:\n",
"\n",
"```\n",
"[\n",
"    {\"role\": \"system\", \"content\": \"system message here\"},\n",
"    {\"role\": \"user\", \"content\": \"first user prompt here\"},\n",
"    {\"role\": \"assistant\", \"content\": \"the assistant's response\"},\n",
"    {\"role\": \"user\", \"content\": \"the new user prompt\"},\n",
"]\n",
"```\n",
"\n",
"And we can use this approach to engage in a longer interaction with history."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bcb54183-45d3-4d08-b5b6-55e380dfdf1b",
"metadata": {},
"outputs": [],
"source": [
"# Let's make a conversation between GPT-4.1-mini and Claude-3.5-haiku\n",
"# We're using cheap versions of models so the costs will be minimal\n",
"\n",
"gpt_model = \"gpt-4.1-mini\"\n",
"claude_model = \"claude-3-5-haiku-latest\"\n",
"\n",
"gpt_system = \"You are a chatbot who is very argumentative; \\\n",
"you disagree with anything in the conversation and you challenge everything, in a snarky way.\"\n",
"\n",
"claude_system = \"You are a very polite, courteous chatbot. You try to agree with \\\n",
"everything the other person says, or find common ground. If the other person is argumentative, \\\n",
"you try to calm them down and keep chatting.\"\n",
"\n",
"gpt_messages = [\"Hi there\"]\n",
"claude_messages = [\"Hi\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1df47dc7-b445-4852-b21b-59f0e6c2030f",
"metadata": {},
"outputs": [],
"source": [
"def call_gpt():\n",
"    messages = [{\"role\": \"system\", \"content\": gpt_system}]\n",
"    for gpt, claude in zip(gpt_messages, claude_messages):\n",
"        messages.append({\"role\": \"assistant\", \"content\": gpt})\n",
"        messages.append({\"role\": \"user\", \"content\": claude})\n",
"    response = openai.chat.completions.create(model=gpt_model, messages=messages)\n",
"    return response.choices[0].message.content"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9dc6e913-02be-4eb6-9581-ad4b2cffa606",
"metadata": {},
"outputs": [],
"source": [
"call_gpt()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7d2ed227-48c9-4cad-b146-2c4ecbac9690",
"metadata": {},
"outputs": [],
"source": [
"def call_claude():\n",
"    messages = [{\"role\": \"system\", \"content\": claude_system}]\n",
"    for gpt, claude_message in zip(gpt_messages, claude_messages):\n",
"        messages.append({\"role\": \"user\", \"content\": gpt})\n",
"        messages.append({\"role\": \"assistant\", \"content\": claude_message})\n",
"    messages.append({\"role\": \"user\", \"content\": gpt_messages[-1]})\n",
"    response = anthropic.chat.completions.create(model=claude_model, messages=messages)\n",
"    return response.choices[0].message.content"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "01395200-8ae9-41f8-9a04-701624d3fd26",
"metadata": {},
"outputs": [],
"source": [
"call_claude()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "08c2279e-62b0-4671-9590-c82eb8d1e1ae",
"metadata": {},
"outputs": [],
"source": [
"call_gpt()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0275b97f-7f90-4696-bbf5-b6642bd53cbd",
"metadata": {},
"outputs": [],
"source": [
"gpt_messages = [\"Hi there\"]\n",
"claude_messages = [\"Hi\"]\n",
"\n",
"display(Markdown(f\"### GPT:\\n{gpt_messages[0]}\\n\"))\n",
"display(Markdown(f\"### Claude:\\n{claude_messages[0]}\\n\"))\n",
"\n",
"for i in range(5):\n",
"    gpt_next = call_gpt()\n",
"    display(Markdown(f\"### GPT:\\n{gpt_next}\\n\"))\n",
"    gpt_messages.append(gpt_next)\n",
"\n",
"    claude_next = call_claude()\n",
"    display(Markdown(f\"### Claude:\\n{claude_next}\\n\"))\n",
"    claude_messages.append(claude_next)"
]
},
{
"cell_type": "markdown",
"id": "1d10e705-db48-4290-9dc8-9efdb4e31323",
"metadata": {},
"source": [
"<table style=\"margin: 0; text-align: left;\">\n",
" <tr>\n",
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
" <img src=\"../assets/important.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
" </td>\n",
" <td>\n",
" <h2 style=\"color:#900;\">Before you continue</h2>\n",
" <span style=\"color:#900;\">\n",
" Be sure you understand how the conversation above is working, and in particular how the <code>messages</code> list is being populated. Add print statements as needed. Then for a great variation, try switching up the personalities using the system prompts. Perhaps one can be pessimistic, and one optimistic?<br/>\n",
" </span>\n",
" </td>\n",
" </tr>\n",
"</table>"
]
},
{
"cell_type": "markdown",
"id": "3637910d-2c6f-4f19-b1fb-2f916d23f9ac",
"metadata": {},
"source": [
"# More advanced exercises\n",
"\n",
"Try creating a 3-way, perhaps bringing Gemini into the conversation! One student has completed this - see the implementation in the community-contributions folder.\n",
"\n",
"The most reliable way to do this involves thinking a bit differently about your prompts: just 1 system prompt and 1 user prompt each time, and in the user prompt, include the full conversation so far.\n",
"\n",
"Something like:\n",
"\n",
"```python\n",
"system_prompt = \"\"\"\n",
"You are Alex, a chatbot who is very argumentative; you disagree with anything in the conversation and you challenge everything, in a snarky way.\n",
"You are in a conversation with Blake and Charlie.\n",
"\"\"\"\n",
"\n",
"user_prompt = f\"\"\"\n",
"You are Alex, in conversation with Blake and Charlie.\n",
"The conversation so far is as follows:\n",
"{conversation}\n",
"Now with this, respond with what you would like to say next, as Alex.\n",
"\"\"\"\n",
"```\n",
"\n",
"Try doing this yourself before you look at the solutions. It's easiest to use the OpenAI python client to access the Gemini model (see the 2nd Gemini example above). A skeleton to get you started follows below.\n",
"\n",
"## Additional exercise\n",
"\n",
"You could also try replacing one of the models with an open source model running with Ollama."
]
},
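{
"cell_type": "markdown",
"id": "a9b1c3d5",
"metadata": {},
"source": [
"Here's a hedged skeleton of the transcript-based approach, without giving the solution away. The names and the `next_turn` helper are mine, not from the community contributions: keep one running transcript string, and ask each bot in turn to extend it."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b0c2d4e6",
"metadata": {},
"outputs": [],
"source": [
"# Skeleton only - write the three system prompts and pick the clients/models yourself\n",
"# 'transcript' is the single source of truth that every bot sees in its user prompt\n",
"\n",
"transcript = \"Alex: Hi there\\nBlake: Hi\\nCharlie: Hello, both\\n\"\n",
"\n",
"def next_turn(client, model, name, system_prompt):\n",
"    user_prompt = f\"The conversation so far is as follows:\\n{transcript}\\nNow respond with what you would like to say next, as {name}.\"\n",
"    messages = [{\"role\": \"system\", \"content\": system_prompt}, {\"role\": \"user\", \"content\": user_prompt}]\n",
"    response = client.chat.completions.create(model=model, messages=messages)\n",
"    return f\"{name}: {response.choices[0].message.content}\\n\"\n",
"\n",
"# e.g. transcript += next_turn(openai, \"gpt-4.1-mini\", \"Alex\", \"You are Alex, ...\")"
]
},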
{
"cell_type": "markdown",
"id": "446c81e3-b67e-4cd9-8113-bc3092b93063",
"metadata": {},
"source": [
"<table style=\"margin: 0; text-align: left;\">\n",
" <tr>\n",
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
" <img src=\"../assets/business.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
" </td>\n",
" <td>\n",
" <h2 style=\"color:#181;\">Business relevance</h2>\n",
" <span style=\"color:#181;\">This structure of a conversation, as a list of messages, is fundamental to the way we build conversational AI assistants and how they are able to keep the context during a conversation. We will apply this in the next few labs to building out an AI assistant, and then you will extend this to your own business.</span>\n",
" </td>\n",
" </tr>\n",
"</table>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c23224f6-7008-44ed-a57f-718975f4e291",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}