Launching refreshed version of LLM Engineering weeks 1-4 - see README
@@ -1,380 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "5c291475-8c7c-461c-9b12-545a887b2432",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Jupyter Lab\n",
|
||||
"\n",
|
||||
"## A Quick Start Guide\n",
|
||||
"\n",
|
||||
"Welcome to the wonderful world of Jupyter Lab! \n",
|
||||
"This is a Data Science playground where you can easily write code and investigate the results. It's an ideal environment for: \n",
|
||||
"- Research & Development\n",
|
||||
"- Prototyping\n",
|
||||
"- Learning (that's us!)\n",
|
||||
"\n",
|
||||
"It's not typically used for shipping production code, and in Week 8 we'll explore the bridge between Jupyter and Python code.\n",
|
||||
"\n",
|
||||
"A file in Jupyter Lab, like this one, is called a **Notebook**.\n",
|
||||
"\n",
|
||||
"A long time ago, Jupyter used to be called \"IPython\", and so the extension of notebooks is \".ipynb\", which stands for \"IPython Notebook\".\n",
|
||||
"\n",
|
||||
"On the left is a File Browser that lets you navigate around the directories and choose different notebooks. But you probably know that already, or you wouldn't have got here!\n",
|
||||
"\n",
|
||||
"The notebook consists of a series of square boxes called \"cells\". Some of them contain text, like this cell, and some of them contain code, like the cell below.\n",
|
||||
"\n",
|
||||
"Click in a cell with code and press `Shift + Return` (or `Shift + Enter`) to run the code and print the output.\n",
|
||||
"\n",
|
||||
"Do that now for the cell below this:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "33d37cd8-55c9-4e03-868c-34aa9cab2c80",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Click anywhere in this cell and press Shift + Return\n",
|
||||
"\n",
|
||||
"2 + 2"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9e95df7b-55c6-4204-b8f9-cae83360fc23",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Congrats!\n",
|
||||
"\n",
|
||||
"Now run the next cell, which sets a value, then run the cells after it to print and use the value"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "585eb9c1-85ee-4c27-8dc2-b4d8d022eda0",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Set a value for a variable\n",
|
||||
"\n",
|
||||
"favorite_fruit = \"bananas\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "07792faa-761d-46cb-b9b7-2bbf70bb1628",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# The result of the last statement is shown after you run it\n",
|
||||
"\n",
|
||||
"favorite_fruit"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "a067d2b1-53d5-4aeb-8a3c-574d39ff654a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Use the variable\n",
|
||||
"\n",
|
||||
"print(f\"My favorite fruit is {favorite_fruit}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "4c5a4e60-b7f4-4953-9e80-6d84ba4664ad",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Now change the variable\n",
|
||||
"\n",
|
||||
"favorite_fruit = f\"anything but {favorite_fruit}\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9442d5c9-f57d-4839-b0af-dce58646c04f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Now go back and rerun the cell with the print statement, two cells back\n",
|
||||
"\n",
|
||||
"See how it prints something different, even though favorite_fruit was changed further down in the notebook? \n",
|
||||
"\n",
|
||||
"The order that code appears in the notebook doesn't matter. What matters is the order that the code is **executed**. There's a Python process sitting behind this notebook in which the variables are being changed.\n",
|
||||
"\n",
|
||||
"This catches some people out when they first use Jupyter."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "8e5ec81d-7c5b-4025-bd2e-468d67b581b6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Then run this cell twice, and see if you understand what's going on\n",
|
||||
"\n",
|
||||
"print(f\"My favorite fruit is {favorite_fruit}\")\n",
|
||||
"\n",
|
||||
"favorite_fruit = \"apples\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a29dab2d-bab9-4a54-8504-05e62594cc6f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Explaining the 'kernel'\n",
|
||||
"\n",
|
||||
"Sitting behind this notebook is a Python process which executes each cell when you run it. That Python process is known as the Kernel. Each notebook has its own separate Kernel.\n",
|
||||
"\n",
|
||||
"You can go to the Kernel menu and select \"Restart Kernel\".\n",
|
||||
"\n",
|
||||
"If you then try to run the next cell, you'll get an error, because favorite_fruit is no longer defined. You'll need to run the cells from the top of the notebook again. Then the next cell should run fine."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "84b1e410-5eda-4e2c-97ce-4eebcff816c5",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"print(f\"My favorite fruit is {favorite_fruit}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4d4188fc-d9cc-42be-8b4e-ae8630456764",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Adding and moving cells\n",
|
||||
"\n",
|
||||
"Click in this cell, then click the \\[+\\] button in the toolbar above to create a new cell immediately below this one. Copy and paste in the code from the prior cell, then run it! There are also icons in the top right of the selected cell to delete it (bin), duplicate it, and move it up and down.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "ce258424-40c3-49a7-9462-e6fa25014b03",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "30e71f50-8f01-470a-9d7a-b82a6cef4236",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Cell output\n",
|
||||
"\n",
|
||||
"When you execute a cell, the standard output and the result of the last statement are written to the area immediately under the code, known as the 'cell output'. When you save a Notebook from the file menu (or command+S), the output is also saved, making it a useful record of what happened.\n",
|
||||
"\n",
|
||||
"You can clean this up by going to Edit menu >> Clear Outputs of All Cells, or Kernel menu >> Restart Kernel and Clear Outputs of All Cells."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "a4d021e2-c284-411f-8ab1-030530cfbe72",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"spams = [\"spam\"] * 1000\n",
|
||||
"print(spams)\n",
|
||||
"\n",
|
||||
"# Might be worth clearing output after running this!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "eac060f2-7a71-46e7-8235-b6ad0a76f5f8",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Using markdown\n",
|
||||
"\n",
|
||||
"So what's going on with these areas with writing in them, like this one? Well, there's actually a different kind of cell called a 'Markdown' cell for adding explanations like this. Click the + button to add a cell. Then in the toolbar, click where it says 'Code' and change it to 'Markdown'.\n",
|
||||
"\n",
|
||||
"Add some comments using Markdown format, perhaps copying and pasting from here:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"# This is a heading\n",
|
||||
"## This is a sub-head\n",
|
||||
"### And a sub-sub-head\n",
|
||||
"\n",
|
||||
"I like Jupyter Lab because it's\n",
|
||||
"- Easy\n",
|
||||
"- Flexible\n",
|
||||
"- Satisfying\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"Then simply press Shift+Return in the cell to turn it into formatted text.\n",
|
||||
"Click in the cell and press the Bin icon if you want to remove it."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "e1586320-c90f-4f22-8b39-df6865484950",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1330c83c-67ac-4ca0-ac92-a71699e0c31b",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# The exclamation point\n",
|
||||
"\n",
|
||||
"There's a super useful feature of Jupyter Lab: you can type a command with a ! in front of it in a code cell, like:\n",
|
||||
"\n",
|
||||
"!pip install \\[some_package\\]\n",
|
||||
"\n",
|
||||
"And it will run it at the command line (as if in Windows PowerShell or Mac Terminal) and print the result"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "82042fc5-a907-4381-a4b8-eb9386df19cd",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# List the current directory\n",
|
||||
"\n",
|
||||
"!ls"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "4fc3e3da-8a55-40cc-9706-48bf12a0e20e",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# ping cnn.com - press the stop button in the toolbar when you're bored\n",
|
||||
"\n",
|
||||
"!ping cnn.com"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "a58e9462-89a2-4b4f-b4aa-51c4bd9f796b",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# This is a useful command that ensures your Anaconda environment \n",
|
||||
"# is up to date with any new upgrades to packages;\n",
|
||||
"# But it might take a minute and will print a lot to output\n",
|
||||
"\n",
|
||||
"!conda env update -f ../environment.yml"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4688baaf-a72c-41b5-90b6-474cb24790a7",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Minor things we encounter on the course\n",
|
||||
"\n",
|
||||
"This isn't necessarily a feature of Jupyter, but it's a nice package to know about that is useful in Jupyter Lab, and I use it in the course.\n",
|
||||
"\n",
|
||||
"The package `tqdm` will print a nice progress bar if you wrap it around any iterable."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "2646a4e5-3c23-4aee-a34d-d623815187d2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Here's some code with no progress bar\n",
|
||||
"# It will take 10 seconds while you wonder what's happening...\n",
|
||||
"\n",
|
||||
"import time\n",
|
||||
"\n",
|
||||
"spams = [\"spam\"] * 1000\n",
|
||||
"\n",
|
||||
"for spam in spams:\n",
|
||||
" time.sleep(0.01)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "6e96be3d-fa82-42a3-a8aa-b81dd20563a5",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# And now, with a nice little progress bar:\n",
|
||||
"\n",
|
||||
"import time\n",
|
||||
"from tqdm import tqdm\n",
|
||||
"\n",
|
||||
"spams = [\"spam\"] * 1000\n",
|
||||
"\n",
|
||||
"for spam in tqdm(spams):\n",
|
||||
" time.sleep(0.01)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "63c788dd-4618-4bb4-a5ce-204411a38ade",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# On a different topic, here's a useful way to print output in markdown\n",
|
||||
"\n",
|
||||
"from IPython.display import Markdown, display\n",
|
||||
"\n",
|
||||
"display(Markdown(\"# This is a big heading!\\n\\n- And this is a bullet-point\\n- So is this\\n- Me, too!\"))\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9d14c1fb-3321-4387-b6ca-9af27676f980",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# That's it! You're up to speed on Jupyter Lab.\n",
|
||||
"\n",
|
||||
"## Want to be even more advanced?\n",
|
||||
"\n",
|
||||
"If you want to become a pro at Jupyter Lab, you can read their tutorial [here](https://jupyterlab.readthedocs.io/en/latest/). But this isn't required for our course; just a good technique for hitting Shift + Return and enjoying the result!"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.11"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
@@ -1,486 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "5c291475-8c7c-461c-9b12-545a887b2432",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Intermediate Level Python\n",
|
||||
"\n",
|
||||
"## Getting you up to speed\n",
|
||||
"\n",
|
||||
"This course assumes that you're at an intermediate level of Python. For example, you should have a decent idea what something like this might do:\n",
|
||||
"\n",
|
||||
"`yield from {book.get(\"author\") for book in books if book.get(\"author\")}`\n",
|
||||
"\n",
|
||||
"If not - then you've come to the right place! Welcome to the crash course in intermediate-level Python. The best way to learn is by doing!\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "542f0577-a826-4613-a5d7-4170e9666d04",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## First: if you need a refresher on the foundations\n",
|
||||
"\n",
|
||||
"I'm going to defer to an AI friend for this, because these explanations are so well written with great examples. Copy and paste the code examples into a new cell to give them a try. Pick whichever section(s) you'd like to brush up on.\n",
|
||||
"\n",
|
||||
"**Python imports:** \n",
|
||||
"https://chatgpt.com/share/672f9f31-8114-8012-be09-29ef0d0140fb\n",
|
||||
"\n",
|
||||
"**Python functions** including default arguments: \n",
|
||||
"https://chatgpt.com/share/672f9f99-7060-8012-bfec-46d4cf77d672\n",
|
||||
"\n",
|
||||
"**Python strings**, including slicing, split/join, replace and literals: \n",
|
||||
"https://chatgpt.com/share/672fb526-0aa0-8012-9e00-ad1687c04518\n",
|
||||
"\n",
|
||||
"**Python f-strings** including number and date formatting: \n",
|
||||
"https://chatgpt.com/share/672fa125-0de0-8012-8e35-27918cbb481c\n",
|
||||
"\n",
|
||||
"**Python lists, dicts and sets**, including the `get()` method: \n",
|
||||
"https://chatgpt.com/share/672fa225-3f04-8012-91af-f9c95287da8d\n",
|
||||
"\n",
|
||||
"**Python files** including modes, encoding, context managers, Path, glob.glob: \n",
|
||||
"https://chatgpt.com/share/673b53b2-6d5c-8012-a344-221056c2f960\n",
|
||||
"\n",
|
||||
"**Python classes:** \n",
|
||||
"https://chatgpt.com/share/672fa07a-1014-8012-b2ea-6dc679552715\n",
|
||||
"\n",
|
||||
"**Pickling Python objects and converting to JSON:** \n",
|
||||
"https://chatgpt.com/share/673b553e-9d0c-8012-9919-f3bb5aa23e31"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f9e0f8e1-09b3-478b-ada7-c8c35003929b",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## With this in mind - understanding NameErrors in Python\n",
|
||||
"\n",
|
||||
"It's quite common to hit a NameError in Python. With foundational knowledge, you should always feel equipped to debug a NameError and get to the bottom of it.\n",
|
||||
"\n",
|
||||
"If you're unsure how to fix a NameError, please see this [initial guide](https://chatgpt.com/share/67958312-ada0-8012-a1d3-62b3a5fcbbfc) and this [second guide with exercises](https://chatgpt.com/share/67a57e0b-0194-8012-bb50-8ea76c5995b8), and work through them both until you have high confidence.\n",
|
||||
"\n",
|
||||
"There's some repetition here, so feel free to skip it if you're already confident.\n",
|
||||
"\n",
|
||||
"## And now, on to the code!"
|
||||
]
|
||||
},
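{
"cell_type": "code",
"execution_count": null,
"id": "b7f1c2d3-4e5f-6a7b-8c9d-0e1f2a3b4c5d",
"metadata": {},
"outputs": [],
"source": [
"# A quick illustration (a sketch of my own, not from the guides above):\n",
"# referencing a name before it's defined raises a NameError,\n",
"# and defining the name first is the fix\n",
"\n",
"try:\n",
"    print(not_yet_defined)\n",
"except NameError as e:\n",
"    print(f\"Got a NameError: {e}\")\n",
"\n",
"not_yet_defined = \"now it exists\"\n",
"print(not_yet_defined)"
]
},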
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "5802e2f0-0ea0-4237-bbb7-f375a34260f0",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# First let's create some things:\n",
|
||||
"\n",
|
||||
"fruits = [\"Apples\", \"Bananas\", \"Pears\"]\n",
|
||||
"\n",
|
||||
"book1 = {\"title\": \"Great Expectations\", \"author\": \"Charles Dickens\"}\n",
|
||||
"book2 = {\"title\": \"Bleak House\", \"author\": \"Charles Dickens\"}\n",
|
||||
"book3 = {\"title\": \"A Book By No Author\"}\n",
|
||||
"book4 = {\"title\": \"Moby Dick\", \"author\": \"Herman Melville\"}\n",
|
||||
"\n",
|
||||
"books = [book1, book2, book3, book4]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9b941e6a-3658-4144-a8d4-72f5e72f3707",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Part 1: List and dict comprehensions"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "61992bb8-735d-4dad-8747-8c10b63aec82",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Simple enough to start\n",
|
||||
"\n",
|
||||
"for fruit in fruits:\n",
|
||||
" print(fruit)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "c89c3842-9b74-47fa-8424-0fcb08e4177c",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Let's make a new version of fruits\n",
|
||||
"\n",
|
||||
"fruits_shouted = []\n",
|
||||
"for fruit in fruits:\n",
|
||||
" fruits_shouted.append(fruit.upper())\n",
|
||||
"\n",
|
||||
"fruits_shouted"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "4ec13b3a-9545-44f1-874a-2910a0663560",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# You probably already know this\n",
|
||||
"# There's a nice Python construct called \"list comprehension\" that does this:\n",
|
||||
"\n",
|
||||
"fruits_shouted2 = [fruit.upper() for fruit in fruits]\n",
|
||||
"fruits_shouted2"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "ecc08c3c-181d-4b64-a3e1-b0ccffc6c0cd",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# But you may not know that you can do this to create dictionaries, too:\n",
|
||||
"\n",
|
||||
"fruit_mapping = {fruit: fruit.upper() for fruit in fruits}\n",
|
||||
"fruit_mapping"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "500c2406-00d2-4793-b57b-f49b612760c8",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# You can also use an if clause to filter the results\n",
|
||||
"\n",
|
||||
"fruits_with_longer_names_shouted = [fruit.upper() for fruit in fruits if len(fruit)>5]\n",
|
||||
"fruits_with_longer_names_shouted"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "38c11c34-d71e-45ba-945b-a3d37dc29793",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"fruit_mapping_unless_starts_with_a = {fruit: fruit.upper() for fruit in fruits if not fruit.startswith('A')}\n",
|
||||
"fruit_mapping_unless_starts_with_a"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "5c97d8e8-31de-4afa-973e-28d8e5cab749",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Another comprehension\n",
|
||||
"\n",
|
||||
"[book['title'] for book in books]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "50be0edc-a4cd-493f-a680-06080bb497b4",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# This code will fail with an error because one of our books doesn't have an author\n",
|
||||
"\n",
|
||||
"[book['author'] for book in books]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "53794083-cc09-4edb-b448-2ffb7e8495c2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# But this will work, because get() returns None\n",
|
||||
"\n",
|
||||
"[book.get('author') for book in books]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "b8e4b859-24f8-4016-8d74-c2cef226d049",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# And this variation will filter out the None\n",
|
||||
"\n",
|
||||
"[book.get('author') for book in books if book.get('author')]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "c44bb999-52b4-4dee-810b-8a400db8f25f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# And this version will convert it into a set, removing duplicates\n",
|
||||
"\n",
|
||||
"set([book.get('author') for book in books if book.get('author')])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "80a65156-6192-4bb4-b4e6-df3fdc933891",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# And finally, this version is even nicer\n",
|
||||
"# Curly braces create a set, so this is a set comprehension\n",
|
||||
"\n",
|
||||
"{book.get('author') for book in books if book.get('author')}"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c100e5db-5438-4715-921c-3f7152f83f4a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Part 2: Generators\n",
|
||||
"\n",
|
||||
"We use Generators in the course because AI models can stream back results.\n",
|
||||
"\n",
|
||||
"If you've not used Generators before, please start with this excellent intro from ChatGPT:\n",
|
||||
"\n",
|
||||
"https://chatgpt.com/share/672faa6e-7dd0-8012-aae5-44fc0d0ec218\n",
|
||||
"\n",
|
||||
"Try pasting some of its examples into a cell."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "1efc26fa-9144-4352-9a17-dfec1d246aad",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# First define a generator; it looks like a function, but it has yield instead of return\n",
|
||||
"\n",
|
||||
"import time\n",
|
||||
"\n",
|
||||
"def come_up_with_fruit_names():\n",
|
||||
" for fruit in fruits:\n",
|
||||
" time.sleep(1) # thinking of a fruit\n",
|
||||
" yield fruit"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "eac338bb-285c-45c8-8a3e-dbfc41409ca3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Then use it\n",
|
||||
"\n",
|
||||
"for fruit in come_up_with_fruit_names():\n",
|
||||
" print(fruit)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "f6880578-a3de-4502-952a-4572b95eb9ff",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Here's another one\n",
|
||||
"\n",
|
||||
"def authors_generator():\n",
|
||||
" for book in books:\n",
|
||||
" if book.get(\"author\"):\n",
|
||||
" yield book.get(\"author\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "9e316f02-f87f-441d-a01f-024ade949607",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Use it\n",
|
||||
"\n",
|
||||
"for author in authors_generator():\n",
|
||||
" print(author)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "7535c9d0-410e-4e56-a86c-ae6c0e16053f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Here's the same thing written with a list comprehension\n",
|
||||
"\n",
|
||||
"def authors_generator():\n",
|
||||
" for author in [book.get(\"author\") for book in books if book.get(\"author\")]:\n",
|
||||
" yield author"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "dad34494-0f6c-4edb-b03f-b8d49ee186f2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Use it\n",
|
||||
"\n",
|
||||
"for author in authors_generator():\n",
|
||||
" print(author)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "abeb7e61-d8aa-4af0-b05a-ae17323e678c",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Here's a nice shortcut\n",
|
||||
"# You can use \"yield from\" to yield each item of an iterable\n",
|
||||
"\n",
|
||||
"def authors_generator():\n",
|
||||
" yield from [book.get(\"author\") for book in books if book.get(\"author\")]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "05b0cb43-aa83-4762-a797-d3beb0f22c44",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Use it\n",
|
||||
"\n",
|
||||
"for author in authors_generator():\n",
|
||||
" print(author)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "fdfea58e-d809-4dd4-b7b0-c26427f8be55",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# And finally - we can replace the list comprehension with a set comprehension\n",
|
||||
"\n",
|
||||
"def unique_authors_generator():\n",
|
||||
" yield from {book.get(\"author\") for book in books if book.get(\"author\")}"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "3e821d08-97be-4db9-9a5b-ce5dced3eff8",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Use it\n",
|
||||
"\n",
|
||||
"for author in unique_authors_generator():\n",
|
||||
" print(author)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "905ba603-15d8-4d01-9a79-60ec293d7ca1",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# And for some fun - press the stop button in the toolbar when bored!\n",
|
||||
"# It's like we've made our own Large Language Model... although not particularly large..\n",
|
||||
"# See if you understand why it prints a letter at a time, instead of a word at a time. If you're unsure, try removing the keyword \"from\" everywhere in the code.\n",
|
||||
"\n",
|
||||
"import random\n",
|
||||
"import time\n",
|
||||
"\n",
|
||||
"pronouns = [\"I\", \"You\", \"We\", \"They\"]\n",
|
||||
"verbs = [\"eat\", \"detest\", \"bathe in\", \"deny the existence of\", \"resent\", \"pontificate about\", \"juggle\", \"impersonate\", \"worship\", \"misplace\", \"conspire with\", \"philosophize about\", \"tap dance on\", \"dramatically renounce\", \"secretly collect\"]\n",
|
||||
"adjectives = [\"turquoise\", \"smelly\", \"arrogant\", \"festering\", \"pleasing\", \"whimsical\", \"disheveled\", \"pretentious\", \"wobbly\", \"melodramatic\", \"pompous\", \"fluorescent\", \"bewildered\", \"suspicious\", \"overripe\"]\n",
|
||||
"nouns = [\"turnips\", \"rodents\", \"eels\", \"walruses\", \"kumquats\", \"monocles\", \"spreadsheets\", \"bagpipes\", \"wombats\", \"accordions\", \"mustaches\", \"calculators\", \"jellyfish\", \"thermostats\"]\n",
|
||||
"\n",
|
||||
"def infinite_random_sentences():\n",
|
||||
" while True:\n",
|
||||
" yield from random.choice(pronouns)\n",
|
||||
" yield \" \"\n",
|
||||
" yield from random.choice(verbs)\n",
|
||||
" yield \" \"\n",
|
||||
" yield from random.choice(adjectives)\n",
|
||||
" yield \" \"\n",
|
||||
" yield from random.choice(nouns)\n",
|
||||
" yield \". \"\n",
|
||||
"\n",
|
||||
"for letter in infinite_random_sentences():\n",
|
||||
" print(letter, end=\"\", flush=True)\n",
|
||||
" time.sleep(0.02)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "04832ea2-2447-4473-a449-104f80e24d85",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Exercise\n",
|
||||
"\n",
|
||||
"Write some Python classes for the books example.\n",
|
||||
"\n",
|
||||
"Write a Book class with a title and author. Include a method has_author()\n",
|
||||
"\n",
|
||||
"Write a BookShelf class with a list of books. Include a generator method unique_authors()"
|
||||
]
|
||||
},
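{
"cell_type": "code",
"execution_count": null,
"id": "d4e5f6a7-b8c9-4d0e-9f1a-2b3c4d5e6f70",
"metadata": {},
"outputs": [],
"source": [
"# One possible solution sketch for the exercise above - yours may differ!\n",
"\n",
"class Book:\n",
"    def __init__(self, title, author=None):\n",
"        self.title = title\n",
"        self.author = author\n",
"\n",
"    def has_author(self):\n",
"        return self.author is not None\n",
"\n",
"class BookShelf:\n",
"    def __init__(self, books):\n",
"        self.books = books\n",
"\n",
"    def unique_authors(self):\n",
"        yield from {book.author for book in self.books if book.has_author()}\n",
"\n",
"shelf = BookShelf([Book(b[\"title\"], b.get(\"author\")) for b in books])\n",
"for author in shelf.unique_authors():\n",
"    print(author)"
]
},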
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "35760406-fe6c-41f9-b0c0-3e8cf73aafd0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Finally\n",
|
||||
"\n",
|
||||
"Here are some intermediate-level details of classes from our AI friend, including use of type hints, inheritance and class methods. This includes a Book example.\n",
|
||||
"\n",
|
||||
"https://chatgpt.com/share/67348aca-65fc-8012-a4a9-fd1b8f04ba59"
|
||||
]
|
||||
}
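,
{
"cell_type": "code",
"execution_count": null,
"id": "e5f6a7b8-c9d0-4e1f-8a2b-3c4d5e6f7a81",
"metadata": {},
"outputs": [],
"source": [
"# A small sketch of my own (not taken from the linked guide) showing\n",
"# type hints, inheritance and a class method, using the Book example\n",
"\n",
"from typing import Optional\n",
"\n",
"class Book:\n",
"    def __init__(self, title: str, author: Optional[str] = None):\n",
"        self.title = title\n",
"        self.author = author\n",
"\n",
"    @classmethod\n",
"    def from_dict(cls, data: dict) -> \"Book\":\n",
"        return cls(data[\"title\"], data.get(\"author\"))\n",
"\n",
"class AudioBook(Book):\n",
"    def __init__(self, title: str, author: Optional[str] = None, narrator: str = \"Unknown\"):\n",
"        super().__init__(title, author)\n",
"        self.narrator = narrator\n",
"\n",
"print(Book.from_dict({\"title\": \"Moby Dick\", \"author\": \"Herman Melville\"}).author)"
]
}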
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.11"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
235
week1/day1.ipynb
@@ -8,64 +8,31 @@
|
||||
"# YOUR FIRST LAB\n",
|
||||
"### Please read this section. This is valuable to get you prepared, even if it's a long read -- it's important stuff.\n",
|
||||
"\n",
|
||||
"## Your first Frontier LLM Project\n",
|
||||
"Be sure to read the README.md first!\n",
|
||||
"\n",
|
||||
"Let's build a useful LLM solution - in a matter of minutes.\n",
|
||||
"## Your first Frontier LLM Project\n",
|
||||
"\n",
|
||||
"By the end of this course, you will have built an autonomous Agentic AI solution with 7 agents that collaborate to solve a business problem. All in good time! We will start with something smaller...\n",
|
||||
"\n",
|
||||
"Our goal is to code a new kind of Web Browser. Give it a URL, and it will respond with a summary. The Reader's Digest of the internet!!\n",
|
||||
"\n",
|
||||
"Before starting, you should have completed the setup for [PC](../SETUP-PC.md) or [Mac](../SETUP-mac.md) and you hopefully launched this jupyter lab from within the project root directory, with your environment activated.\n",
|
||||
"Before starting, you should have completed the setup linked in the README.\n",
|
||||
"\n",
|
||||
"## If you're new to Jupyter Lab\n",
|
||||
"### If you're new to working in \"Notebooks\" (also known as Labs or Jupyter Lab)\n",
|
||||
"\n",
|
||||
"Welcome to the wonderful world of Data Science experimentation! Once you've used Jupyter Lab, you'll wonder how you ever lived without it. Simply click in each \"cell\" with code in it, such as the cell immediately below this text, and hit Shift+Return to execute that cell. As you wish, you can add a cell with the + button in the toolbar, and print values of variables, or try out variations. \n",
|
||||
"Welcome to the wonderful world of Data Science experimentation! Simply click in each \"cell\" with code in it, such as the cell immediately below this text, and hit Shift+Return to execute that cell. Be sure to run every cell, starting at the top, in order.\n",
|
||||
"\n",
|
||||
"I've written a notebook called [Guide to Jupyter](Guide%20to%20Jupyter.ipynb) to help you get more familiar with Jupyter Lab, including adding Markdown comments, using `!` to run shell commands, and `tqdm` to show progress.\n",
|
||||
"\n",
|
||||
"## If you're new to the Command Line\n",
|
||||
"\n",
|
||||
"Please see these excellent guides: [Command line on PC](https://chatgpt.com/share/67b0acea-ba38-8012-9c34-7a2541052665) and [Command line on Mac](https://chatgpt.com/canvas/shared/67b0b10c93a081918210723867525d2b). \n",
|
||||
"\n",
|
||||
"## If you'd prefer to work in IDEs\n",
|
||||
"\n",
|
||||
"If you're more comfortable in IDEs like VSCode, Cursor or PyCharm, they all work great with these lab notebooks too. \n",
|
||||
"If you'd prefer to work in VSCode, [here](https://chatgpt.com/share/676f2e19-c228-8012-9911-6ca42f8ed766) are instructions from an AI friend on how to configure it for the course.\n",
|
||||
"\n",
|
||||
"## If you'd like to brush up your Python\n",
|
||||
"\n",
|
||||
"I've added a notebook called [Intermediate Python](Intermediate%20Python.ipynb) to get you up to speed. But you should give it a miss if you already have a good idea what this code does: \n",
|
||||
"`yield from {book.get(\"author\") for book in books if book.get(\"author\")}`\n",
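If that one-liner looks cryptic, here's a sketch of what it does, using a made-up `books` list (the names are just for illustration):

```python
# A hypothetical list of book records - some are missing an author
books = [
    {"title": "A", "author": "Jane"},
    {"title": "B"},
    {"title": "C", "author": "Jane"},
    {"title": "D", "author": "Raj"},
]

def unique_authors(books):
    # Build a set of authors (skipping records without one),
    # then yield each element of that set one at a time
    yield from {book.get("author") for book in books if book.get("author")}

print(sorted(unique_authors(books)))  # ['Jane', 'Raj']
```

If you can predict that output, you're in good shape; if not, the Intermediate Python notebook covers comprehensions and generators.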
|
||||
"Please look in the [Guides folder](../guides/01_intro.ipynb) for all the guides.\n",
|
||||
"\n",
|
||||
"## I am here to help\n",
|
||||
"\n",
|
||||
"If you have any problems at all, please do reach out. \n",
|
||||
"I'm available through the platform, or at ed@edwarddonner.com, or at https://www.linkedin.com/in/eddonner/ if you'd like to connect (and I love connecting!) \n",
|
||||
"And this is new to me, but I'm also trying out X/Twitter at [@edwarddonner](https://x.com/edwarddonner) - if you're on X, please show me how it's done 😂 \n",
|
||||
"And this is new to me, but I'm also trying out X at [@edwarddonner](https://x.com/edwarddonner) - if you're on X, please show me how it's done 😂 \n",
|
||||
"\n",
|
||||
"## More troubleshooting\n",
|
||||
"\n",
|
||||
"Please see the [troubleshooting](troubleshooting.ipynb) notebook in this folder to diagnose and fix common problems. At the very end of it is a diagnostics script with some useful debug info.\n",
|
||||
"\n",
|
||||
"## For foundational technical knowledge (eg Git, APIs, debugging) \n",
|
||||
"\n",
|
||||
"If you're relatively new to programming -- I've got your back! While it's ideal to have some programming experience for this course, there's only one mandatory prerequisite: plenty of patience. 😁 I've put together a set of self-study guides that cover Git and GitHub, APIs and endpoints, beginner Python and more.\n",
|
||||
"\n",
|
||||
"This covers Git and GitHub; what they are, the difference, and how to use them: \n",
|
||||
"https://github.com/ed-donner/agents/blob/main/guides/03_git_and_github.ipynb\n",
|
||||
"\n",
|
||||
"This covers technical foundations: \n",
|
||||
"ChatGPT vs API; taking screenshots; Environment Variables; Networking basics; APIs and endpoints: \n",
|
||||
"https://github.com/ed-donner/agents/blob/main/guides/04_technical_foundations.ipynb\n",
|
||||
"\n",
|
||||
"This covers Python for beginners, and making sure that a `NameError` never trips you up: \n",
|
||||
"https://github.com/ed-donner/agents/blob/main/guides/06_python_foundations.ipynb\n",
|
||||
"\n",
|
||||
"This covers the essential techniques for figuring out errors: \n",
|
||||
"https://github.com/ed-donner/agents/blob/main/guides/08_debugging.ipynb\n",
|
||||
"\n",
|
||||
"And you'll find other useful guides in the same folder in GitHub. Some information applies to my other Udemy course (eg Async Python) but most of it is very relevant for LLM engineering.\n",
|
||||
"Please see the [troubleshooting](../setup/troubleshooting.ipynb) notebook in the setup folder to diagnose and fix common problems. At the very end of it is a diagnostics script with some useful debug info.\n",
|
||||
"\n",
|
||||
"## If this is old hat!\n",
|
||||
"\n",
|
||||
@@ -74,7 +41,7 @@
|
||||
"<table style=\"margin: 0; text-align: left;\">\n",
|
||||
" <tr>\n",
|
||||
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
|
||||
" <img src=\"../important.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" <img src=\"../assets/important.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" </td>\n",
|
||||
" <td>\n",
|
||||
" <h2 style=\"color:#900;\">Please read - important note</h2>\n",
|
||||
@@ -85,7 +52,7 @@
|
||||
"<table style=\"margin: 0; text-align: left;\">\n",
|
||||
" <tr>\n",
|
||||
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
|
||||
" <img src=\"../resources.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" <img src=\"../assets/resources.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" </td>\n",
|
||||
" <td>\n",
|
||||
" <h2 style=\"color:#f71;\">This code is a live resource - keep an eye out for my emails</h2>\n",
|
||||
@@ -98,7 +65,7 @@
|
||||
"<table style=\"margin: 0; text-align: left;\">\n",
|
||||
" <tr>\n",
|
||||
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
|
||||
" <img src=\"../business.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" <img src=\"../assets/business.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" </td>\n",
|
||||
" <td>\n",
|
||||
" <h2 style=\"color:#181;\">Business value of these exercises</h2>\n",
|
||||
@@ -108,6 +75,33 @@
|
||||
"</table>"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "83f28feb",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### If necessary, install Cursor Extensions\n",
|
||||
"\n",
|
||||
"1. From the View menu, select Extensions\n",
|
||||
"2. Search for Python\n",
|
||||
"3. Click on \"Python\" made by \"ms-python\" and select Install if not already installed\n",
|
||||
"4. Search for Jupyter\n",
|
||||
"5. Click on \"Jupyter\" made by \"ms-toolsai\" and select Install if not already installed\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"### Next Select the Kernel\n",
|
||||
"\n",
|
||||
"Click on \"Select Kernel\" on the Top Right\n",
|
||||
"\n",
|
||||
"Choose \"Python Environments...\"\n",
|
||||
"\n",
|
||||
"Then choose the one that looks like `.venv (Python 3.12.x) .venv/bin/python` - it should be marked as \"Recommended\" and have a big star next to it.\n",
|
||||
"\n",
|
||||
"Any problems with this? Head over to the troubleshooting notebook.\n",
|
||||
"\n",
|
||||
"### Note: you'll need to set the Kernel with every notebook."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
@@ -118,9 +112,8 @@
|
||||
"# imports\n",
|
||||
"\n",
|
||||
"import os\n",
|
||||
"import requests\n",
|
||||
"from dotenv import load_dotenv\n",
|
||||
"from bs4 import BeautifulSoup\n",
|
||||
"from scraper import fetch_website_contents\n",
|
||||
"from IPython.display import Markdown, display\n",
|
||||
"from openai import OpenAI\n",
|
||||
"\n",
|
||||
@@ -140,9 +133,9 @@
|
||||
"\n",
|
||||
"## Troubleshooting if you have problems:\n",
|
||||
"\n",
|
||||
"Head over to the [troubleshooting](troubleshooting.ipynb) notebook in this folder for step by step code to identify the root cause and fix it!\n",
|
||||
"If you get a \"Name Error\" - have you run all cells from the top down? Head over to the Python Foundations guide for a bulletproof way to find and fix all Name Errors.\n",
|
||||
"\n",
|
||||
"If you make a change, try restarting the \"Kernel\" (the python process sitting behind this notebook) by Kernel menu >> Restart Kernel and Clear Outputs of All Cells. Then try this notebook again, starting at the top.\n",
|
||||
"If that doesn't fix it, head over to the [troubleshooting](../setup/troubleshooting.ipynb) notebook for step by step code to identify the root cause and fix it!\n",
|
||||
"\n",
|
||||
"Or, contact me! Message me or email ed@edwarddonner.com and we will get this to work.\n",
|
||||
"\n",
|
||||
@@ -173,19 +166,6 @@
|
||||
" print(\"API key found and looks good so far!\")\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "019974d9-f3ad-4a8a-b5f9-0a3719aea2d3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"openai = OpenAI()\n",
|
||||
"\n",
|
||||
"# If this doesn't work, try Kernel menu >> Restart Kernel and Clear Outputs Of All Cells, then run the cells from the top of this notebook down.\n",
|
||||
"# If it STILL doesn't work (horrors!) then please see the Troubleshooting notebook in this folder for full instructions"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "442fc84b-0815-4f40-99ab-d9a5da6bda91",
|
||||
@@ -204,8 +184,23 @@
|
||||
"# To give you a preview -- calling OpenAI with these messages is this easy. Any problems, head over to the Troubleshooting notebook.\n",
|
||||
"\n",
|
||||
"message = \"Hello, GPT! This is my first ever message to you! Hi!\"\n",
|
||||
"response = openai.chat.completions.create(model=\"gpt-4o-mini\", messages=[{\"role\":\"user\", \"content\":message}])\n",
|
||||
"print(response.choices[0].message.content)"
|
||||
"\n",
|
||||
"messages = [{\"role\": \"user\", \"content\": message}]\n",
|
||||
"\n",
|
||||
"messages\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "08330159",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"openai = OpenAI()\n",
|
||||
"\n",
|
||||
"response = openai.chat.completions.create(model=\"gpt-5-nano\", messages=messages)\n",
|
||||
"response.choices[0].message.content"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -216,36 +211,6 @@
|
||||
"## OK onwards with our first project"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "c5e793b2-6775-426a-a139-4848291d0463",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# A class to represent a Webpage\n",
|
||||
"# If you're not familiar with Classes, check out the \"Intermediate Python\" notebook\n",
|
||||
"\n",
|
||||
"# Some websites need you to use proper headers when fetching them:\n",
|
||||
"headers = {\n",
|
||||
" \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36\"\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"class Website:\n",
|
||||
"\n",
|
||||
" def __init__(self, url):\n",
|
||||
" \"\"\"\n",
|
||||
" Create this Website object from the given url using the BeautifulSoup library\n",
|
||||
" \"\"\"\n",
|
||||
" self.url = url\n",
|
||||
" response = requests.get(url, headers=headers)\n",
|
||||
" soup = BeautifulSoup(response.content, 'html.parser')\n",
|
||||
" self.title = soup.title.string if soup.title else \"No title found\"\n",
|
||||
" for irrelevant in soup.body([\"script\", \"style\", \"img\", \"input\"]):\n",
|
||||
" irrelevant.decompose()\n",
|
||||
" self.text = soup.body.get_text(separator=\"\\n\", strip=True)"
|
||||
]
|
||||
},
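As a rough sketch of what the class above achieves — and without assuming BeautifulSoup is installed — the standard library's `html.parser` can strip `<script>`/`<style>` content in the same spirit (a simplified stand-in for illustration, not the course's actual scraper):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> - a rough
    stand-in for what decompose() achieves in the class above."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that isn't inside a skipped tag
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())

sample_html = "<html><body><h1>Hi</h1><script>var x=1;</script><p>Welcome</p></body></html>"
parser = TextExtractor()
parser.feed(sample_html)
print("\n".join(parser.parts))  # Hi, then Welcome
```

BeautifulSoup does all of this (and much more) for you, which is why the class above is so short.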
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
@@ -253,11 +218,10 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Let's try one out. Change the website and add print statements to follow along.\n",
|
||||
"# Let's try out this utility\n",
|
||||
"\n",
|
||||
"ed = Website(\"https://edwarddonner.com\")\n",
|
||||
"print(ed.title)\n",
|
||||
"print(ed.text)"
|
||||
"ed = fetch_website_contents(\"https://edwarddonner.com\")\n",
|
||||
"print(ed)"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -269,7 +233,7 @@
|
||||
"\n",
|
||||
"You may know this already - but if not, you will get very familiar with it!\n",
|
||||
"\n",
|
||||
"Models like GPT4o have been trained to receive instructions in a particular way.\n",
|
||||
"Models like GPT have been trained to receive instructions in a particular way.\n",
|
||||
"\n",
|
||||
"They expect to receive:\n",
|
||||
"\n",
|
||||
@@ -287,9 +251,11 @@
|
||||
"source": [
|
||||
"# Define our system prompt - you can experiment with this later, changing the last sentence to 'Respond in markdown in Spanish.'\n",
|
||||
"\n",
|
||||
"system_prompt = \"You are an assistant that analyzes the contents of a website \\\n",
|
||||
"and provides a short summary, ignoring text that might be navigation related. \\\n",
|
||||
"Respond in markdown.\""
|
||||
"system_prompt = \"\"\"\n",
|
||||
"You are a snarky assistant that analyzes the contents of a website,\n",
|
||||
"and provides a short, snarky, humorous summary, ignoring text that might be navigation related.\n",
|
||||
"Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.\n",
|
||||
"\"\"\""
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -299,25 +265,14 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# A function that writes a User Prompt that asks for summaries of websites:\n",
|
||||
"# Define our user prompt\n",
|
||||
"\n",
|
||||
"def user_prompt_for(website):\n",
|
||||
" user_prompt = f\"You are looking at a website titled {website.title}\"\n",
|
||||
" user_prompt += \"\\nThe contents of this website is as follows; \\\n",
|
||||
"please provide a short summary of this website in markdown. \\\n",
|
||||
"If it includes news or announcements, then summarize these too.\\n\\n\"\n",
|
||||
" user_prompt += website.text\n",
|
||||
" return user_prompt"
|
||||
]
|
||||
},
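To make the prompt assembly concrete, here's a tiny sketch with a stand-in for the fetched website text (the `website_text` value is just a placeholder, not real scraped content):

```python
# The same prefix as the cell above, assembled with hypothetical website contents
user_prompt_prefix = (
    "Here are the contents of a website.\n"
    "Provide a short summary of this website.\n"
    "If it includes news or announcements, then summarize these too.\n\n"
)

website_text = "Home of Edward Donner - writing about LLMs"  # placeholder
user_prompt = user_prompt_prefix + website_text
print(user_prompt)
```

The full user prompt is simply the instructions followed by the page text; the LLM sees it all as one user message.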
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "26448ec4-5c00-4204-baec-7df91d11ff2e",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"print(user_prompt_for(ed))"
|
||||
"user_prompt_prefix = \"\"\"\n",
|
||||
"Here are the contents of a website.\n",
|
||||
"Provide a short summary of this website.\n",
|
||||
"If it includes news or announcements, then summarize these too.\n",
|
||||
"\n",
|
||||
"\"\"\""
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -347,22 +302,12 @@
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"messages = [\n",
|
||||
" {\"role\": \"system\", \"content\": \"You are a snarky assistant\"},\n",
|
||||
" {\"role\": \"system\", \"content\": \"You are a helpful assistant\"},\n",
|
||||
" {\"role\": \"user\", \"content\": \"What is 2 + 2?\"}\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "21ed95c5-7001-47de-a36d-1d6673b403ce",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# To give you a preview -- calling OpenAI with system and user messages:\n",
|
||||
"]\n",
|
||||
"\n",
|
||||
"response = openai.chat.completions.create(model=\"gpt-4o-mini\", messages=messages)\n",
|
||||
"print(response.choices[0].message.content)"
|
||||
"response = openai.chat.completions.create(model=\"gpt-4.1-nano\", messages=messages)\n",
|
||||
"response.choices[0].message.content"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -370,7 +315,7 @@
|
||||
"id": "d06e8d78-ce4c-4b05-aa8e-17050c82bb47",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## And now let's build useful messages for GPT-4o-mini, using a function"
|
||||
"## And now let's build useful messages for GPT-4.1-mini, using a function"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -385,7 +330,7 @@
|
||||
"def messages_for(website):\n",
|
||||
" return [\n",
|
||||
" {\"role\": \"system\", \"content\": system_prompt},\n",
|
||||
" {\"role\": \"user\", \"content\": user_prompt_for(website)}\n",
|
||||
" {\"role\": \"user\", \"content\": user_prompt_prefix + website}\n",
|
||||
" ]"
|
||||
]
|
||||
},
|
||||
@@ -419,9 +364,9 @@
|
||||
"# And now: call the OpenAI API. You will get very familiar with this!\n",
|
||||
"\n",
|
||||
"def summarize(url):\n",
|
||||
" website = Website(url)\n",
|
||||
" website = fetch_website_contents(url)\n",
|
||||
" response = openai.chat.completions.create(\n",
|
||||
" model = \"gpt-4o-mini\",\n",
|
||||
" model = \"gpt-4.1-mini\",\n",
|
||||
" messages = messages_for(website)\n",
|
||||
" )\n",
|
||||
" return response.choices[0].message.content"
|
||||
@@ -444,7 +389,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# A function to display this nicely in the Jupyter output, using markdown\n",
|
||||
"# A function to display this nicely in the output, using markdown\n",
|
||||
"\n",
|
||||
"def display_summary(url):\n",
|
||||
" summary = summarize(url)\n",
|
||||
@@ -505,7 +450,7 @@
|
||||
"<table style=\"margin: 0; text-align: left;\">\n",
|
||||
" <tr>\n",
|
||||
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
|
||||
" <img src=\"../business.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" <img src=\"../assets/business.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" </td>\n",
|
||||
" <td>\n",
|
||||
" <h2 style=\"color:#181;\">Business applications</h2>\n",
|
||||
@@ -519,7 +464,7 @@
|
||||
"<table style=\"margin: 0; text-align: left;\">\n",
|
||||
" <tr>\n",
|
||||
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
|
||||
" <img src=\"../important.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" <img src=\"../assets/important.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" </td>\n",
|
||||
" <td>\n",
|
||||
" <h2 style=\"color:#900;\">Before you continue - now try yourself</h2>\n",
|
||||
@@ -549,12 +494,10 @@
|
||||
"messages = [] # fill this in\n",
|
||||
"\n",
|
||||
"# Step 3: Call OpenAI\n",
|
||||
"\n",
|
||||
"response =\n",
|
||||
"# response =\n",
|
||||
"\n",
|
||||
"# Step 4: print the result\n",
|
||||
"\n",
|
||||
"print("
|
||||
"# print("
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -593,7 +536,7 @@
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"display_name": ".venv",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
@@ -607,7 +550,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.12"
|
||||
"version": "3.12.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
@@ -1,316 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "d15d8294-3328-4e07-ad16-8a03e9bbfdb9",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Welcome to your first assignment!\n",
|
||||
"\n",
|
||||
"Instructions are below. Please give this a try, and look in the solutions folder if you get stuck (or feel free to ask me!)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ada885d9-4d42-4d9b-97f0-74fbbbfe93a9",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<table style=\"margin: 0; text-align: left;\">\n",
|
||||
" <tr>\n",
|
||||
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
|
||||
" <img src=\"../resources.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" </td>\n",
|
||||
" <td>\n",
|
||||
" <h2 style=\"color:#f71;\">Just before we get to the assignment --</h2>\n",
|
||||
" <span style=\"color:#f71;\">I thought I'd take a second to point you at this page of useful resources for the course. This includes links to all the slides.<br/>\n",
|
||||
" <a href=\"https://edwarddonner.com/2024/11/13/llm-engineering-resources/\">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>\n",
|
||||
" Please keep this bookmarked, and I'll continue to add more useful links there over time.\n",
|
||||
" </span>\n",
|
||||
" </td>\n",
|
||||
" </tr>\n",
|
||||
"</table>"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "6e9fa1fc-eac5-4d1d-9be4-541b3f2b3458",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# HOMEWORK EXERCISE ASSIGNMENT\n",
|
||||
"\n",
|
||||
"Upgrade the day 1 project to summarize a webpage to use an Open Source model running locally via Ollama rather than OpenAI\n",
|
||||
"\n",
|
||||
"You'll be able to use this technique for all subsequent projects if you'd prefer not to use paid APIs.\n",
|
||||
"\n",
|
||||
"**Benefits:**\n",
|
||||
"1. No API charges - open-source\n",
|
||||
"2. Data doesn't leave your box\n",
|
||||
"\n",
|
||||
"**Disadvantages:**\n",
|
||||
"1. Significantly less power than a Frontier Model\n",
|
||||
"\n",
|
||||
"## Recap on installation of Ollama\n",
|
||||
"\n",
|
||||
"Simply visit [ollama.com](https://ollama.com) and install!\n",
|
||||
"\n",
|
||||
"Once complete, the ollama server should already be running locally. \n",
|
||||
"If you visit: \n",
|
||||
"[http://localhost:11434/](http://localhost:11434/)\n",
|
||||
"\n",
|
||||
"You should see the message `Ollama is running`. \n",
|
||||
"\n",
|
||||
"If not, bring up a new Terminal (Mac) or Powershell (Windows) and enter `ollama serve` \n",
|
||||
"And in another Terminal (Mac) or Powershell (Windows), enter `ollama pull llama3.2` \n",
|
||||
"Then try [http://localhost:11434/](http://localhost:11434/) again.\n",
|
||||
"\n",
|
||||
"If Ollama is slow on your machine, try using `llama3.2:1b` as an alternative. Run `ollama pull llama3.2:1b` from a Terminal or Powershell, and change the code below from `MODEL = \"llama3.2\"` to `MODEL = \"llama3.2:1b\"`"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "4e2a9393-7767-488e-a8bf-27c12dca35bd",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# imports\n",
|
||||
"\n",
|
||||
"import requests\n",
|
||||
"from bs4 import BeautifulSoup\n",
|
||||
"from IPython.display import Markdown, display"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "29ddd15d-a3c5-4f4e-a678-873f56162724",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Constants\n",
|
||||
"\n",
|
||||
"OLLAMA_API = \"http://localhost:11434/api/chat\"\n",
|
||||
"HEADERS = {\"Content-Type\": \"application/json\"}\n",
|
||||
"MODEL = \"llama3.2\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "dac0a679-599c-441f-9bf2-ddc73d35b940",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Create a messages list using the same format that we used for OpenAI\n",
|
||||
"\n",
|
||||
"messages = [\n",
|
||||
" {\"role\": \"user\", \"content\": \"Describe some of the business applications of Generative AI\"}\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "7bb9c624-14f0-4945-a719-8ddb64f66f47",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"payload = {\n",
|
||||
" \"model\": MODEL,\n",
|
||||
" \"messages\": messages,\n",
|
||||
" \"stream\": False\n",
|
||||
" }"
|
||||
]
|
||||
},
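For reference, the `requests.post(...)` call further down sends this payload as a JSON body. Here's a quick sketch (no server needed) of exactly what goes over the wire:

```python
import json

# Same constants and payload shape as the cells above
OLLAMA_API = "http://localhost:11434/api/chat"
MODEL = "llama3.2"

messages = [
    {"role": "user", "content": "Describe some of the business applications of Generative AI"}
]

payload = {"model": MODEL, "messages": messages, "stream": False}

# requests.post(OLLAMA_API, json=payload) serializes the dict like this:
body = json.dumps(payload)
print(body)
```

With `"stream": False`, Ollama returns one complete JSON response rather than a stream of chunks — simpler for a first experiment.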
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "479ff514-e8bd-4985-a572-2ea28bb4fa40",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Let's just make sure the model is loaded\n",
|
||||
"\n",
|
||||
"!ollama pull llama3.2"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "42b9f644-522d-4e05-a691-56e7658c0ea9",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# If this doesn't work for any reason, try the 2 versions in the following cells\n",
|
||||
"# And double check the instructions in the 'Recap on installation of Ollama' at the top of this lab\n",
|
||||
"# And if none of that works - contact me!\n",
|
||||
"\n",
|
||||
"response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)\n",
|
||||
"print(response.json()['message']['content'])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "6a021f13-d6a1-4b96-8e18-4eae49d876fe",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Introducing the ollama package\n",
|
||||
"\n",
|
||||
"And now we'll do the same thing, but using the elegant ollama python package instead of a direct HTTP call.\n",
|
||||
"\n",
|
||||
"Under the hood, it's making the same call as above to the ollama server running at localhost:11434"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "7745b9c4-57dc-4867-9180-61fa5db55eb8",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import ollama\n",
|
||||
"\n",
|
||||
"response = ollama.chat(model=MODEL, messages=messages)\n",
|
||||
"print(response['message']['content'])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a4704e10-f5fb-4c15-a935-f046c06fb13d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Alternative approach - using OpenAI python library to connect to Ollama"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "23057e00-b6fc-4678-93a9-6b31cb704bff",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# There's actually an alternative approach that some people might prefer\n",
|
||||
"# You can use the OpenAI client python library to call Ollama:\n",
|
||||
"\n",
|
||||
"from openai import OpenAI\n",
|
||||
"ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')\n",
|
||||
"\n",
|
||||
"response = ollama_via_openai.chat.completions.create(\n",
|
||||
" model=MODEL,\n",
|
||||
" messages=messages\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"print(response.choices[0].message.content)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9f9e22da-b891-41f6-9ac9-bd0c0a5f4f44",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Are you confused about why that works?\n",
|
||||
"\n",
|
||||
"It seems strange, right? We just used OpenAI code to call Ollama?? What's going on?!\n",
|
||||
"\n",
|
||||
"Here's the scoop:\n",
|
||||
"\n",
|
||||
"The python class `OpenAI` is simply code written by OpenAI engineers that makes calls over the internet to an endpoint. \n",
|
||||
"\n",
|
||||
"When you call `openai.chat.completions.create()`, this python code just makes a web request to the following url: \"https://api.openai.com/v1/chat/completions\"\n",
|
||||
"\n",
|
||||
"Code like this is known as a \"client library\" - it's just wrapper code that runs on your machine to make web requests. The actual power of GPT is running on OpenAI's cloud behind this API, not on your computer!\n",
|
||||
"\n",
|
||||
"OpenAI was so popular that lots of other AI providers offered identical web endpoints, so you could use the same approach.\n",
|
||||
"\n",
|
||||
"So Ollama has an endpoint running on your local box at http://localhost:11434/v1/chat/completions \n",
|
||||
"And in week 2 we'll discover that lots of other providers do this too, including Gemini and DeepSeek.\n",
|
||||
"\n",
|
||||
"And then the team at OpenAI had a great idea: they can extend their client library so you can specify a different 'base url', and use their library to call any compatible API.\n",
|
||||
"\n",
|
||||
"That's it!\n",
|
||||
"\n",
|
||||
"So when you say: `ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')` \n",
|
||||
"Then this will make the same endpoint calls, but to Ollama instead of OpenAI."
|
||||
]
|
||||
},
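Here's a tiny sketch of that idea — the client library essentially joins the base URL with the same path either way (the joining logic here is illustrative, not the library's actual internals):

```python
# The same request path, under two different base URLs
CHAT_PATH = "/chat/completions"

def endpoint(base_url):
    # Illustrative: join a base URL with the chat completions path
    return base_url.rstrip("/") + CHAT_PATH

openai_url = endpoint("https://api.openai.com/v1")
ollama_url = endpoint("http://localhost:11434/v1")

print(openai_url)  # https://api.openai.com/v1/chat/completions
print(ollama_url)  # http://localhost:11434/v1/chat/completions
```

Swap the base URL and the identical request goes to a different server — that's all `base_url='http://localhost:11434/v1'` is doing.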
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "bc7d1de3-e2ac-46ff-a302-3b4ba38c4c90",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Also trying the amazing reasoning model DeepSeek\n",
|
||||
"\n",
|
||||
"Here we use the version of DeepSeek-reasoner that's been distilled to 1.5B. \n",
|
||||
"This is actually a 1.5B variant of Qwen that has been fine-tuned using synthetic data generated by DeepSeek R1.\n",
|
||||
"\n",
|
||||
"Other sizes of DeepSeek are [here](https://ollama.com/library/deepseek-r1) all the way up to the full 671B parameter version, which would use up 404GB of your drive and is far too large for most!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "cf9eb44e-fe5b-47aa-b719-0bb63669ab3d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!ollama pull deepseek-r1:1.5b"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "1d3d554b-e00d-4c08-9300-45e073950a76",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# This may take a few minutes to run! You should then see a fascinating \"thinking\" trace inside <think> tags, followed by some decent definitions\n",
|
||||
"\n",
|
||||
"response = ollama_via_openai.chat.completions.create(\n",
|
||||
" model=\"deepseek-r1:1.5b\",\n",
|
||||
" messages=[{\"role\": \"user\", \"content\": \"Please give definitions of some core concepts behind LLMs: a neural network, attention and the transformer\"}]\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"print(response.choices[0].message.content)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1622d9bb-5c68-4d4e-9ca4-b492c751f898",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# NOW the exercise for you\n",
|
||||
"\n",
|
||||
"Take the code from day1 and incorporate it here, to build a website summarizer that uses Llama 3.2 running locally instead of OpenAI; use either of the above approaches."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "6de38216-6d1c-48c4-877b-86d403f4e0f8",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.12"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
389
week1/day2.ipynb
Normal file
@@ -0,0 +1,389 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "d15d8294-3328-4e07-ad16-8a03e9bbfdb9",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Welcome to the Day 2 Lab!\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ada885d9-4d42-4d9b-97f0-74fbbbfe93a9",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<table style=\"margin: 0; text-align: left;\">\n",
|
||||
" <tr>\n",
|
||||
" <td style=\"width: 150px; height: 150px; vertical-align: middle;\">\n",
|
||||
" <img src=\"../assets/resources.jpg\" width=\"150\" height=\"150\" style=\"display: block;\" />\n",
|
||||
" </td>\n",
|
||||
" <td>\n",
|
||||
" <h2 style=\"color:#f71;\">Just before we get started --</h2>\n",
|
||||
" <span style=\"color:#f71;\">I thought I'd take a second to point you at this page of useful resources for the course. This includes links to all the slides.<br/>\n",
|
||||
" <a href=\"https://edwarddonner.com/2024/11/13/llm-engineering-resources/\">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>\n",
|
||||
" Please keep this bookmarked, and I'll continue to add more useful links there over time.\n",
|
||||
" </span>\n",
|
||||
" </td>\n",
|
||||
" </tr>\n",
|
||||
"</table>"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "79ffe36f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## First - let's talk about the Chat Completions API\n",
|
||||
"\n",
|
||||
"1. The simplest way to call an LLM\n",
|
||||
"2. It's called Chat Completions because it's saying: \"here is a conversation, please predict what should come next\"\n",
|
||||
"3. The Chat Completions API was invented by OpenAI, but it's so popular that everybody uses it!\n",
|
||||
"\n",
|
||||
"### We will start by calling OpenAI again - but don't worry non-OpenAI people, your time is coming!\n"
|
||||
]
|
||||
},
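Point 2 above can be made concrete without any API call: the input is just a list of role/content dicts, ending where you want the model to continue the conversation (a minimal sketch):

```python
# "Here is a conversation, please predict what should come next" -
# the model's reply would be appended as the next assistant message
conversation = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "4"},
    {"role": "user", "content": "And doubled?"},
]

roles = [m["role"] for m in conversation]
print(roles)
```

The optional system message sets behavior; user and assistant messages then alternate, and the API completes the final turn.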
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "e38f17a0",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"from dotenv import load_dotenv\n",
|
||||
"\n",
|
||||
"load_dotenv(override=True)\n",
|
||||
"api_key = os.getenv('OPENAI_API_KEY')\n",
|
||||
"\n",
|
||||
"if not api_key:\n",
|
||||
" print(\"No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!\")\n",
|
||||
"elif not api_key.startswith(\"sk-proj-\"):\n",
|
||||
"    print(\"An API key was found, but it doesn't start with sk-proj-; please check you're using the right key - see troubleshooting notebook\")\n",
|
||||
"else:\n",
|
||||
" print(\"API key found and looks good so far!\")\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "97846274",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Do you know what an Endpoint is?\n",
|
||||
"\n",
|
||||
"If not, please review the Technical Foundations guide in the guides folder\n",
|
||||
"\n",
|
||||
"And, here is an endpoint that might interest you..."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "5af5c188",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import requests\n",
|
||||
"\n",
|
||||
"headers = {\"Authorization\": f\"Bearer {api_key}\", \"Content-Type\": \"application/json\"}\n",
|
||||
"\n",
|
||||
"payload = {\n",
|
||||
" \"model\": \"gpt-5-nano\",\n",
|
||||
" \"messages\": [\n",
|
||||
" {\"role\": \"user\", \"content\": \"Tell me a fun fact\"}]\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"payload"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "2d0ab242",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"response = requests.post(\n",
|
||||
" \"https://api.openai.com/v1/chat/completions\",\n",
|
||||
" headers=headers,\n",
|
||||
" json=payload\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"response.json()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "cb11a9f6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"response.json()[\"choices\"][0][\"message\"][\"content\"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "cea3026a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# What is the openai package?\n",
|
||||
"\n",
|
||||
"It's known as a Python Client Library.\n",
|
||||
"\n",
|
||||
"It's nothing more than a wrapper around making this exact call to the HTTP endpoint.\n",
|
||||
"\n",
|
||||
"It just allows you to work with clean Python code instead of wrangling raw JSON objects.\n",
|
||||
"\n",
|
||||
"But that's it. It's open-source and lightweight. Some people think it contains OpenAI model code - it doesn't!\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "490fdf09",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Create OpenAI client\n",
|
||||
"\n",
|
||||
"from openai import OpenAI\n",
|
||||
"openai = OpenAI()\n",
|
||||
"\n",
|
||||
"response = openai.chat.completions.create(model=\"gpt-5-nano\", messages=[{\"role\": \"user\", \"content\": \"Tell me a fun fact\"}])\n",
|
||||
"\n",
|
||||
"response.choices[0].message.content\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c7739cda",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## And then this great thing happened:\n",
|
||||
"\n",
|
||||
"OpenAI's Chat Completions API was so popular that the other model providers created identical endpoints.\n",
|
||||
"\n",
|
||||
"They are known as the \"OpenAI Compatible Endpoints\".\n",
|
||||
"\n",
|
||||
"For example, Google made one here: https://generativelanguage.googleapis.com/v1beta/openai/\n",
|
||||
"\n",
|
||||
"And OpenAI decided to be kind: they said, hey, you can just use the same client library that we made for GPT. We'll allow you to specify a different endpoint URL and a different key, to use another provider.\n",
|
||||
"\n",
|
||||
"So you can use:\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"gemini = OpenAI(base_url=\"https://generativelanguage.googleapis.com/v1beta/openai/\", api_key=\"AIz....\")\n",
|
||||
"gemini.chat.completions.create(...)\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"And to be clear - even though OpenAI is in the code, we're only using this lightweight python client library to call the endpoint - there's no OpenAI model involved here.\n",
|
||||
"\n",
|
||||
"If you're confused, please review Guide 9 in the Guides folder!\n",
|
||||
"\n",
|
||||
"And now let's try it!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "f74293bc",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"GEMINI_BASE_URL = \"https://generativelanguage.googleapis.com/v1beta/openai/\"\n",
|
||||
"\n",
|
||||
"google_api_key = os.getenv(\"GOOGLE_API_KEY\")\n",
|
||||
"\n",
|
||||
"if not google_api_key:\n",
|
||||
" print(\"No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!\")\n",
|
||||
"elif not google_api_key.startswith(\"AIz\"):\n",
|
||||
"    print(\"An API key was found, but it doesn't start with AIz\")\n",
|
||||
"else:\n",
|
||||
" print(\"API key found and looks good so far!\")\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "d060f484",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"gemini = OpenAI(base_url=GEMINI_BASE_URL, api_key=google_api_key)\n",
|
||||
"\n",
|
||||
"response = gemini.chat.completions.create(model=\"gemini-2.5-pro\", messages=[{\"role\": \"user\", \"content\": \"Tell me a fun fact\"}])\n",
|
||||
"\n",
|
||||
"response.choices[0].message.content"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "a5b069be",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "65272432",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## And Ollama also gives an OpenAI compatible endpoint\n",
|
||||
"\n",
|
||||
"...and it's on your local machine!\n",
|
||||
"\n",
|
||||
"If the next cell doesn't print \"Ollama is running\" then please open a terminal and run `ollama serve`"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "f06280ad",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"requests.get(\"http://localhost:11434\").content"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c6ef3807",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Download llama3.2 from Meta\n",
|
||||
"\n",
|
||||
"Change this to llama3.2:1b if your computer has less memory.\n",
|
||||
"\n",
|
||||
"Don't use llama3.3 or llama4! They are too big for your computer."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "e633481d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!ollama pull llama3.2"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "d9419762",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"OLLAMA_BASE_URL = \"http://localhost:11434/v1\"\n",
|
||||
"\n",
|
||||
"ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "e2456cdf",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Get a fun fact\n",
|
||||
"\n",
|
||||
"response = ollama.chat.completions.create(model=\"llama3.2\", messages=[{\"role\": \"user\", \"content\": \"Tell me a fun fact\"}])\n",
|
||||
"\n",
|
||||
"response.choices[0].message.content"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "1e6cae7f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Now let's try deepseek-r1:1.5b - DeepSeek R1's reasoning \"distilled\" into a small Qwen model from Alibaba Cloud\n",
|
||||
"\n",
|
||||
"!ollama pull deepseek-r1:1.5b"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "25002f25",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"response = ollama.chat.completions.create(model=\"deepseek-r1:1.5b\", messages=[{\"role\": \"user\", \"content\": \"Tell me a fun fact\"}])\n",
|
||||
"\n",
|
||||
"response.choices[0].message.content"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "6e9fa1fc-eac5-4d1d-9be4-541b3f2b3458",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# HOMEWORK EXERCISE ASSIGNMENT\n",
|
||||
"\n",
|
||||
"Upgrade the Day 1 webpage-summarizer project to use an open-source model running locally via Ollama, rather than OpenAI.\n",
|
||||
"\n",
|
||||
"You'll be able to use this technique for all subsequent projects if you'd prefer not to use paid APIs.\n",
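"\n",
"The change is small: point the OpenAI client at Ollama's local endpoint and pick a local model. A minimal sketch, assuming the Day 1 `messages_for(website)` helper from your earlier notebook:\n",
"\n",
"```python\n",
"ollama = OpenAI(base_url=\"http://localhost:11434/v1\", api_key=\"ollama\")\n",
"\n",
"response = ollama.chat.completions.create(model=\"llama3.2\", messages=messages_for(website))\n",
"summary = response.choices[0].message.content\n",
"```\n",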
|
||||
"\n",
|
||||
"**Benefits:**\n",
|
||||
"1. No API charges - open-source\n",
|
||||
"2. Data doesn't leave your box\n",
|
||||
"\n",
|
||||
"**Disadvantages:**\n",
|
||||
"1. Significantly less powerful than frontier models\n",
|
||||
"\n",
|
||||
"## Recap on installation of Ollama\n",
|
||||
"\n",
|
||||
"Simply visit [ollama.com](https://ollama.com) and install!\n",
|
||||
"\n",
|
||||
"Once complete, the ollama server should already be running locally. \n",
|
||||
"If you visit: \n",
|
||||
"[http://localhost:11434/](http://localhost:11434/)\n",
|
||||
"\n",
|
||||
"You should see the message `Ollama is running`. \n",
|
||||
"\n",
|
||||
"If not, bring up a new Terminal (Mac) or Powershell (Windows) and enter `ollama serve` \n",
|
||||
"And in another Terminal (Mac) or Powershell (Windows), enter `ollama pull llama3.2` \n",
|
||||
"Then try [http://localhost:11434/](http://localhost:11434/) again.\n",
|
||||
"\n",
|
||||
"If Ollama is slow on your machine, try using `llama3.2:1b` as an alternative. Run `ollama pull llama3.2:1b` from a Terminal or Powershell, and change the code from `MODEL = \"llama3.2\"` to `MODEL = \"llama3.2:1b\"`"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "6de38216-6d1c-48c4-877b-86d403f4e0f8",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": ".venv",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.12.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
303
week1/day4.ipynb
Normal file
@@ -0,0 +1,303 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "d9e61417",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Day 4\n",
|
||||
"\n",
|
||||
"## Tokenizing with code"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "7dc1c1d9",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import tiktoken\n",
|
||||
"\n",
|
||||
"encoding = tiktoken.encoding_for_model(\"gpt-4.1-mini\")\n",
|
||||
"\n",
|
||||
"tokens = encoding.encode(\"Hi my name is Ed and I like banoffee pie\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "7632966c",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[12194, 922, 1308, 382, 6117, 326, 357, 1299, 9171, 26458, 5148]"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"tokens"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "cce0c188",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"12194 = Hi\n",
|
||||
"922 = my\n",
|
||||
"1308 = name\n",
|
||||
"382 = is\n",
|
||||
"6117 = Ed\n",
|
||||
"326 = and\n",
|
||||
"357 = I\n",
|
||||
"1299 = like\n",
|
||||
"9171 = ban\n",
|
||||
"26458 = offee\n",
|
||||
"5148 = pie\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"for token_id in tokens:\n",
|
||||
" token_text = encoding.decode([token_id])\n",
|
||||
" print(f\"{token_id} = {token_text}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "98e3bbd2",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"' and'"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"encoding.decode([326])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "538efe61",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# And another topic!\n",
|
||||
"\n",
|
||||
"### The Illusion of \"memory\"\n",
|
||||
"\n",
|
||||
"Many of you will know this already. But for those that don't -- this might be an \"AHA\" moment!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "83a4b3eb",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"from dotenv import load_dotenv\n",
|
||||
"\n",
|
||||
"load_dotenv(override=True)\n",
|
||||
"api_key = os.getenv('OPENAI_API_KEY')\n",
|
||||
"\n",
|
||||
"if not api_key:\n",
|
||||
" print(\"No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!\")\n",
|
||||
"elif not api_key.startswith(\"sk-proj-\"):\n",
|
||||
" print(\"An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook\")\n",
|
||||
"else:\n",
|
||||
" print(\"API key found and looks good so far!\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b618859b",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### You should be very comfortable with what the next cell is doing!\n",
|
||||
"\n",
|
||||
"_I'm creating a new instance of the OpenAI Python client library - a lightweight wrapper around HTTP calls to a Chat Completions endpoint, whether served by OpenAI for GPT or by another LLM provider_"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "b959be3b",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from openai import OpenAI\n",
|
||||
"\n",
|
||||
"openai = OpenAI()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "aa889e80",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### A message to OpenAI is a list of dicts"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "97298fea",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"messages = [\n",
|
||||
" {\"role\": \"system\", \"content\": \"You are a helpful assistant\"},\n",
|
||||
" {\"role\": \"user\", \"content\": \"Hi! I'm Ed!\"}\n",
|
||||
" ]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "3475a36d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"response = openai.chat.completions.create(model=\"gpt-4.1-mini\", messages=messages)\n",
|
||||
"response.choices[0].message.content"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a5f45ed8",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### OK let's now ask a follow-up question"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "6bce2208",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"messages = [\n",
|
||||
" {\"role\": \"system\", \"content\": \"You are a helpful assistant\"},\n",
|
||||
" {\"role\": \"user\", \"content\": \"What's my name?\"}\n",
|
||||
" ]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "404462f5",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"response = openai.chat.completions.create(model=\"gpt-4.1-mini\", messages=messages)\n",
|
||||
"response.choices[0].message.content"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "098237ef",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Wait, wha??\n",
|
||||
"\n",
|
||||
"We just told you!\n",
|
||||
"\n",
|
||||
"What's going on??\n",
|
||||
"\n",
|
||||
"Here's the thing: every call to an LLM is completely STATELESS. It's a totally new call, every single time. As AI engineers, it's OUR JOB to devise techniques to give the impression that the LLM has a \"memory\"."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "b6d43f92",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"messages = [\n",
|
||||
" {\"role\": \"system\", \"content\": \"You are a helpful assistant\"},\n",
|
||||
" {\"role\": \"user\", \"content\": \"Hi! I'm Ed!\"},\n",
|
||||
" {\"role\": \"assistant\", \"content\": \"Hi Ed! How can I assist you today?\"},\n",
|
||||
" {\"role\": \"user\", \"content\": \"What's my name?\"}\n",
|
||||
" ]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "e7ac742c",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"response = openai.chat.completions.create(model=\"gpt-4.1-mini\", messages=messages)\n",
|
||||
"response.choices[0].message.content"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "96c49557",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## To recap\n",
|
||||
"\n",
|
||||
"With apologies if this is obvious to you - but it's still good to reinforce:\n",
|
||||
"\n",
|
||||
"1. Every call to an LLM is stateless\n",
|
||||
"2. We pass in the entire conversation so far in the input prompt, every time\n",
|
||||
"3. This gives the illusion that the LLM has memory - it apparently keeps the context of the conversation\n",
|
||||
"4. But this is a trick; it's a by-product of providing the entire conversation, every time\n",
|
||||
"5. An LLM just predicts the most likely next tokens in the sequence; if that sequence contains \"My name is Ed\" and later \"What's my name?\" then it will predict... Ed!\n",
|
||||
"\n",
|
||||
"The ChatGPT product uses exactly this trick - every time you send a message, it's the entire conversation that gets passed in.\n",
|
||||
"\n",
|
||||
"\"Does that mean we have to pay extra each time for all the conversation so far?\"\n",
|
||||
"\n",
|
||||
"For sure it does. And that's what we WANT. We want the LLM to predict the next tokens in the sequence, looking back on the entire conversation. We want that compute to happen, so we need to pay the electricity bill for it!\n",
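"\n",
"The whole history-passing pattern can be sketched as a small loop - here with a stand-in `reply` function in place of a real LLM call, just to show the bookkeeping:\n",
"\n",
"```python\n",
"history = [{\"role\": \"system\", \"content\": \"You are a helpful assistant\"}]\n",
"\n",
"def chat(user_message, reply):\n",
"    history.append({\"role\": \"user\", \"content\": user_message})\n",
"    assistant_message = reply(history)  # e.g. openai.chat.completions.create(...).choices[0].message.content\n",
"    history.append({\"role\": \"assistant\", \"content\": assistant_message})\n",
"    return assistant_message\n",
"```\n",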
|
||||
"\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": ".venv",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.12.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
1658
week1/day5.ipynb
File diff suppressed because one or more lines are too long
@@ -1,419 +0,0 @@
|
||||
import os
|
||||
import sys
|
||||
import platform
|
||||
import subprocess
|
||||
import shutil
|
||||
import time
|
||||
import ssl
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
from datetime import datetime
|
||||
|
||||
class Diagnostics:
|
||||
|
||||
FILENAME = 'report.txt'
|
||||
|
||||
def __init__(self):
|
||||
self.errors = []
|
||||
self.warnings = []
|
||||
if os.path.exists(self.FILENAME):
|
||||
os.remove(self.FILENAME)
|
||||
|
||||
def log(self, message):
|
||||
print(message)
|
||||
with open(self.FILENAME, 'a', encoding='utf-8') as f:
|
||||
f.write(message + "\n")
|
||||
|
||||
def start(self):
|
||||
now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
|
||||
self.log(f"Starting diagnostics at {now}\n")
|
||||
|
||||
def end(self):
|
||||
now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
|
||||
self.log(f"\n\nCompleted diagnostics at {now}\n")
|
||||
print("\nPlease send these diagnostics to me at ed@edwarddonner.com")
|
||||
print(f"Either copy & paste the above output into an email, or attach the file {self.FILENAME} that has been created in this directory.")
|
||||
|
||||
|
||||
def _log_error(self, message):
|
||||
self.log(f"ERROR: {message}")
|
||||
self.errors.append(message)
|
||||
|
||||
def _log_warning(self, message):
|
||||
self.log(f"WARNING: {message}")
|
||||
self.warnings.append(message)
|
||||
|
||||
def run(self):
|
||||
self.start()
|
||||
self._step1_system_info()
|
||||
self._step2_check_files()
|
||||
self._step3_git_repo()
|
||||
self._step4_check_env_file()
|
||||
self._step5_anaconda_check()
|
||||
self._step6_virtualenv_check()
|
||||
self._step7_network_connectivity()
|
||||
self._step8_environment_variables()
|
||||
self._step9_additional_diagnostics()
|
||||
|
||||
if self.warnings:
|
||||
self.log("\n===== Warnings Found =====")
|
||||
self.log("The following warnings were detected. They might not prevent the program from running but could cause unexpected behavior:")
|
||||
for warning in self.warnings:
|
||||
self.log(f"- {warning}")
|
||||
|
||||
if self.errors:
|
||||
self.log("\n===== Errors Found =====")
|
||||
self.log("The following critical issues were detected. Please address them before proceeding:")
|
||||
for error in self.errors:
|
||||
self.log(f"- {error}")
|
||||
|
||||
if not self.errors and not self.warnings:
|
||||
self.log("\n✅ All diagnostics passed successfully!")
|
||||
|
||||
self.end()
|
||||
|
||||
def _step1_system_info(self):
|
||||
self.log("===== System Information =====")
|
||||
try:
|
||||
system = platform.system()
|
||||
self.log(f"Operating System: {system}")
|
||||
|
||||
if system == "Windows":
|
||||
release, version, csd, ptype = platform.win32_ver()
|
||||
self.log(f"Windows Release: {release}")
|
||||
self.log(f"Windows Version: {version}")
|
||||
elif system == "Darwin":
|
||||
release, version, machine = platform.mac_ver()
|
||||
self.log(f"MacOS Version: {release}")
|
||||
else:
|
||||
self.log(f"Platform: {platform.platform()}")
|
||||
|
||||
self.log(f"Architecture: {platform.architecture()}")
|
||||
self.log(f"Machine: {platform.machine()}")
|
||||
self.log(f"Processor: {platform.processor()}")
|
||||
|
||||
try:
|
||||
import psutil
|
||||
ram = psutil.virtual_memory()
|
||||
total_ram_gb = ram.total / (1024 ** 3)
|
||||
available_ram_gb = ram.available / (1024 ** 3)
|
||||
self.log(f"Total RAM: {total_ram_gb:.2f} GB")
|
||||
self.log(f"Available RAM: {available_ram_gb:.2f} GB")
|
||||
|
||||
if available_ram_gb < 2:
|
||||
self._log_warning(f"Low available RAM: {available_ram_gb:.2f} GB")
|
||||
except ImportError:
|
||||
self._log_warning("psutil module not found. Cannot determine RAM information.")
|
||||
|
||||
total, used, free = shutil.disk_usage(os.path.expanduser("~"))
|
||||
free_gb = free / (1024 ** 3)
|
||||
self.log(f"Free Disk Space: {free_gb:.2f} GB")
|
||||
|
||||
if free_gb < 5:
|
||||
self._log_warning(f"Low disk space: {free_gb:.2f} GB free")
|
||||
|
||||
except Exception as e:
|
||||
self._log_error(f"System information check failed: {e}")
|
||||
|
||||
def _step2_check_files(self):
|
||||
self.log("\n===== File System Information =====")
|
||||
try:
|
||||
current_dir = os.getcwd()
|
||||
self.log(f"Current Directory: {current_dir}")
|
||||
|
||||
# Check write permissions
|
||||
test_file = Path(current_dir) / ".test_write_permission"
|
||||
try:
|
||||
test_file.touch(exist_ok=True)
|
||||
test_file.unlink()
|
||||
self.log("Write permission: OK")
|
||||
except Exception as e:
|
||||
self._log_error(f"No write permission in current directory: {e}")
|
||||
|
||||
self.log("\nFiles in Current Directory:")
|
||||
try:
|
||||
for item in sorted(os.listdir(current_dir)):
|
||||
self.log(f" - {item}")
|
||||
except Exception as e:
|
||||
self._log_error(f"Cannot list directory contents: {e}")
|
||||
|
||||
except Exception as e:
|
||||
self._log_error(f"File system check failed: {e}")
|
||||
|
||||
def _step3_git_repo(self):
|
||||
self.log("\n===== Git Repository Information =====")
|
||||
try:
|
||||
result = subprocess.run(['git', 'rev-parse', '--show-toplevel'],
|
||||
stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
|
||||
if result.returncode == 0:
|
||||
git_root = result.stdout.strip()
|
||||
self.log(f"Git Repository Root: {git_root}")
|
||||
|
||||
result = subprocess.run(['git', 'rev-parse', 'HEAD'],
|
||||
stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
|
||||
if result.returncode == 0:
|
||||
self.log(f"Current Commit: {result.stdout.strip()}")
|
||||
else:
|
||||
self._log_warning(f"Could not get current commit: {result.stderr.strip()}")
|
||||
|
||||
result = subprocess.run(['git', 'remote', 'get-url', 'origin'],
|
||||
stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
|
||||
if result.returncode == 0:
|
||||
self.log(f"Remote Origin: {result.stdout.strip()}")
|
||||
else:
|
||||
self._log_warning("No remote 'origin' configured")
|
||||
else:
|
||||
self._log_warning("Not a git repository")
|
||||
except FileNotFoundError:
|
||||
self._log_warning("Git is not installed or not in PATH")
|
||||
except Exception as e:
|
||||
self._log_error(f"Git check failed: {e}")
|
||||
|
||||
def _step4_check_env_file(self):
|
||||
self.log("\n===== Environment File Check =====")
|
||||
try:
|
||||
result = subprocess.run(['git', 'rev-parse', '--show-toplevel'],
|
||||
stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
|
||||
if result.returncode == 0:
|
||||
git_root = result.stdout.strip()
|
||||
env_path = os.path.join(git_root, '.env')
|
||||
|
||||
if os.path.isfile(env_path):
|
||||
self.log(f".env file exists at: {env_path}")
|
||||
try:
|
||||
with open(env_path, 'r') as f:
|
||||
has_api_key = any(line.strip().startswith('OPENAI_API_KEY=') for line in f)
|
||||
if has_api_key:
|
||||
self.log("OPENAI_API_KEY found in .env file")
|
||||
else:
|
||||
self._log_warning("OPENAI_API_KEY not found in .env file")
|
||||
except Exception as e:
|
||||
self._log_error(f"Cannot read .env file: {e}")
|
||||
else:
|
||||
self._log_warning(".env file not found in project root")
|
||||
|
||||
# Check for additional .env files
|
||||
for root, _, files in os.walk(git_root):
|
||||
if '.env' in files and os.path.join(root, '.env') != env_path:
|
||||
self._log_warning(f"Additional .env file found at: {os.path.join(root, '.env')}")
|
||||
else:
|
||||
self._log_warning("Git root directory not found. Cannot perform .env file check.")
|
||||
except FileNotFoundError:
|
||||
self._log_warning("Git is not installed or not in PATH")
|
||||
except Exception as e:
|
||||
self._log_error(f"Environment file check failed: {e}")
|
||||
|
||||
def _step5_anaconda_check(self):
|
||||
self.log("\n===== Anaconda Environment Check =====")
|
||||
try:
|
||||
conda_prefix = os.environ.get('CONDA_PREFIX')
|
||||
if conda_prefix:
|
||||
self.log("Anaconda environment is active:")
|
||||
self.log(f"Environment Path: {conda_prefix}")
|
||||
self.log(f"Environment Name: {os.path.basename(conda_prefix)}")
|
||||
|
||||
conda_exe = os.environ.get('CONDA_EXE', 'conda')
|
||||
result = subprocess.run([conda_exe, '--version'],
|
||||
stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
|
||||
if result.returncode == 0:
|
||||
self.log(f"Conda Version: {result.stdout.strip()}")
|
||||
else:
|
||||
self._log_warning("Could not determine Conda version")
|
||||
|
||||
self._check_python_packages()
|
||||
else:
|
||||
self.log("No active Anaconda environment detected")
|
||||
except Exception as e:
|
||||
self._log_error(f"Anaconda environment check failed: {e}")
|
||||
|
||||
def _step6_virtualenv_check(self):
|
||||
self.log("\n===== Virtualenv Check =====")
|
||||
try:
|
||||
virtual_env = os.environ.get('VIRTUAL_ENV')
|
||||
if virtual_env:
|
||||
self.log("Virtualenv is active:")
|
||||
self.log(f"Environment Path: {virtual_env}")
|
||||
self.log(f"Environment Name: {os.path.basename(virtual_env)}")
|
||||
|
||||
self._check_python_packages()
|
||||
else:
|
||||
self.log("No active virtualenv detected")
|
||||
|
||||
if not virtual_env and not os.environ.get('CONDA_PREFIX'):
|
||||
self._log_warning("Neither virtualenv nor Anaconda environment is active")
|
||||
except Exception as e:
|
||||
self._log_error(f"Virtualenv check failed: {e}")
|
||||
|
||||
def _check_python_packages(self):
|
||||
self.log("\nPython Environment:")
|
||||
self.log(f"Python Version: {sys.version}")
|
||||
self.log(f"Python Executable: {sys.executable}")
|
||||
|
||||
required_packages = ['openai', 'python-dotenv', 'requests', 'gradio', 'transformers']
|
||||
|
||||
try:
|
||||
import pkg_resources
|
||||
installed = {pkg.key: pkg.version for pkg in pkg_resources.working_set}
|
||||
|
||||
self.log("\nRequired Package Versions:")
|
||||
for package in required_packages:
|
||||
if package in installed:
|
||||
self.log(f"{package}: {installed[package]}")
|
||||
else:
|
||||
self._log_error(f"Required package '{package}' is not installed")
|
||||
|
||||
# Check for potentially conflicting packages
|
||||
problem_pairs = [
|
||||
('openai', 'openai-python'),
|
||||
('python-dotenv', 'dotenv')
|
||||
]
|
||||
|
||||
for pkg1, pkg2 in problem_pairs:
|
||||
if pkg1 in installed and pkg2 in installed:
|
||||
self._log_warning(f"Potentially conflicting packages: {pkg1} and {pkg2}")
|
||||
except ImportError:
|
||||
self._log_error("Could not import 'pkg_resources' to check installed packages")
|
||||
except Exception as e:
|
||||
self._log_error(f"Package check failed: {e}")
|
||||
|
||||
    def _step7_network_connectivity(self):
        self.log("\n===== Network Connectivity Check =====")
        try:
            self.log(f"SSL Version: {ssl.OPENSSL_VERSION}")

            import requests
            import speedtest  # Importing the speedtest-cli library

            # Basic connectivity check
            urls = [
                'https://www.google.com',
                'https://www.cloudflare.com'
            ]

            connected = False
            for url in urls:
                try:
                    start_time = time.time()
                    response = requests.get(url, timeout=10)
                    elapsed_time = time.time() - start_time
                    response.raise_for_status()
                    self.log(f"✓ Connected to {url}")
                    self.log(f"  Response time: {elapsed_time:.2f}s")

                    if elapsed_time > 2:
                        self._log_warning(f"Slow response from {url}: {elapsed_time:.2f}s")
                    connected = True
                    break
                except requests.exceptions.RequestException as e:
                    self._log_warning(f"Failed to connect to {url}: {e}")

            if connected:
                self.log("Basic connectivity OK")
            else:
                self._log_error("Failed to connect to any test URLs")
                return

            # Bandwidth test using speedtest-cli
            self.log("\nPerforming bandwidth test using speedtest-cli...")
            try:
                st = speedtest.Speedtest()
                st.get_best_server()
                download_speed = st.download()  # Bits per second
                upload_speed = st.upload()  # Bits per second

                download_mbps = download_speed / 1e6  # Convert to Mbps
                upload_mbps = upload_speed / 1e6

                self.log(f"Download speed: {download_mbps:.2f} Mbps")
                self.log(f"Upload speed: {upload_mbps:.2f} Mbps")

                if download_mbps < 1:
                    self._log_warning("Download speed is low")
                if upload_mbps < 0.5:
                    self._log_warning("Upload speed is low")
            except speedtest.ConfigRetrievalError:
                self._log_error("Failed to retrieve speedtest configuration")
            except Exception as e:
                self._log_warning(f"Bandwidth test failed: {e}")

        except ImportError:
            self._log_error("Required packages are not installed. Please install them using 'pip install requests speedtest-cli'")
        except Exception as e:
            self._log_error(f"Network connectivity check failed: {e}")
    def _step8_environment_variables(self):
        self.log("\n===== Environment Variables Check =====")
        try:
            # Check Python paths
            pythonpath = os.environ.get('PYTHONPATH')
            if pythonpath:
                self.log("\nPYTHONPATH:")
                for path in pythonpath.split(os.pathsep):
                    self.log(f"  - {path}")
            else:
                self.log("\nPYTHONPATH is not set.")

            self.log("\nPython sys.path:")
            for path in sys.path:
                self.log(f"  - {path}")

            # Check OPENAI_API_KEY
            from dotenv import load_dotenv
            load_dotenv()
            api_key = os.environ.get('OPENAI_API_KEY')
            if api_key:
                self.log("OPENAI_API_KEY is set after calling load_dotenv()")
                if not api_key.startswith('sk-proj-') or len(api_key) < 12:
                    self._log_warning("OPENAI_API_KEY format looks incorrect after calling load_dotenv()")
            else:
                self._log_warning("OPENAI_API_KEY environment variable is not set after calling load_dotenv()")
        except Exception as e:
            self._log_error(f"Environment variables check failed: {e}")
    def _step9_additional_diagnostics(self):
        self.log("\n===== Additional Diagnostics =====")
        try:
            # Get the site-packages directory paths
            import site
            site_packages_paths = site.getsitepackages()
            if hasattr(site, 'getusersitepackages'):
                site_packages_paths.append(site.getusersitepackages())

            # Function to check if a path is within site-packages
            def is_in_site_packages(path):
                return any(os.path.commonpath([path, sp]) == sp for sp in site_packages_paths)

            # Check for potential name conflicts in the current directory and sys.path
            conflict_names = ['openai.py', 'dotenv.py']

            # Check current directory
            current_dir = os.getcwd()
            for name in conflict_names:
                conflict_path = os.path.join(current_dir, name)
                if os.path.isfile(conflict_path):
                    self._log_warning(f"Found '{name}' in the current directory, which may cause import conflicts: {conflict_path}")

            # Check sys.path directories
            for path in sys.path:
                if not path or is_in_site_packages(path):
                    continue  # Skip site-packages and empty paths
                for name in conflict_names:
                    conflict_file = os.path.join(path, name)
                    if os.path.isfile(conflict_file):
                        self._log_warning(f"Potential naming conflict: {conflict_file}")

            # Check temp directory
            try:
                with tempfile.NamedTemporaryFile() as tmp:
                    self.log(f"Temp directory is writable: {os.path.dirname(tmp.name)}")
            except Exception as e:
                self._log_error(f"Cannot write to temp directory: {e}")

        except Exception as e:
            self._log_error(f"Additional diagnostics failed: {e}")
if __name__ == "__main__":
    diagnostics = Diagnostics()
    diagnostics.run()
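The package-conflict check above relies on `pkg_resources`, which is deprecated in recent setuptools releases. The same lookup can be sketched with the standard-library `importlib.metadata` instead — this is an alternative sketch, not the course's own code:

```python
from importlib import metadata

# Build a set of installed distribution names, normalized to lowercase
installed = {(dist.metadata["Name"] or "").lower() for dist in metadata.distributions()}

# The same conflicting pairs the diagnostics check for
problem_pairs = [
    ("openai", "openai-python"),
    ("python-dotenv", "dotenv"),
]

for pkg1, pkg2 in problem_pairs:
    if pkg1 in installed and pkg2 in installed:
        print(f"Potentially conflicting packages: {pkg1} and {pkg2}")
```

`importlib.metadata` is available from Python 3.8 onward, so this avoids a hard dependency on setuptools being present.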
37
week1/scraper.py
Normal file
@@ -0,0 +1,37 @@
|
||||
from bs4 import BeautifulSoup
|
||||
import requests
|
||||
|
||||
|
||||
# Standard headers to fetch a website
|
||||
headers = {
|
||||
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
|
||||
}
|
||||
|
||||
|
||||
def fetch_website_contents(url):
|
||||
"""
|
||||
Return the title and contents of the website at the given url;
|
||||
truncate to 2,000 characters as a sensible limit
|
||||
"""
|
||||
response = requests.get(url, headers=headers)
|
||||
soup = BeautifulSoup(response.content, "html.parser")
|
||||
title = soup.title.string if soup.title else "No title found"
|
||||
if soup.body:
|
||||
for irrelevant in soup.body(["script", "style", "img", "input"]):
|
||||
irrelevant.decompose()
|
||||
text = soup.body.get_text(separator="\n", strip=True)
|
||||
else:
|
||||
text = ""
|
||||
return (title + "\n\n" + text)[:2_000]
|
||||
|
||||
|
||||
def fetch_website_links(url):
|
||||
"""
|
||||
Return the links on the webiste at the given url
|
||||
I realize this is inefficient as we're parsing twice! This is to keep the code in the lab simple.
|
||||
Feel free to use a class and optimize it!
|
||||
"""
|
||||
response = requests.get(url, headers=headers)
|
||||
soup = BeautifulSoup(response.content, "html.parser")
|
||||
links = [link.get("href") for link in soup.find_all("a")]
|
||||
return [link for link in links if link]
|
||||
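If you want to sanity-check the parsing logic without hitting the network, you can run BeautifulSoup over an inline HTML snippet. This is a hypothetical example, not part of the lab code; the tag-stripping mirrors what `fetch_website_contents` does:

```python
from bs4 import BeautifulSoup

# A tiny inline page standing in for a fetched response
html = """
<html><head><title>Demo Page</title></head>
<body>
<script>console.log('noise')</script>
<p>Hello, world.</p>
<a href="/about">About</a><a>no href</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
title = soup.title.string if soup.title else "No title found"

# Remove the same irrelevant tags the scraper strips out
for irrelevant in soup.body(["script", "style", "img", "input"]):
    irrelevant.decompose()
text = soup.body.get_text(separator="\n", strip=True)

print(title)
print(text)
```

The `<script>` content disappears from the extracted text, which is exactly why the scraper decomposes those tags before calling `get_text`.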
53
week1/solution.py
Normal file
@@ -0,0 +1,53 @@
|
||||
"""
|
||||
Website summarizer using Ollama instead of OpenAI.
|
||||
"""
|
||||
|
||||
from openai import OpenAI
|
||||
from scraper import fetch_website_contents
|
||||
|
||||
OLLAMA_BASE_URL = "http://localhost:11434/v1"
|
||||
MODEL = "llama3.2"
|
||||
|
||||
system_prompt = """
|
||||
You are a snarky assistant that analyzes the contents of a website,
|
||||
and provides a short, snarky, humorous summary, ignoring text that might be navigation related.
|
||||
Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.
|
||||
"""
|
||||
|
||||
user_prompt_prefix = """
|
||||
Here are the contents of a website.
|
||||
Provide a short summary of this website.
|
||||
If it includes news or announcements, then summarize these too.
|
||||
|
||||
"""
|
||||
|
||||
|
||||
def messages_for(website):
|
||||
"""Create message list for the LLM."""
|
||||
return [
|
||||
{"role": "system", "content": system_prompt},
|
||||
{"role": "user", "content": user_prompt_prefix + website}
|
||||
]
|
||||
|
||||
|
||||
def summarize(url):
|
||||
"""Fetch and summarize a website using Ollama."""
|
||||
ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')
|
||||
website = fetch_website_contents(url)
|
||||
response = ollama.chat.completions.create(
|
||||
model=MODEL,
|
||||
messages=messages_for(website)
|
||||
)
|
||||
return response.choices[0].message.content
|
||||
|
||||
|
||||
def main():
|
||||
"""Main entry point for testing."""
|
||||
url = input("Enter a URL to summarize: ")
|
||||
print("\nFetching and summarizing...\n")
|
||||
summary = summarize(url)
|
||||
print(summary)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
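The message-construction step in solution.py is easy to verify in isolation, with no Ollama server running. Here is a minimal sketch — the prompt strings are stand-ins, not the originals:

```python
# Stand-in prompts - the real ones live in solution.py
system_prompt = "You are a snarky assistant."
user_prompt_prefix = "Here are the contents of a website.\n"

def messages_for(website):
    """Build the chat-completions message list the same way solution.py does."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + website},
    ]

msgs = messages_for("Example Site\n\nSome page text")
print(msgs[0]["role"])                                    # system
print(msgs[1]["content"].startswith(user_prompt_prefix))  # True
```

Keeping `messages_for` as a pure function like this makes it trivial to test before wiring it up to any model endpoint.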
@@ -1,509 +0,0 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "2a793b1d-a0a9-404c-ada6-58937227cfce",
   "metadata": {},
   "source": [
    "# Oh dear!\n",
    "\n",
    "If you've got here, then you're still having problems setting up your environment. I'm so sorry! Hang in there and we should have you up and running in no time.\n",
    "\n",
    "Setting up a Data Science environment can be challenging because there's a lot going on under the hood. But we will get there.\n",
    "\n",
    "And please remember - I'm standing by to help out. Message me or email ed@edwarddonner.com and I'll get on the case. The very last cell in this notebook has some diagnostics that will help me figure out what's happening.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "98787335-346f-4ee4-9cb7-6181b0e1b964",
   "metadata": {},
   "source": [
    "# Before we begin\n",
    "\n",
    "## Checking your internet connection\n",
    "\n",
    "First let's check that there's no VPN or Firewall or Certs problem.\n",
    "\n",
    "Click in the cell below and press Shift+Return to run it. \n",
    "If this gives you problems, then please try working through these instructions to address: \n",
    "https://chatgpt.com/share/676e6e3b-db44-8012-abaa-b3cf62c83eb3\n",
    "\n",
    "I've also heard that you might have problems if you are using a work computer that's running the security software zscaler.\n",
    "\n",
    "Some advice from students in this situation with zscaler:\n",
    "\n",
    "> In the anaconda prompt, this helped sometimes, although still got failures occasionally running code in Jupyter:\n",
    "`conda config --set ssl_verify false` \n",
    "Another thing that helped was to add `verify=False` anywhere there is `requests.get(..)`, so `requests.get(url, headers=headers)` becomes `requests.get(url, headers=headers, verify=False)`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d296f9b6-8de4-44db-b5f5-9b653dfd3d81",
   "metadata": {},
   "outputs": [],
   "source": [
    "import urllib.request\n",
    "\n",
    "try:\n",
    "    response = urllib.request.urlopen(\"https://www.google.com\", timeout=10)\n",
    "    if response.status != 200:\n",
    "        print(\"Unable to reach google - there may be issues with your internet / VPN / firewall?\")\n",
    "    else:\n",
    "        print(\"Connected to the internet and can reach Google\")\n",
    "except Exception as e:\n",
    "    print(f\"Failed to connect with this error: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d91da3b2-5a41-4233-9ed6-c53a7661b328",
   "metadata": {},
   "source": [
    "## Another mention of occasional \"gotchas\" for PC people\n",
    "\n",
    "There are 4 snafus on Windows to be aware of: \n",
    "1. Permissions. Please take a look at this [tutorial](https://chatgpt.com/share/67b0ae58-d1a8-8012-82ca-74762b0408b0) on permissions on Windows\n",
    "2. Anti-virus, Firewall, VPN. These can interfere with installations and network access; try temporarily disabling them as needed\n",
    "3. The evil Windows 260 character limit to filenames - here is a full [explanation and fix](https://chatgpt.com/share/67b0afb9-1b60-8012-a9f7-f968a5a910c7)!\n",
    "4. If you've not worked with Data Science packages on your computer before, you might need to install Microsoft Build Tools. Here are [instructions](https://chatgpt.com/share/67b0b762-327c-8012-b809-b4ec3b9e7be0). A student also mentioned that [these instructions](https://github.com/bycloudai/InstallVSBuildToolsWindows) might be helpful for people on Windows 11. \n",
    "\n",
    "## And for Mac people\n",
    "\n",
    "1. If you're new to developing on your Mac, you may need to install XCode developer tools. Here are [instructions](https://chatgpt.com/share/67b0b8d7-8eec-8012-9a37-6973b9db11f5).\n",
    "2. As with PC people, Anti-virus, Firewall, VPN can be problematic. These can interfere with installations and network access; try temporarily disabling them as needed"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f5190688-205a-46d1-a0dc-9136a42ad0db",
   "metadata": {},
   "source": [
    "# Step 1\n",
    "\n",
    "Try running the next cell (click in the cell under this one and hit shift+return).\n",
    "\n",
    "If this gives an error, then you're likely not running in an \"activated\" environment. Please check back in Part 5 of the SETUP guide for [PC](../SETUP-PC.md) or [Mac](../SETUP-mac.md) for setting up the Anaconda (or virtualenv) environment and activating it, before running `jupyter lab`.\n",
    "\n",
    "If you look in the Anaconda prompt (PC) or the Terminal (Mac), you should see `(llms)` in your prompt where you launch `jupyter lab` - that's your clue that the llms environment is activated.\n",
    "\n",
    "If you are in an activated environment, the next thing to try is to restart everything:\n",
    "1. Close down all Jupyter windows, like this one\n",
    "2. Exit all command prompts / Terminals / Anaconda\n",
    "3. Repeat Part 5 from the SETUP instructions to begin a new activated environment and launch `jupyter lab` from the `llm_engineering` directory \n",
    "4. Come back to this notebook, and do Kernel menu >> Restart Kernel and Clear Outputs of All Cells\n",
    "5. Try the cell below again.\n",
    "\n",
    "If **that** doesn't work, then please contact me! I'll respond quickly, and we'll figure it out. Please run the diagnostics (last cell in this notebook) so I can debug. If you used Anaconda, it might be that for some reason your environment is corrupted, in which case the simplest fix is to use the virtualenv approach instead (Part 2B in the setup guides)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7c8c0bb3-0e94-466e-8d1a-4dfbaa014cbe",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Some quick checks that your Conda environment or VirtualEnv is as expected\n",
    "# The Environment Name should be: llms\n",
    "\n",
    "import os\n",
    "conda_name, venv_name = \"\", \"\"\n",
    "\n",
    "conda_prefix = os.environ.get('CONDA_PREFIX')\n",
    "if conda_prefix:\n",
    "    print(\"Anaconda environment is active:\")\n",
    "    print(f\"Environment Path: {conda_prefix}\")\n",
    "    conda_name = os.path.basename(conda_prefix)\n",
    "    print(f\"Environment Name: {conda_name}\")\n",
    "\n",
    "virtual_env = os.environ.get('VIRTUAL_ENV')\n",
    "if virtual_env:\n",
    "    print(\"Virtualenv is active:\")\n",
    "    print(f\"Environment Path: {virtual_env}\")\n",
    "    venv_name = os.path.basename(virtual_env)\n",
    "    print(f\"Environment Name: {venv_name}\")\n",
    "\n",
    "if conda_name != \"llms\" and venv_name != \"llms\" and venv_name != \"venv\":\n",
    "    print(\"Neither Anaconda nor Virtualenv seem to be activated with the expected name 'llms' or 'venv'\")\n",
    "    print(\"Did you run 'jupyter lab' from an activated environment with (llms) showing on the command line?\")\n",
    "    print(\"If in doubt, close down all jupyter lab, and follow Part 5 in the SETUP-PC or SETUP-mac guide.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "45e2cc99-b7d3-48bd-b27c-910206c4171a",
   "metadata": {},
   "source": [
    "# Step 1.1\n",
    "\n",
    "## It's time to check that the environment is good and dependencies are installed\n",
    "\n",
    "And now, this next cell should run with no output - no import errors. \n",
    "\n",
    "Import errors might indicate that you started jupyter lab without your environment activated - see SETUP Part 5. \n",
    "\n",
    "Or you might need to restart your Kernel and Jupyter Lab. \n",
    "\n",
    "Or it's possible that something is wrong with Anaconda. \n",
    "If so, here are some recovery instructions: \n",
    "First, close everything down and restart your computer. \n",
    "Then in an Anaconda Prompt (PC) or Terminal (Mac), from an activated environment, with **(llms)** showing in the prompt, from the llm_engineering directory, run this: \n",
    "`python -m pip install --upgrade pip` \n",
    "`pip install --retries 5 --timeout 15 --no-cache-dir --force-reinstall -r requirements.txt` \n",
    "Watch carefully for any errors, and let me know. \n",
    "If you see instructions to install Microsoft Build Tools, or Apple XCode tools, then follow the instructions. \n",
    "Then try again!\n",
    "\n",
    "Finally, if that doesn't work, please try SETUP Part 2B, the alternative to Part 2 (with Python 3.11 or Python 3.12). \n",
    "\n",
    "If you're unsure, please run the diagnostics (last cell in this notebook) and then email me at ed@edwarddonner.com"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6c78b7d9-1eea-412d-8751-3de20c0f6e2f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# This import should work if your environment is active and dependencies are installed!\n",
    "\n",
    "from openai import OpenAI"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b66a8460-7b37-4b4c-a64b-24ae45cf07eb",
   "metadata": {},
   "source": [
    "# Step 2\n",
    "\n",
    "Let's check that your .env file exists and has the OpenAI key set properly inside it. \n",
    "Please run this code and check that it prints a successful message, otherwise follow its instructions.\n",
    "\n",
    "If it isn't successful, then it's not able to find a file called `.env` in the `llm_engineering` folder. \n",
    "The name of the file must be exactly `.env` - it won't work if it's called `my-keys.env` or `.env.doc`. \n",
    "Is it possible that `.env` is actually called `.env.txt`? In Windows, you may need to change a setting in the File Explorer to ensure that file extensions are showing (\"Show file extensions\" set to \"On\"). You should also see file extensions if you type `dir` in the `llm_engineering` directory.\n",
    "\n",
    "Nasty gotchas to watch out for: \n",
    "- In the .env file, there should be no space between the equals sign and the key. Like: `OPENAI_API_KEY=sk-proj-...`\n",
    "- If you copied and pasted your API key from another application, make sure that it didn't replace hyphens in your key with long dashes \n",
    "\n",
    "Note that the `.env` file won't show up in your Jupyter Lab file browser, because Jupyter hides files that start with a dot for your security; they're considered hidden files. If you need to change the name, you'll need to use a command terminal or File Explorer (PC) / Finder Window (Mac). Ask ChatGPT if that's giving you problems, or email me!\n",
    "\n",
    "If you're having challenges creating the `.env` file, we can also do it with code! See the cell after the next one.\n",
    "\n",
    "It's important to launch `jupyter lab` from the project root directory, `llm_engineering`. If you didn't do that, this cell might give you problems."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "caa4837e-b970-4f89-aa9a-8aa793c754fd",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "parent_dir = Path(\"..\")\n",
    "env_path = parent_dir / \".env\"\n",
    "\n",
    "if env_path.exists() and env_path.is_file():\n",
    "    print(\".env file found.\")\n",
    "\n",
    "    # Read the contents of the .env file\n",
    "    with env_path.open(\"r\") as env_file:\n",
    "        contents = env_file.readlines()\n",
    "\n",
    "    key_exists = any(line.startswith(\"OPENAI_API_KEY=\") for line in contents)\n",
    "    good_key = any(line.startswith(\"OPENAI_API_KEY=sk-proj-\") for line in contents)\n",
    "    classic_problem = any(\"OPEN_\" in line for line in contents)\n",
    "\n",
    "    if key_exists and good_key:\n",
    "        print(\"SUCCESS! OPENAI_API_KEY found and it has the right prefix\")\n",
    "    elif key_exists:\n",
    "        print(\"Found an OPENAI_API_KEY although it didn't have the expected prefix sk-proj- \\nPlease double check your key in the file.\")\n",
    "    elif classic_problem:\n",
    "        print(\"Didn't find an OPENAI_API_KEY, but I notice that 'OPEN_' appears - do you have a typo like OPEN_API_KEY instead of OPENAI_API_KEY?\")\n",
    "    else:\n",
    "        print(\"Didn't find an OPENAI_API_KEY in the .env file\")\n",
    "else:\n",
    "    print(\".env file not found in the llm_engineering directory. It needs to have exactly the name: .env\")\n",
    "\n",
    "    possible_misnamed_files = list(parent_dir.glob(\"*.env*\"))\n",
    "\n",
    "    if possible_misnamed_files:\n",
    "        print(\"\\nWarning: No '.env' file found, but the following files were found in the llm_engineering directory that contain '.env' in the name. Perhaps this needs to be renamed?\")\n",
    "        for file in possible_misnamed_files:\n",
    "            print(file.name)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "105f9e0a-9ff4-4344-87c8-e3e41bc50869",
   "metadata": {},
   "source": [
    "## Fallback plan - python code to create the .env file for you\n",
    "\n",
    "Only run the next cell if you're having problems making the .env file. \n",
    "Replace the text in the first line of code with your key from OpenAI."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab9ea6ef-49ee-4899-a1c7-75a8bd9ac36b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Only run this code in this cell if you want to have a .env file created for you!\n",
    "\n",
    "# Put your key inside the quote marks\n",
    "make_me_a_file_with_this_key = \"put your key here inside these quotes.. it should start sk-proj-\"\n",
    "\n",
    "# Change this to True if you already have a .env file and you want me to replace it\n",
    "overwrite_if_already_exists = False\n",
    "\n",
    "from pathlib import Path\n",
    "\n",
    "parent_dir = Path(\"..\")\n",
    "env_path = parent_dir / \".env\"\n",
    "\n",
    "if env_path.exists() and not overwrite_if_already_exists:\n",
    "    print(\"There is already a .env file - if you want me to create a new one, change the variable overwrite_if_already_exists to True above\")\n",
    "else:\n",
    "    try:\n",
    "        with env_path.open(mode='w', encoding='utf-8') as env_file:\n",
    "            env_file.write(f\"OPENAI_API_KEY={make_me_a_file_with_this_key}\")\n",
    "        print(f\"Successfully created the .env file at {env_path}\")\n",
    "        if not make_me_a_file_with_this_key.startswith(\"sk-proj-\"):\n",
    "            print(f\"The key that you provided started with '{make_me_a_file_with_this_key[:8]}' which is different from sk-proj- - is that what you intended?\")\n",
    "        print(\"Now rerun the previous cell to confirm that the file is created and the key is correct.\")\n",
    "    except Exception as e:\n",
    "        print(f\"An error occurred while creating the .env file: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ba9420d-3bf0-4e08-abac-f2fbf0e9c7f1",
   "metadata": {},
   "source": [
    "# Step 3\n",
    "\n",
    "Now let's check that your API key is correctly set up in your `.env` file, and available using the dotenv package.\n",
    "Try running the next cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0ee8e613-5a6e-4d1f-96ef-91132da545c8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# This should print your API key to the output - please follow the instructions that get printed\n",
    "\n",
    "import os\n",
    "from dotenv import load_dotenv\n",
    "load_dotenv(override=True)\n",
    "\n",
    "api_key = os.getenv(\"OPENAI_API_KEY\")\n",
    "\n",
    "if not api_key:\n",
    "    print(\"No API key was found - please try Kernel menu >> Restart Kernel And Clear Outputs of All Cells\")\n",
    "elif not api_key.startswith(\"sk-proj-\"):\n",
    "    print(f\"An API key was found, but it starts with {api_key[:8]} rather than sk-proj-; please double check this is as expected.\")\n",
    "elif api_key.strip() != api_key:\n",
    "    print(\"An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them\")\n",
    "else:\n",
    "    print(\"API key found and looks good so far!\")\n",
    "\n",
    "if api_key:\n",
    "    problematic_unicode_chars = ['\\u2013', '\\u2014', '\\u201c', '\\u201d', '\\u2026', '\\u2018', '\\u2019']\n",
    "    forbidden_chars = [\"'\", \" \", \"\\n\", \"\\r\", '\"']\n",
    "\n",
    "    if not all(32 <= ord(char) <= 126 for char in api_key):\n",
    "        print(\"Potential problem: there might be unprintable characters accidentally included in the key?\")\n",
    "    elif any(char in api_key for char in problematic_unicode_chars):\n",
    "        print(\"Potential problem: there might be special characters, like long hyphens or curly quotes in the key - did you copy it via a word processor?\")\n",
    "    elif any(char in api_key for char in forbidden_chars):\n",
    "        print(\"Potential problem: there are quote marks, spaces or empty lines in your key?\")\n",
    "    else:\n",
    "        print(\"The API key contains valid characters\")\n",
    "\n",
    "print(f\"\\nHere is the key --> {api_key} <--\")\n",
    "print()\n",
    "print(\"If this key looks good, please go to the Edit menu >> Clear Cell Output so that your key is no longer displayed here!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f403e515-0e7d-4be4-bb79-5a102dbd6c94",
   "metadata": {},
   "source": [
    "## It should print some checks including something like:\n",
    "\n",
    "`Here is the key --> sk-proj-blahblahblah <--`\n",
    "\n",
    "If it didn't print a key, then hopefully it's given you enough information to figure this out. Or contact me!\n",
    "\n",
    "There is a final fallback approach if you wish: you can avoid using .env files altogether, and simply always provide your API key manually. \n",
    "Whenever you see this in the code: \n",
    "`openai = OpenAI()` \n",
    "You can replace it with: \n",
    "`openai = OpenAI(api_key=\"sk-proj-xxx\")`\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "42afad1f-b0bf-4882-b469-7709060fee3a",
   "metadata": {},
   "source": [
    "# Step 4\n",
    "\n",
    "Now run the code below and you will hopefully see that GPT can handle basic arithmetic!!\n",
    "\n",
    "If not, see the cell below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cccb58e7-6626-4033-9dc1-e7e3ff742f6b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from openai import OpenAI\n",
    "from dotenv import load_dotenv\n",
    "load_dotenv(override=True)\n",
    "\n",
    "my_api_key = os.getenv(\"OPENAI_API_KEY\")\n",
    "\n",
    "print(f\"Using API key --> {my_api_key} <--\")\n",
    "\n",
    "openai = OpenAI()\n",
    "completion = openai.chat.completions.create(\n",
    "    model='gpt-4o-mini',\n",
    "    messages=[{\"role\":\"user\", \"content\": \"What's 2+2?\"}],\n",
    ")\n",
    "print(completion.choices[0].message.content)\n",
    "print(\"Now go to Edit menu >> Clear Cell Output to remove the display of your key.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81046a77-c359-4388-929f-ffc8ad5cb93c",
   "metadata": {
    "jp-MarkdownHeadingCollapsed": true
   },
   "source": [
    "## If the key was set correctly, and this still didn't work\n",
    "\n",
    "### If there's an error from OpenAI about your key, or a Rate Limit Error, then there's something up with your API key!\n",
    "\n",
    "First check [this webpage](https://platform.openai.com/settings/organization/billing/overview) to make sure you have a positive credit balance.\n",
    "OpenAI requires that you have a positive credit balance and it has minimums, typically around $5 in local currency. My sales pitch for OpenAI is that this is well worth it for your education: for less than the price of a music album, you will build so much valuable commercial experience. But it's not required for this course at all; the README has instructions to call free open-source models via Ollama whenever we use OpenAI.\n",
    "\n",
    "OpenAI billing page with credit balance is here: \n",
    "https://platform.openai.com/settings/organization/billing/overview \n",
    "OpenAI can take a few minutes to enable your key after you top up your balance. \n",
    "A student outside the US mentioned that he needed to allow international payments on his credit card for this to work. \n",
    "\n",
    "It's unlikely, but if there's something wrong with your key, you could also try creating a new key (button on the top right) here: \n",
    "https://platform.openai.com/api-keys\n",
    "\n",
    "### Check that you can use gpt-4o-mini from the OpenAI playground\n",
    "\n",
    "To confirm that billing is set up and your key is good, you could try using gpt-4o-mini directly: \n",
    "https://platform.openai.com/playground/chat?models=gpt-4o-mini\n",
    "\n",
    "### If there's a cert related error\n",
    "\n",
    "If you encountered a certificates error like: \n",
    "`ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)` \n",
    "Then please replace:\n",
    "`openai = OpenAI()` \n",
    "with: \n",
    "`import httpx` \n",
    "`openai = OpenAI(http_client=httpx.Client(verify=False))` \n",
    "And also please replace: \n",
    "`requests.get(url, headers=headers)` \n",
    "with: \n",
    "`requests.get(url, headers=headers, verify=False)` \n",
    "And if that works, you're in good shape. You'll just have to change the labs in the same way any time you hit this cert error. \n",
    "This approach isn't OK for production code, but it's fine for our experiments. You may need to contact IT support to understand whether there are restrictions in your environment.\n",
    "\n",
    "## If all else fails:\n",
    "\n",
    "(1) Try pasting your error into ChatGPT or Claude! It's amazing how often they can figure things out\n",
    "\n",
    "(2) Try creating another key and replacing it in the .env file and rerunning!\n",
    "\n",
    "(3) Contact me! Please run the diagnostics in the cell below, then email your problems to ed@edwarddonner.com\n",
    "\n",
    "Thanks so much, and I'm sorry this is giving you bother!"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dc83f944-6ce0-4b5c-817f-952676e284ec",
   "metadata": {},
   "source": [
    "# Gathering Essential Diagnostic information\n",
    "\n",
    "## Please run this next cell to gather some important data\n",
    "\n",
    "Please run the next cell; it should take a minute or so to run. Most of the time is checking your network bandwidth.\n",
    "Then email me the output of the last cell to ed@edwarddonner.com. \n",
    "Alternatively: this will create a file called report.txt - just attach the file to your email."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "248204f0-7bad-482a-b715-fb06a3553916",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run my diagnostics report to collect key information for debugging\n",
    "# Please email me the results. Either copy & paste the output, or attach the file report.txt\n",
    "\n",
    "!pip install -q requests speedtest-cli psutil setuptools\n",
    "from diagnostics import Diagnostics\n",
    "Diagnostics().run()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e1955b9a-d344-4782-b448-2770d0edd90c",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}