{
"cells": [
{
"cell_type": "markdown",
"id": "fe12c203-e6a6-452c-a655-afb8a03a4ff5",
"metadata": {},
"source": [
"# End of week 1 exercise\n",
"\n",
"To demonstrate your familiarity with the OpenAI API and with Ollama, build a tool that takes a technical question\n",
"and responds with an explanation. This is a tool that you will be able to use yourself during the course!"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "c1070317-3ed9-4659-abe3-828943230e03",
"metadata": {},
"outputs": [],
"source": [
"# imports (only what this notebook actually uses)\n",
"import os\n",
"from dotenv import load_dotenv\n",
"from openai import OpenAI\n",
"import ollama\n",
"from IPython.display import Markdown, clear_output, display"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "4a456906-915a-4bfd-bb9d-57e505c5093f",
"metadata": {},
"outputs": [],
"source": [
"# constants\n",
"\n",
"MODEL_GPT = 'gpt-4o-mini'\n",
"MODEL_LLAMA = 'llama3.2'"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "a8d7923c-5f28-4c30-8556-342d7c8497c1",
"metadata": {},
"outputs": [],
"source": [
"# set up environment\n",
"load_dotenv(override=True)\n",
"api_key = os.getenv(\"OPENAI_API_KEY\")\n",
"if not api_key:\n",
"    print(\"OPENAI_API_KEY is not set; check your .env file\")\n",
"\n",
"# OpenAI() reads OPENAI_API_KEY from the environment\n",
"openai = OpenAI()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "3f0d0137-52b0-47a8-81a8-11a90a010798",
"metadata": {},
"outputs": [],
"source": [
"# here is the question; type over this to ask something new\n",
"\n",
"question = \"\"\"\n",
"Please explain what this code does and why:\n",
"yield from {book.get(\"author\") for book in books if book.get(\"author\")}\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "d9630ca0-fa23-4f80-8c52-4c51b0f25534",
"metadata": {},
"outputs": [],
"source": [
"# the system prompt sets the assistant's role; the user message carries the question\n",
"messages = [\n",
"    {\n",
"        \"role\": \"system\",\n",
"        \"content\": '''You are a technical adviser. The student is learning LLM engineering\n",
"        and will ask you to explain a few lines of code, mostly in Python.\n",
"        Explain what the code does and why, and include a short example.'''\n",
"    },\n",
"    {\n",
"        \"role\": \"user\",\n",
"        \"content\": question\n",
"    }\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "60ce7000-a4a5-4cce-a261-e75ef45063b4",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"This line of code uses a generator in Python to yield values from a set comprehension. Let’s break it down:\n",
"\n",
"1. **`{book.get(\"author\") for book in books if book.get(\"author\")}`**:\n",
"   - This is a set comprehension that creates a set of unique authors from a collection called `books`.\n",
"   - `books` is expected to be a list (or any iterable) where each item (called `book`) is likely a dictionary.\n",
"   - The expression `book.get(\"author\")` attempts to retrieve the value associated with the key `\"author\"` from each `book` dictionary.\n",
"   - The `if book.get(\"author\")` condition filters out any books where the `author` key does not exist or is `None`, ensuring only valid author names are included in the set.\n",
"   - Since it’s a set comprehension, any duplicate authors will be automatically removed, resulting in a set of unique authors.\n",
"\n",
"2. **`yield from`**:\n",
"   - The `yield from` syntax is used within a generator function to yield all values from another iterable. In this case, it is yielding each item from the set created by the comprehension.\n",
"   - This means that when this generator function is called, it will produce each unique author found in the `books` iterable one at a time.\n",
"\n",
"### Summary\n",
"The line of code effectively constructs a generator that will yield unique authors from a list of book dictionaries, where each dictionary is expected to contain an `\"author\"` key. The use of `yield from` allows the generator to yield each author in the set without further iteration code. This approach is efficient and neatly combines filtering, uniqueness, and yielding into a single line of code."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# stream the GPT response and re-render the accumulated Markdown as chunks arrive\n",
"stream = openai.chat.completions.create(\n",
"    model=MODEL_GPT,\n",
"    messages=messages,\n",
"    stream=True)\n",
"response = \"\"\n",
"for chunk in stream:\n",
"    # some chunks carry no text (for example the final one), so skip empty deltas\n",
"    delta = chunk.choices[0].delta.content if chunk.choices else None\n",
"    if delta:\n",
"        response += delta\n",
"        clear_output(wait=True)\n",
"        display(Markdown(response))"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "4d482c69-b61a-4a94-84df-73f1d97a4419",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Let's break down this line of code:\n",
"\n",
"**Code Analysis**\n",
"\n",
"```python\n",
"yield from {book.get(\"author\") for book in books if book.get(\"author\")}\n",
"```\n",
"\n",
"**Explanation**\n",
"\n",
"This is a Python generator expression that uses the `yield from` syntax.\n",
"\n",
"Here's what it does:\n",
"\n",
"1. **List Comprehension**: `{...}` is a list comprehension, which generates a new list containing the results of an expression applied to each item in the input iterable (`books`).\n",
"2. **Filtering**: The condition `if book.get(\"author\")` filters out any items from the `books` list where `\"author\"` is not present as a key-value pair.\n",
"3. **Dictionary Lookup**: `.get(\"author\")` looks up the value associated with the key `\"author\"` in each dictionary (`book`) and returns it if found, or `None` otherwise.\n",
"\n",
"**What does `yield from` do?**\n",
"\n",
"The `yield from` keyword is used to \"forward\" the iteration of another generator (or iterable) into this one. In other words, instead of creating a new list containing all the values generated by the inner iterator (`{book.get(\"author\") for book in books if book.get(\"author\")}`), it yields each value **one at a time**, as if you were iterating over the original `books` list.\n",
"\n",
"**Why is this useful?**\n",
"\n",
"By using `yield from`, we can create a generator that:\n",
"\n",
"* Only generates values when they are actually needed (i.e., only when an iteration is requested).\n",
"* Does not consume extra memory for creating an intermediate list.\n",
"\n",
"This makes it more memory-efficient, especially when dealing with large datasets or infinite iterations.\n",
"\n",
"**Example**\n",
"\n",
"Suppose we have a list of books with authors:\n",
"```python\n",
"books = [\n",
"    {\"title\": \"Book 1\", \"author\": \"Author A\"},\n",
"    {\"title\": \"Book 2\", \"author\": None},\n",
"    {\"title\": \"Book 3\", \"author\": \"Author C\"}\n",
"]\n",
"```\n",
"If we apply the generator expression to this list, it would yield:\n",
"```python\n",
"yield from {book.get(\"author\") for book in books if book.get(\"author\")}\n",
"```\n",
"The output would be: `['Author A', 'Author C']`\n",
"\n",
"Note that the second book (\"Book 2\") is skipped because its author is `None`."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# stream the same conversation through the local Ollama model\n",
"text = \"\"\n",
"for chunk in ollama.chat(\n",
"    model=MODEL_LLAMA,\n",
"    messages=messages,\n",
"    stream=True):\n",
"    text += chunk.message.content\n",
"    clear_output(wait=True)\n",
"    display(Markdown(text))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ef1194fc-3c9c-432c-86cc-f77f33916188",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}