{ "cells": [ { "cell_type": "markdown", "id": "fe12c203-e6a6-452c-a655-afb8a03a4ff5", "metadata": {}, "source": [ "# End of week 1 exercise\n", "\n", "To demonstrate your familiarity with the OpenAI API, and also Ollama, build a tool that takes a technical question\n", "and responds with an explanation. This is a tool that you will be able to use yourself during the course!" ] }, { "cell_type": "code", "execution_count": 13, "id": "c1070317-3ed9-4659-abe3-828943230e03", "metadata": {}, "outputs": [], "source": [ "import os\n", "import requests\n", "from dotenv import load_dotenv\n", "from bs4 import BeautifulSoup\n", "from openai import OpenAI\n", "import ollama\n", "from IPython.display import Markdown, clear_output, display" ] }, { "cell_type": "code", "execution_count": 7, "id": "4a456906-915a-4bfd-bb9d-57e505c5093f", "metadata": {}, "outputs": [], "source": [ "# constants\n", "\n", "MODEL_GPT = 'gpt-4o-mini'\n", "MODEL_LLAMA = 'llama3.2'" ] }, { "cell_type": "code", "execution_count": 8, "id": "a8d7923c-5f28-4c30-8556-342d7c8497c1", "metadata": {}, "outputs": [], "source": [ "# set up environment: read the key from .env and pass it to the client explicitly\n", "load_dotenv(override=True)\n", "api_key = os.getenv(\"OPENAI_API_KEY\")\n", "openai = OpenAI(api_key=api_key)" ] }, { "cell_type": "code", "execution_count": 9, "id": "3f0d0137-52b0-47a8-81a8-11a90a010798", "metadata": {}, "outputs": [], "source": [ "# here is the question; type over this to ask something new\n", "\n", "question = \"\"\"\n", "Please explain what this code does and why:\n", "yield from {book.get(\"author\") for book in books if book.get(\"author\")}\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 51, "id": "d9630ca0-fa23-4f80-8c52-4c51b0f25534", "metadata": {}, "outputs": [], "source": [ "messages = [\n", "    {\n", "        \"role\": \"system\",\n", "        \"content\": '''You are a technical adviser. The student is learning LLM engineering \n", " and will ask you to explain a few lines of code with an example. 
\n", " The code is mostly in Python.'''\n", "    },\n", "    {\n", "        \"role\": \"user\",\n", "        \"content\": question\n", "    }\n", "]" ] }, { "cell_type": "code", "execution_count": 37, "id": "60ce7000-a4a5-4cce-a261-e75ef45063b4", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "This line of code uses a generator in Python to yield values from a set comprehension. Let’s break it down:\n", "\n", "1. **`{book.get(\"author\") for book in books if book.get(\"author\")}`**:\n", " - This is a set comprehension that creates a set of unique authors from a collection called `books`.\n", " - `books` is expected to be a list (or any iterable) where each item (called `book`) is likely a dictionary.\n", " - The expression `book.get(\"author\")` attempts to retrieve the value associated with the key `\"author\"` from each `book` dictionary.\n", " - The `if book.get(\"author\")` condition filters out any books where the `author` key does not exist or is `None`, ensuring only valid author names are included in the set.\n", " - Since it’s a set comprehension, any duplicate authors will be automatically removed, resulting in a set of unique authors.\n", "\n", "2. **`yield from`**:\n", " - The `yield from` syntax is used within a generator function to yield all values from another iterable. In this case, it is yielding each item from the set created by the comprehension.\n", " - This means that when this generator function is called, it will produce each unique author found in the `books` iterable one at a time.\n", "\n", "### Summary\n", "The line of code effectively constructs a generator that will yield unique authors from a list of book dictionaries, where each dictionary is expected to contain an `\"author\"` key. The use of `yield from` allows the generator to yield each author in the set without further iteration code. This approach is efficient and neatly combines filtering, uniqueness, and yielding into a single line of code." 
], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# stream the GPT response and re-render the accumulated Markdown on each chunk\n", "stream = openai.chat.completions.create(\n", "    model=MODEL_GPT,\n", "    messages=messages,\n", "    stream=True)\n", "response = \"\"\n", "for chunk in stream:\n", "    # some chunks carry no choices or an empty delta; skip those\n", "    if chunk.choices and chunk.choices[0].delta.content:\n", "        response += chunk.choices[0].delta.content\n", "        clear_output(wait=True)\n", "        display(Markdown(response))" ] }, { "cell_type": "code", "execution_count": 52, "id": "4d482c69-b61a-4a94-84df-73f1d97a4419", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "Let's break down this line of code:\n", "\n", "**Code Analysis**\n", "\n", "```python\n", "yield from {book.get(\"author\") for book in books if book.get(\"author\")}\n", "```\n", "\n", "**Explanation**\n", "\n", "This is a Python generator expression that uses the `yield from` syntax.\n", "\n", "Here's what it does:\n", "\n", "1. **List Comprehension**: `{...}` is a list comprehension, which generates a new list containing the results of an expression applied to each item in the input iterable (`books`).\n", "2. **Filtering**: The condition `if book.get(\"author\")` filters out any items from the `books` list where `\"author\"` is not present as a key-value pair.\n", "3. **Dictionary Lookup**: `.get(\"author\")` looks up the value associated with the key `\"author\"` in each dictionary (`book`) and returns it if found, or `None` otherwise.\n", "\n", "**What does `yield from` do?**\n", "\n", "The `yield from` keyword is used to \"forward\" the iteration of another generator (or iterable) into this one. 
In other words, instead of creating a new list containing all the values generated by the inner iterator (`{book.get(\"author\") for book in books if book.get(\"author\")}`), it yields each value **one at a time**, as if you were iterating over the original `books` list.\n", "\n", "**Why is this useful?**\n", "\n", "By using `yield from`, we can create a generator that:\n", "\n", "* Only generates values when they are actually needed (i.e., only when an iteration is requested).\n", "* Does not consume extra memory for creating an intermediate list.\n", "\n", "This makes it more memory-efficient, especially when dealing with large datasets or infinite iterations.\n", "\n", "**Example**\n", "\n", "Suppose we have a list of books with authors:\n", "```python\n", "books = [\n", " {\"title\": \"Book 1\", \"author\": \"Author A\"},\n", " {\"title\": \"Book 2\", \"author\": None},\n", " {\"title\": \"Book 3\", \"author\": \"Author C\"}\n", "]\n", "```\n", "If we apply the generator expression to this list, it would yield:\n", "```python\n", "yield from {book.get(\"author\") for book in books if book.get(\"author\")}\n", "```\n", "The output would be: `['Author A', 'Author C']`\n", "\n", "Note that the second book (\"Book 2\") is skipped because its author is `None`." 
], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# stream the Llama response and re-render the accumulated Markdown on each chunk\n", "text = \"\"\n", "for chunk in ollama.chat(\n", "        model=MODEL_LLAMA,\n", "        messages=messages,\n", "        stream=True):\n", "    text += chunk.message.content\n", "    clear_output(wait=True)\n", "    display(Markdown(text))" ] }, { "cell_type": "code", "execution_count": null, "id": "ef1194fc-3c9c-432c-86cc-f77f33916188", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.13" } }, "nbformat": 4, "nbformat_minor": 5 }