{ "cells": [ { "cell_type": "markdown", "id": "1bf0f654", "metadata": {}, "source": [ "# Custom Price Estimator\n", "\n", "This notebook mirrors the week 6 day 5 fine-tuning workflow and pushes it a little further with the goal of beating the $76 average error target on the shared product dataset." ] }, { "cell_type": "markdown", "id": "4b4a89e6", "metadata": {}, "source": [ "## Plan\n", "- Load the curated `Item` objects that we prepared earlier in week 6.\n", "- Create train/validation splits sized for a stronger fine-tune than the baseline.\n", "- Package the conversations in JSONL format and launch an OpenAI fine-tuning job.\n", "- Retrieve the tuned model, score it with the shared tester, and aim for < $76 average error." ] }, { "cell_type": "markdown", "id": "8dc5b7b0", "metadata": {}, "source": [ "## Environment Setup\n", "Pull in the packages, load API keys from `.env`, and make sure we can talk to both the OpenAI and Hugging Face services used elsewhere in the course." ] }, { "cell_type": "code", "execution_count": null, "id": "f6332b2b", "metadata": {}, "outputs": [], "source": [ "import os\n", "import json\n", "import pickle\n", "import random\n", "import re\n", "from pathlib import Path\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from dotenv import load_dotenv\n", "from huggingface_hub import login\n", "\n", "from items import Item\n", "from testing import Tester\n", "from openai import OpenAI\n" ] }, { "cell_type": "code", "execution_count": null, "id": "14eb4e29", "metadata": {}, "outputs": [], "source": [ "# Load secrets from the .env file so the OpenAI client picks them up.\n", "load_dotenv(override=True)\n", "os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY', 'set-your-openai-key')\n", "os.environ['HF_TOKEN'] = os.getenv('HF_TOKEN', 'set-your-hf-token')\n" ] }, { "cell_type": "code", "execution_count": null, "id": "b07a6cab", "metadata": {}, "outputs": [], "source": [ "# Log in to Hugging Face once per session 
(needed for the tokenizer used in Item).\n", "hf_token = os.environ['HF_TOKEN']\n", "if hf_token and hf_token != 'set-your-hf-token':\n", "    login(hf_token, add_to_git_credential=True)\n", "else:\n", "    print('⚠️ Provide a valid HF_TOKEN in your .env if you need to download tokenizer weights.')\n" ] }, { "cell_type": "code", "execution_count": null, "id": "113d520b", "metadata": {}, "outputs": [], "source": [ "openai = OpenAI()\n", "%matplotlib inline\n" ] }, { "cell_type": "markdown", "id": "04ae4263", "metadata": {}, "source": [ "## Load the Week 6 Dataset\n", "We reuse the curated pickled `Item` objects. If the pickle files are missing, circle back to the earlier data curation notebook to regenerate them." ] }, { "cell_type": "code", "execution_count": null, "id": "6ca7ca03", "metadata": {}, "outputs": [], "source": [ "# Let's avoid curating all our data again! Load in the pickle files:\n", "with open('train_lite.pkl', 'rb') as file:\n", "    train = pickle.load(file)\n", "\n", "with open('test_lite.pkl', 'rb') as file:\n", "    test = pickle.load(file)\n", "\n", "len(train), len(test)\n" ] }, { "cell_type": "markdown", "id": "35e6dde7", "metadata": {}, "source": [ "We will widen the training split beyond the day 5 baseline to squeeze out better accuracy."
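] }, { "cell_type": "markdown", "id": "3f9a1b20", "metadata": {}, "source": [ "Before slicing the splits, a quick sanity check on the price distribution can catch curation problems early. This is an optional sketch that assumes each `Item` exposes a numeric `price` attribute, as the training-message code later in this notebook does." ] }, { "cell_type": "code", "execution_count": null, "id": "3f9a1b21", "metadata": {}, "outputs": [], "source": [ "# Optional sanity check: summarize and plot the training price distribution.\n", "# Assumes each Item has a numeric `price` attribute (used later in this notebook).\n", "prices = [item.price for item in train]\n", "print(f'{len(prices)} items | min ${min(prices):.2f} | mean ${np.mean(prices):.2f} | max ${max(prices):.2f}')\n", "plt.hist(prices, bins=40)\n", "plt.xlabel('Price ($)')\n", "plt.ylabel('Count')\n", "plt.show()\n" ] }, { "cell_type": "markdown", "id": "3f9a1b22", "metadata": {}, "source": [ "If the distribution looks badly skewed, consider stratifying the sample rather than shuffling uniformly."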
] }, { "cell_type": "code", "execution_count": null, "id": "0ea1ba91", "metadata": {}, "outputs": [], "source": [ "TRAIN_SIZE = 400\n", "VAL_SIZE = 100\n", "RANDOM_SEED = 42\n", "\n", "rng = random.Random(RANDOM_SEED)\n", "shuffled = train[:]\n", "rng.shuffle(shuffled)\n", "fine_tune_train = shuffled[:TRAIN_SIZE]\n", "fine_tune_validation = shuffled[TRAIN_SIZE:TRAIN_SIZE+VAL_SIZE]\n", "\n", "len(fine_tune_train), len(fine_tune_validation)\n" ] }, { "cell_type": "markdown", "id": "4a1c67fa", "metadata": {}, "source": [ "## Step 1 — Build Training Conversations\n", "Frontier models handled the unaltered prompt, but for the fine-tune we keep the instruction tight and have the assistant reply with nothing beyond the price." ] }, { "cell_type": "code", "execution_count": null, "id": "436b78b5", "metadata": {}, "outputs": [], "source": [ "SYSTEM_MESSAGE = 'You are an ecommerce pricing assistant. Respond with the price only, no text before or after.'\n", "ASSISTANT_PREFIX = 'Price is $'\n", "\n", "def clean_user_prompt(item):\n", "    prompt = item.test_prompt().replace(' to the nearest dollar', '')\n", "    return prompt.replace(ASSISTANT_PREFIX, '')\n", "\n", "def messages_for_training(item):\n", "    return [\n", "        {\"role\": \"system\", \"content\": SYSTEM_MESSAGE},\n", "        {\"role\": \"user\", \"content\": clean_user_prompt(item)},\n", "        {\"role\": \"assistant\", \"content\": f'{ASSISTANT_PREFIX}{item.price:.2f}'}\n", "    ]\n", "\n", "def messages_for_inference(item):\n", "    return [\n", "        {\"role\": \"system\", \"content\": SYSTEM_MESSAGE},\n", "        {\"role\": \"user\", \"content\": clean_user_prompt(item)},\n", "        {\"role\": \"assistant\", \"content\": ASSISTANT_PREFIX}\n", "    ]\n", "\n", "messages_for_training(fine_tune_train[0])\n" ] }, { "cell_type": "code", "execution_count": null, "id": "ecf456c2", "metadata": {}, "outputs": [], "source": [ "def make_jsonl(items):\n", "    lines = []\n", "    for item in items:\n", "        lines.append(json.dumps({\"messages\": 
messages_for_training(item)}))\n", " return '\\n'.join(lines)\n", "\n", "def write_jsonl(items, filename):\n", " payload = make_jsonl(items)\n", " with open(filename, 'w') as f:\n", " f.write(payload)\n", "\n", "write_jsonl(fine_tune_train, 'fine_tune_train.jsonl')\n", "write_jsonl(fine_tune_validation, 'fine_tune_validation.jsonl')\n", "\n", "Path('fine_tune_train.jsonl').stat().st_size, Path('fine_tune_validation.jsonl').stat().st_size\n" ] }, { "cell_type": "markdown", "id": "7dfde306", "metadata": {}, "source": [ "Upload the datasets so the fine-tuning job can consume them." ] }, { "cell_type": "code", "execution_count": null, "id": "2c522928", "metadata": {}, "outputs": [], "source": [ "with open('fine_tune_train.jsonl', 'rb') as file:\n", " train_file = openai.files.create(file=file, purpose='fine-tune')\n", "train_file\n" ] }, { "cell_type": "code", "execution_count": null, "id": "d3660112", "metadata": {}, "outputs": [], "source": [ "with open('fine_tune_validation.jsonl', 'rb') as file:\n", " validation_file = openai.files.create(file=file, purpose='fine-tune')\n", "validation_file\n" ] }, { "cell_type": "markdown", "id": "9eaf47e1", "metadata": {}, "source": [ "## Step 2 — Launch the Fine-Tune\n", "Weights & Biases logging is optional but handy for tracking metrics over time." 
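] }, { "cell_type": "markdown", "id": "5c2d77a0", "metadata": {}, "source": [ "Before launching the job, it is cheap to re-read the JSONL files written above and confirm that every line parses and carries a full three-message conversation. This optional check relies only on the files produced earlier in this notebook." ] }, { "cell_type": "code", "execution_count": null, "id": "5c2d77a1", "metadata": {}, "outputs": [], "source": [ "# Optional check: every JSONL line should parse and contain a 3-message conversation.\n", "for filename in ['fine_tune_train.jsonl', 'fine_tune_validation.jsonl']:\n", "    with open(filename) as f:\n", "        rows = [json.loads(line) for line in f if line.strip()]\n", "    assert all(len(row['messages']) == 3 for row in rows), f'malformed row in {filename}'\n", "    print(f'{filename}: {len(rows)} conversations look well-formed')\n" ] }, { "cell_type": "markdown", "id": "5c2d77a2", "metadata": {}, "source": [ "Catching a malformed file here is much faster than waiting for the fine-tuning service to reject it."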
] }, { "cell_type": "code", "execution_count": null, "id": "d758ba4b", "metadata": {}, "outputs": [], "source": [ "wandb_integration = {\"type\": \"wandb\", \"wandb\": {\"project\": \"gpt-pricer\"}}\n", "train_file.id, validation_file.id\n" ] }, { "cell_type": "code", "execution_count": null, "id": "b7152b9b", "metadata": {}, "outputs": [], "source": [ "# Pass the W&B integration to the job; drop the `integrations` line if you have\n", "# not connected Weights & Biases in the OpenAI dashboard.\n", "fine_tune_job = openai.fine_tuning.jobs.create(\n", "    training_file=train_file.id,\n", "    validation_file=validation_file.id,\n", "    model='gpt-4o-mini-2024-07-18',\n", "    seed=RANDOM_SEED,\n", "    hyperparameters={\"n_epochs\": 2, \"learning_rate_multiplier\": 1.5},\n", "    integrations=[wandb_integration],\n", "    suffix='emmy-pricer'\n", ")\n", "fine_tune_job\n" ] }, { "cell_type": "code", "execution_count": null, "id": "cd047075", "metadata": {}, "outputs": [], "source": [ "job_id = fine_tune_job.id\n", "job_id\n" ] }, { "cell_type": "code", "execution_count": null, "id": "cd830d14", "metadata": {}, "outputs": [], "source": [ "openai.fine_tuning.jobs.retrieve(job_id)\n" ] }, { "cell_type": "code", "execution_count": null, "id": "d2b25992", "metadata": {}, "outputs": [], "source": [ "openai.fine_tuning.jobs.list_events(fine_tuning_job_id=job_id, limit=10).data\n" ] }, { "cell_type": "markdown", "id": "f0328367", "metadata": {}, "source": [ "If you connected Weights & Biases under Settings → Integrations in the OpenAI dashboard, sync the run for richer charts." ] }, { "cell_type": "code", "execution_count": null, "id": "5995f1d6", "metadata": {}, "outputs": [], "source": [ "import wandb\n", "from wandb.integration.openai.fine_tuning import WandbLogger\n", "\n", "wandb.login()\n", "WandbLogger.sync(fine_tune_job_id=job_id, project='gpt-pricer')\n" ] }, { "cell_type": "markdown", "id": "7961d020", "metadata": {}, "source": [ "## Step 3 — Evaluate the Tuned Model\n", "Once the job is complete, grab the resulting model name and use the shared tester harness to verify we cleared the $76 average error goal."
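] }, { "cell_type": "markdown", "id": "7e41c9b0", "metadata": {}, "source": [ "Fine-tuning jobs can take a while, so a small polling loop saves re-running the retrieve cell by hand. A minimal sketch, assuming the job was launched above and `job_id` is set; the sleep interval is arbitrary." ] }, { "cell_type": "code", "execution_count": null, "id": "7e41c9b1", "metadata": {}, "outputs": [], "source": [ "import time\n", "\n", "# Poll the job until it reaches a terminal state (sleep interval is arbitrary).\n", "while True:\n", "    job = openai.fine_tuning.jobs.retrieve(job_id)\n", "    print(job.status)\n", "    if job.status in ('succeeded', 'failed', 'cancelled'):\n", "        break\n", "    time.sleep(60)\n" ] }, { "cell_type": "markdown", "id": "7e41c9b2", "metadata": {}, "source": [ "When the loop exits with `succeeded`, the job object carries the tuned model name."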
] }, { "cell_type": "code", "execution_count": null, "id": "7742bad2", "metadata": {}, "outputs": [], "source": [ "fine_tuned_model_name = openai.fine_tuning.jobs.retrieve(job_id).fine_tuned_model\n", "fine_tuned_model_name\n" ] }, { "cell_type": "code", "execution_count": null, "id": "8d18cc45", "metadata": {}, "outputs": [], "source": [ "def get_price(text):\n", " cleaned = text.replace('$', '').replace(',', '').strip()\n", " match = re.search(r'[-+]?\\d*\\.?\\d+', cleaned)\n", " return float(match.group()) if match else 0.0\n", "\n", "def gpt_pricer(item):\n", " response = openai.chat.completions.create(\n", " model=fine_tuned_model_name,\n", " messages=messages_for_inference(item),\n", " seed=RANDOM_SEED,\n", " max_tokens=8\n", " )\n", " reply = response.choices[0].message.content\n", " return get_price(reply)\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3a491e4b", "metadata": {}, "outputs": [], "source": [ "Tester.test(gpt_pricer, test)\n" ] } ], "metadata": { "kernelspec": { "display_name": "llm-engineering (3.12.10)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.10" } }, "nbformat": 4, "nbformat_minor": 5 }