{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1bf0f654",
   "metadata": {},
   "source": [
    "# Custom Price Estimator\n",
    "\n",
    "This notebook mirrors the week 6 day 5 fine-tuning workflow and pushes it a little further, aiming to beat the $76 average-error target on the shared product dataset."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4b4a89e6",
   "metadata": {},
   "source": [
    "## Plan\n",
    "- Load the curated `Item` objects that we prepared earlier in week 6.\n",
    "- Create train/validation splits sized for a stronger fine-tune than the baseline.\n",
    "- Package the conversations in JSONL format and launch an OpenAI fine-tuning job.\n",
    "- Retrieve the tuned model, score it with the shared tester, and aim for < $76 average error."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8dc5b7b0",
   "metadata": {},
   "source": [
    "## Environment Setup\n",
    "Pull in the packages, load API keys from `.env`, and make sure we can talk to both the OpenAI and Hugging Face services used elsewhere in the course."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f6332b2b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import json\n",
    "import pickle\n",
    "import random\n",
    "import re\n",
    "from pathlib import Path\n",
    "\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from dotenv import load_dotenv\n",
    "from huggingface_hub import login\n",
    "from openai import OpenAI\n",
    "\n",
    "from items import Item\n",
    "from testing import Tester\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14eb4e29",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load secrets from the .env file so the OpenAI client picks them up.\n",
    "# The placeholder fallbacks make a missing key obvious rather than silently empty.\n",
    "load_dotenv(override=True)\n",
    "os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY', 'set-your-openai-key')\n",
    "os.environ['HF_TOKEN'] = os.getenv('HF_TOKEN', 'set-your-hf-token')\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b07a6cab",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Log in to Hugging Face once per session (needed for the tokenizer used in Item).\n",
    "hf_token = os.environ['HF_TOKEN']\n",
    "if hf_token and hf_token != 'set-your-hf-token':\n",
    "    login(hf_token, add_to_git_credential=True)\n",
    "else:\n",
    "    print('⚠️ Provide a valid HF_TOKEN in your .env if you need to download tokenizer weights.')\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "113d520b",
   "metadata": {},
   "outputs": [],
   "source": [
    "openai = OpenAI()\n",
    "%matplotlib inline\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "04ae4263",
   "metadata": {},
   "source": [
    "## Load the Week 6 Dataset\n",
    "We reuse the curated pickled `Item` objects. If the pickle files are missing, circle back to the earlier data curation notebook to regenerate them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6ca7ca03",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Avoid curating all the data again: load the pickled Item objects.\n",
    "with open('train_lite.pkl', 'rb') as file:\n",
    "    train = pickle.load(file)\n",
    "\n",
    "with open('test_lite.pkl', 'rb') as file:\n",
    "    test = pickle.load(file)\n",
    "\n",
    "len(train), len(test)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "35e6dde7",
   "metadata": {},
   "source": [
    "We widen the training split beyond the day 5 baseline to squeeze out better accuracy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0ea1ba91",
   "metadata": {},
   "outputs": [],
   "source": [
    "TRAIN_SIZE = 400\n",
    "VAL_SIZE = 100\n",
    "RANDOM_SEED = 42\n",
    "\n",
    "# Shuffle a copy so the curated ordering of `train` is left untouched.\n",
    "rng = random.Random(RANDOM_SEED)\n",
    "shuffled = train[:]\n",
    "rng.shuffle(shuffled)\n",
    "fine_tune_train = shuffled[:TRAIN_SIZE]\n",
    "fine_tune_validation = shuffled[TRAIN_SIZE:TRAIN_SIZE + VAL_SIZE]\n",
    "\n",
    "len(fine_tune_train), len(fine_tune_validation)\n"
   ]
},
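  {
   "cell_type": "markdown",
   "id": "3f9a1c2e",
   "metadata": {},
   "source": [
    "As a quick sanity check on the split, the sketch below plots the price distribution of the fine-tune training slice (assuming each `Item` exposes a numeric `price`, as the training-message code later in this notebook does)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c8d2e1f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sanity check: the fine-tune slice should roughly match the full set's price range.\n",
    "prices = [item.price for item in fine_tune_train]\n",
    "plt.hist(prices, bins=40, color='skyblue')\n",
    "plt.title(f'Prices in fine-tune training slice (n={len(prices)})')\n",
    "plt.xlabel('Price ($)')\n",
    "plt.ylabel('Count')\n",
    "plt.show()\n"
   ]
  },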
  {
   "cell_type": "markdown",
   "id": "4a1c67fa",
   "metadata": {},
   "source": [
    "## Step 1 — Build Training Conversations\n",
    "Frontier models handled the unaltered prompt, but for the fine-tune we keep the instruction tight and make the assistant answer just the numeric price."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "436b78b5",
   "metadata": {},
   "outputs": [],
   "source": [
    "SYSTEM_MESSAGE = 'You are an ecommerce pricing assistant. Respond with the price only, no text before or after.'\n",
    "ASSISTANT_PREFIX = 'Price is $'\n",
    "\n",
    "def clean_user_prompt(item):\n",
    "    # Strip the rounding instruction and the pre-seeded answer prefix from the shared prompt.\n",
    "    prompt = item.test_prompt().replace(' to the nearest dollar', '')\n",
    "    return prompt.replace(ASSISTANT_PREFIX, '')\n",
    "\n",
    "def messages_for_training(item):\n",
    "    # Full conversation, including the ground-truth price, for the JSONL training file.\n",
    "    return [\n",
    "        {\"role\": \"system\", \"content\": SYSTEM_MESSAGE},\n",
    "        {\"role\": \"user\", \"content\": clean_user_prompt(item)},\n",
    "        {\"role\": \"assistant\", \"content\": f'{ASSISTANT_PREFIX}{item.price:.2f}'}\n",
    "    ]\n",
    "\n",
    "def messages_for_inference(item):\n",
    "    # Same conversation, but the assistant turn stops at the prefix so the model completes the price.\n",
    "    return [\n",
    "        {\"role\": \"system\", \"content\": SYSTEM_MESSAGE},\n",
    "        {\"role\": \"user\", \"content\": clean_user_prompt(item)},\n",
    "        {\"role\": \"assistant\", \"content\": ASSISTANT_PREFIX}\n",
    "    ]\n",
    "\n",
    "messages_for_training(fine_tune_train[0])\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ecf456c2",
   "metadata": {},
   "outputs": [],
   "source": [
    "def make_jsonl(items):\n",
    "    # One JSON object per line, as the fine-tuning API expects.\n",
    "    lines = []\n",
    "    for item in items:\n",
    "        lines.append(json.dumps({\"messages\": messages_for_training(item)}))\n",
    "    return '\\n'.join(lines)\n",
    "\n",
    "def write_jsonl(items, filename):\n",
    "    payload = make_jsonl(items)\n",
    "    with open(filename, 'w') as f:\n",
    "        f.write(payload)\n",
    "\n",
    "write_jsonl(fine_tune_train, 'fine_tune_train.jsonl')\n",
    "write_jsonl(fine_tune_validation, 'fine_tune_validation.jsonl')\n",
    "\n",
    "Path('fine_tune_train.jsonl').stat().st_size, Path('fine_tune_validation.jsonl').stat().st_size\n"
   ]
},
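  {
   "cell_type": "markdown",
   "id": "8b2e6d4c",
   "metadata": {},
   "source": [
    "Before uploading, it is cheap to confirm the files are well-formed: the sketch below re-parses every line of the training JSONL and checks the expected three-message shape."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1e7d3b5a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Validate the JSONL before uploading: every line must parse and hold 3 messages.\n",
    "with open('fine_tune_train.jsonl') as f:\n",
    "    rows = [json.loads(line) for line in f]\n",
    "assert len(rows) == TRAIN_SIZE\n",
    "assert all(len(row['messages']) == 3 for row in rows)\n",
    "print(f'{len(rows)} training rows look valid')\n"
   ]
  },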
  {
   "cell_type": "markdown",
   "id": "7dfde306",
   "metadata": {},
   "source": [
    "Upload the datasets so the fine-tuning job can consume them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2c522928",
   "metadata": {},
   "outputs": [],
   "source": [
    "with open('fine_tune_train.jsonl', 'rb') as file:\n",
    "    train_file = openai.files.create(file=file, purpose='fine-tune')\n",
    "train_file\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3660112",
   "metadata": {},
   "outputs": [],
   "source": [
    "with open('fine_tune_validation.jsonl', 'rb') as file:\n",
    "    validation_file = openai.files.create(file=file, purpose='fine-tune')\n",
    "validation_file\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9eaf47e1",
   "metadata": {},
   "source": [
    "## Step 2 — Launch the Fine-Tune\n",
    "Weights & Biases logging is optional but handy for tracking metrics over time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d758ba4b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional: pass this dict via integrations=[wandb_integration] in jobs.create\n",
    "# if the W&B integration is enabled on your OpenAI account.\n",
    "wandb_integration = {\"type\": \"wandb\", \"wandb\": {\"project\": \"gpt-pricer\"}}\n",
    "train_file.id, validation_file.id\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b7152b9b",
   "metadata": {},
   "outputs": [],
   "source": [
    "fine_tune_job = openai.fine_tuning.jobs.create(\n",
    "    training_file=train_file.id,\n",
    "    validation_file=validation_file.id,\n",
    "    model='gpt-4o-mini-2024-07-18',\n",
    "    seed=RANDOM_SEED,\n",
    "    hyperparameters={\"n_epochs\": 2, \"learning_rate_multiplier\": 1.5},\n",
    "    suffix='emmy-pricer'\n",
    ")\n",
    "fine_tune_job\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cd047075",
   "metadata": {},
   "outputs": [],
   "source": [
    "job_id = fine_tune_job.id\n",
    "job_id\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cd830d14",
   "metadata": {},
   "outputs": [],
   "source": [
    "openai.fine_tuning.jobs.retrieve(job_id)\n"
   ]
},
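  {
   "cell_type": "markdown",
   "id": "7e4b9a0d",
   "metadata": {},
   "source": [
    "Rather than re-running the retrieve cell by hand, a small polling loop (a sketch built on the same `retrieve` call used above) can wait until the job reaches a terminal state."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9b3c5d7a",
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "# Poll every 30 seconds until the job reaches a terminal status.\n",
    "while True:\n",
    "    job = openai.fine_tuning.jobs.retrieve(job_id)\n",
    "    print(job.status)\n",
    "    if job.status in ('succeeded', 'failed', 'cancelled'):\n",
    "        break\n",
    "    time.sleep(30)\n"
   ]
  },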
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d2b25992",
   "metadata": {},
   "outputs": [],
   "source": [
    "openai.fine_tuning.jobs.list_events(fine_tuning_job_id=job_id, limit=10).data\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f0328367",
   "metadata": {},
   "source": [
    "If you connected Weights & Biases under Settings → Integrations in the OpenAI dashboard, sync the run for richer charts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5995f1d6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import wandb\n",
    "from wandb.integration.openai.fine_tuning import WandbLogger\n",
    "\n",
    "wandb.login()\n",
    "WandbLogger.sync(fine_tune_job_id=job_id, project='gpt-pricer')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7961d020",
   "metadata": {},
   "source": [
    "## Step 3 — Evaluate the Tuned Model\n",
    "Once the job is complete, grab the resulting model name and use the shared tester harness to verify we cleared the $76 average-error goal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7742bad2",
   "metadata": {},
   "outputs": [],
   "source": [
    "fine_tuned_model_name = openai.fine_tuning.jobs.retrieve(job_id).fine_tuned_model\n",
    "fine_tuned_model_name\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8d18cc45",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_price(text):\n",
    "    # Pull the first number out of a reply such as 'Price is $42.99'.\n",
    "    cleaned = text.replace('$', '').replace(',', '').strip()\n",
    "    match = re.search(r'[-+]?\\d*\\.?\\d+', cleaned)\n",
    "    return float(match.group()) if match else 0.0\n",
    "\n",
    "def gpt_pricer(item):\n",
    "    # The assistant turn ends with 'Price is $', so the model only has to emit digits.\n",
    "    response = openai.chat.completions.create(\n",
    "        model=fine_tuned_model_name,\n",
    "        messages=messages_for_inference(item),\n",
    "        seed=RANDOM_SEED,\n",
    "        max_tokens=8\n",
    "    )\n",
    "    reply = response.choices[0].message.content\n",
    "    return get_price(reply)\n"
   ]
},
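  {
   "cell_type": "markdown",
   "id": "2d6e8f1b",
   "metadata": {},
   "source": [
    "Before spending tokens on the full test set, a few cheap assertions (a sketch against the `get_price` helper defined above) confirm the reply parser behaves as expected."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6a1f4c9e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Spot-check the parser on typical model replies.\n",
    "assert get_price('Price is $99.99') == 99.99\n",
    "assert get_price('$1,299.00') == 1299.0\n",
    "assert get_price('roughly 42 dollars') == 42.0\n",
    "assert get_price('no number here') == 0.0\n",
    "print('get_price checks passed')\n"
   ]
  },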
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3a491e4b",
   "metadata": {},
   "outputs": [],
   "source": [
    "Tester.test(gpt_pricer, test)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llm-engineering (3.12.10)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
|