LLM_Engineering_OLD/week8/community_contributions/lisekarimi/10_part2_modal.ipynb
2025-06-07 03:49:58 +02:00

{
"cells": [
{
"cell_type": "markdown",
"id": "44c6af6b-6fc3-44d5-a586-71618af7d09a",
"metadata": {
"jp-MarkdownHeadingCollapsed": true
},
"source": [
"# Modal (Part 2)\n",
"\n",
"---\n",
"✅ With all models and ChromaDB set up, it's time to integrate everything into a real system: **Snapr**, an app that scans online product listings, predicts their value, and alerts users to great deals.\n",
"\n",
"To power Snapr, we'll need:\n",
"- Price prediction models, ready for production\n",
"- Fast, on-demand predictions\n",
"- A scalable setup that handles real-world usage\n",
"\n",
"🔧 That's where **Modal** comes in. Modal lets us deploy models and services to the cloud with minimal setup, low latency, and clean Python APIs.\n",
"\n",
"- You can check out a [live demo](https://huggingface.co/spaces/lisekarimi/snapr) of the project\n",
"- The source code is available on [GitHub](https://github.com/lisekarimi/snapr)\n",
"\n",
"---\n",
"📢 Find more LLM notebooks on my [GitHub repository](https://github.com/lisekarimi/lexo)\n"
]
},
{
"cell_type": "markdown",
"id": "b8c175e7-ca0a-4664-bded-08ec131c5636",
"metadata": {},
"source": [
"## 📚 Pre-requisites\n",
"\n",
"To follow this project smoothly, it's helpful to know:\n",
"\n",
"- 🛰️ What an API is: You send a request → it's processed remotely → you receive a result\n",
"- 🐳 What a Docker image & container are:\n",
"  - Image = environment with code & dependencies\n",
"  - Container = running instance of that image\n",
"- 🧑‍💻 Local vs Remote code execution:\n",
"  - Local code runs on your machine\n",
"  - Remote code runs in the cloud (via Modal)"
]
},
{
"cell_type": "markdown",
"id": "440fffc2-9ec1-433d-9b71-e6fae3b46415",
"metadata": {},
"source": [
"## 🔧 Install & Setup Modal\n",
"- Before starting, install Modal in your environment (Run this once): `uv pip install modal`\n",
"- Create an account at modal.com (they give you $5 free to start).\n",
"- Then authenticate your environment: `modal setup`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ef286205",
"metadata": {},
"outputs": [],
"source": [
"!uv pip install modal"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d3906c01-b313-4dac-9a2e-6c7dbfdcc8fd",
"metadata": {},
"outputs": [],
"source": [
"import modal\n",
"import sys\n",
"sys.path.append(\".\") # Make sure your local modules are accessible"
]
},
{
"cell_type": "markdown",
"id": "43c59002-afe6-4dcc-a53e-b50d85857f7d",
"metadata": {},
"source": [
"## 🧠 Key Concepts\n",
"\n",
"Modal is a platform that lets you run Python code in the cloud. You can:\n",
"- Deploy code as APIs\n",
"- Run GPU workloads (e.g., LLMs)\n",
"- Let Modal handle Docker, infrastructure, and deployment automatically\n",
"\n",
"What is a Modal App?\n",
"An \"App\" is a containerized cloud service where you can run code remotely.\n",
"- Code runs in isolated containers (like Docker)\n",
"- These containers are created on demand and destroyed when idle\n",
"- You define your logic in a file and deploy it to Modal\n",
"\n",
"Key Modal Concepts\n",
"- `modal.Image`: Defines the environment (like a Docker image)\n",
"- `modal.App`: Defines and registers the Modal app\n",
"- `@app.function`: Runs a function remotely inside a container\n",
"- `@app.cls`: Runs classes remotely inside a container\n",
"- `.remote()`: Sends a request to the Modal API to execute the code remotely\n",
"- `modal deploy -m`: Deploys the app persistently, like a real cloud service"
]
},
{
"cell_type": "markdown",
"id": "79436b01-9623-4b0e-8ffc-0ea51a5783ac",
"metadata": {},
"source": [
"## ⚙️ Minimal Example"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d62850fc-dbf4-48b0-a2f2-a1e9a200414d",
"metadata": {},
"outputs": [],
"source": [
"from modal_services.get_started import app, f\n",
"\n",
"with app.run():  # Start an ephemeral Modal app for the duration of this block\n",
"    print(f.local(1000))   # Run locally inside the notebook\n",
"    print('*' * 5)\n",
"    print(f.remote(1000))  # Run remotely via the Modal API inside a container"
]
},
{
"attachments": {
"886d059a-a8ca-4552-86d2-fb87fb824441.png": {
"image/png": ""
}
},
"cell_type": "markdown",
"id": "d07d54f3-dfb9-46b2-be50-d6d35f81b3c0",
"metadata": {},
"source": [
"🔄 What Happens When You Call .remote()?\n",
"\n",
"some_function.remote() → Modal SDK sends API request\n",
" → Spins up a container\n",
" → Runs the code remotely\n",
" → Sends the result back to your local machine\n",
"\n",
"![image.png](attachment:886d059a-a8ca-4552-86d2-fb87fb824441.png)\n",
"\n",
"What we have here is an **ephemeral app**: the container shuts down after finishing.\n",
"\n",
"For our project, we need a persistently running app that behaves like a production API. To achieve that, we should use `modal deploy -m`, making the app suitable for serving AI services reliably."
]
},
{
"cell_type": "markdown",
"id": "9422c5b0-573e-43a4-99b5-2c6199770f5c",
"metadata": {},
"source": [
"## 📦 Persistent Deployment with `modal deploy`"
]
},
{
"attachments": {
"b84a3557-9805-462f-a1d5-008b3aa4f4f5.png": {
"image/png": ""
}
},
"cell_type": "markdown",
"id": "8edb6eb4-4489-4823-9d16-a01c8d0355b6",
"metadata": {},
"source": [
"Click the blue \"+\" button at the top left of JupyterLab, then choose \"Terminal\" to open a new terminal tab.\n",
"\n",
"There, you can run:\n",
"\n",
"```bash\n",
"conda activate llms\n",
"modal deploy -m modal_services.get_started\n",
"```\n",
"\n",
"This builds and deploys the app (`example-hello-world`), registers `f()`, and makes it callable via `.remote()` anytime — even outside the notebook.\n",
"\n",
"![image.png](attachment:b84a3557-9805-462f-a1d5-008b3aa4f4f5.png)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "656456f5-b3f9-40bf-8a2e-169da0a68fe8",
"metadata": {},
"outputs": [],
"source": [
"# Look up the deployed function by name; importing `f` from the local module is not required\n",
"f = modal.Function.from_name(\"example-hello-world\", \"f\")  # (app_name, function_name)\n",
"print(f.remote(20))"
]
},
{
"attachments": {
"b950fed1-8806-424c-830a-d8b99927801e.png": {
"image/png": ""
}
},
"cell_type": "markdown",
"id": "7d2c4b43-97a7-4a8e-a0df-1e20de053d5a",
"metadata": {},
"source": [
"## 🚀 Deploy Our First Modal-Powered Model\n",
"\n",
"So far, we've seen how to run simple remote functions using `@app.function()` and call them via `modal.Function.from_name(...)` in a **persistent app**, which is good for basic tasks.\n",
"\n",
"But in our Smart Deal Finder project, we need more:\n",
"- Load and reuse a large model (like LLaMA)\n",
"- Keep the model in memory\n",
"- Expose one or more methods (like `price()`)\n",
"\n",
"That's why we use `@app.cls`: it lets us define a class (e.g. `Pricer`) that lives in a Modal container, loads the model once in `setup()`, and handles remote requests efficiently.\n",
"\n",
"Full code: `modal_services/ft_pricer.py`\n",
"\n",
"---\n",
"\n",
"\n",
"🚀 In this step, we'll deploy a class-based app that we can later reach with `modal.Cls.from_name`.\n",
"\n",
"Specifically, we'll deploy `Pricer`, which loads our 4-bit quantized fine-tuned LLaMA model (trained in Notebook 9) and exposes a remote `.price()` method to estimate item prices.\n",
"\n",
"⚠️ Before deploying, add your `HF_TOKEN` as a secret in Modal.\n",
"\n",
"Then open a terminal and run:\n",
"\n",
"```bash\n",
"modal deploy -m modal_services.ft_pricer\n",
"```\n",
"\n",
"This will:\n",
"- Build the image with your code and dependencies\n",
"- Deploy the app `llm-ft-pricer` and register the `Pricer` class and its methods\n",
"- Not start any container yet (unless `min_containers` is set above 0), so `setup()` hasn't run and the model isn't loaded\n",
"- Prepare the app to handle `.remote()` calls when they come in\n",
"\n",
"![image.png](attachment:b950fed1-8806-424c-830a-d8b99927801e.png)"
]
},
{
"attachments": {
"1c697283-e5e2-4b09-b1f1-d1c11f18c8e4.png": {
"image/png": ""
},
"4a22e438-6b25-4c69-9439-99d146ffd188.png": {
"image/png": ""
}
},
"cell_type": "markdown",
"id": "afff309a-740b-443f-96f1-4b20618ada8b",
"metadata": {},
"source": [
"## 🔗 Connect to Our Deployed App\n",
"\n",
"Now that our app is deployed, we can connect to it and use it like a remote service.\n",
"\n",
"We'll do this using `modal.Cls.from_name(\"llm-ft-pricer\", \"Pricer\")`, which fetches the `Pricer` class from our deployed app via the Modal API.\n",
"\n",
"Then, calling `.price.remote(...)` sends a request to Modal, spins up a container if needed, loads the model, runs the method, and returns the result.\n",
"\n",
"This is how we turn our model into a cloud API.\n",
"\n",
"What happens under the hood when calling `price.remote(...)`:\n",
"- First run = downloads model files → stores in volume (`/cache`) → loads into memory → runs\n",
"- Later runs = load from volume → memory → run (no re-download)\n",
"\n",
"---\n",
"\n",
"Since we added `min_containers=1`, a container is created and kept warm as soon as the app is deployed. Models remain loaded in memory, so there are no cold starts unless the app is stopped or the container crashes.\n",
"\n",
"![image.png](attachment:1c697283-e5e2-4b09-b1f1-d1c11f18c8e4.png)\n",
"\n",
"⚠️ However, this **continuously consumes credits** if you forget to stop the container or app manually.\n",
"\n",
"To save credits, you can set `min_containers=0` and `scaledown_window=300`. With these settings no container stays warm by default: one spins up only when `.remote()` is called (a cold start) and is shut down again after 300 seconds without requests.\n",
"\n",
"![image.png](attachment:4a22e438-6b25-4c69-9439-99d146ffd188.png)\n"
]
},
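{
"cell_type": "markdown",
"id": "cache-flow-sketch-01",
"metadata": {},
"source": [
"The first-run versus later-run flow described above can be illustrated without Modal at all. What follows is a plain-Python simulation under stated assumptions, not the project's code: a temporary directory stands in for the `/cache` volume, writing a few bytes stands in for the model download, and `load_model` is a hypothetical helper name.\n",
"\n",
"```python\n",
"import tempfile\n",
"from pathlib import Path\n",
"\n",
"\n",
"def load_model(cache_dir, name='model.bin'):\n",
"    '''Return (weights, source); download only when the cache is empty.'''\n",
"    path = Path(cache_dir) / name\n",
"    if path.exists():\n",
"        source = 'volume'  # later runs: the file is already cached\n",
"    else:\n",
"        path.write_bytes(b'weights')  # first run: stand-in for a real download\n",
"        source = 'download'\n",
"    return path.read_bytes(), source  # read the cached file into memory\n",
"\n",
"\n",
"# The temporary directory plays the role of the persistent volume.\n",
"with tempfile.TemporaryDirectory() as cache_dir:\n",
"    first = load_model(cache_dir)[1]\n",
"    second = load_model(cache_dir)[1]\n",
"\n",
"print(first, second)\n",
"```\n",
"\n",
"The first call takes the download path and the second is served from the cache, mirroring the cold and warm paths listed above."
]
},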
{
"cell_type": "code",
"execution_count": null,
"id": "0b478ae8-f636-4ac7-bf53-0f3b23c21a72",
"metadata": {},
"outputs": [],
"source": [
"Pricer = modal.Cls.from_name(\"llm-ft-pricer\", \"Pricer\")\n",
"pricer = Pricer()\n",
"reply = pricer.price.remote(\"SEVERIN 28L Microwave, 900W, 5 power levels, 35-min timer, turntable (31.5 cm), Silver, MW 7772\")\n",
"print(reply)"
]
},
{
"cell_type": "markdown",
"id": "2e63efbf-344b-4b5f-8a0d-27b6e41f8508",
"metadata": {},
"source": [
"Now that we've deployed our model and learned how to call it remotely with `.remote()`,\n",
"let's go one step further and wrap this logic inside a local Python class.\n",
"\n",
"In the next step, we'll build a local Agent that cleanly interacts with our deployed `Modal app`, using the same `Modal API` under the hood."
]
},
{
"cell_type": "markdown",
"id": "8fbc7696-e892-4f08-80c5-5199b03ed175",
"metadata": {},
"source": [
"## 🔌 Connect to Your Modal App with a Local Agent\n",
"\n",
"`ft_pricer.py` is now a deployed API on Modal.\n",
"\n",
"To use it locally, we'll wrap it in a class called `FTPriceAgent` (full code: `agents/ft_price_agent.py`) that:\n",
"\n",
"- Connects to the remote app via `modal.Cls.from_name(...)`\n",
"- Calls `.price.remote(...)` to run predictions\n",
"\n",
"🔄 **Two API calls happen:**\n",
"1. `modal.Cls.from_name(...)` → fetches the deployed class \n",
"2. `.price.remote(...)` → runs the remote method on Modal \n",
"\n",
"This keeps our code clean and modular."
]
},
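{
"cell_type": "markdown",
"id": "local-agent-sketch-01",
"metadata": {},
"source": [
"A wrapper of this kind could look roughly like the sketch below. It is not the project's actual `ft_price_agent.py`: the `pricer` constructor argument and the `_FakePricer` stand-in are hypothetical additions so the class can be tried offline, while the app and class names (`llm-ft-pricer`, `Pricer`) are the ones used in this notebook.\n",
"\n",
"```python\n",
"class FTPriceAgent:\n",
"    '''Local wrapper around the deployed Pricer class (illustrative sketch).'''\n",
"\n",
"    def __init__(self, pricer=None):\n",
"        if pricer is None:\n",
"            # Imported lazily so the class can be exercised without Modal installed.\n",
"            import modal\n",
"            # Call 1: look up the deployed class by (app_name, class_name).\n",
"            Pricer = modal.Cls.from_name('llm-ft-pricer', 'Pricer')\n",
"            pricer = Pricer()\n",
"        self.pricer = pricer\n",
"\n",
"    def price(self, description):\n",
"        # Call 2: run the method remotely and hand back its result.\n",
"        return self.pricer.price.remote(description)\n",
"\n",
"\n",
"class _FakeMethod:\n",
"    '''Offline stand-in that mimics the `.remote(...)` call shape.'''\n",
"\n",
"    def remote(self, description):\n",
"        return 42.0  # fixed dummy value, not a real prediction\n",
"\n",
"\n",
"class _FakePricer:\n",
"    price = _FakeMethod()\n",
"\n",
"\n",
"agent = FTPriceAgent(pricer=_FakePricer())\n",
"print(agent.price('some product description'))\n",
"```\n",
"\n",
"Calling `FTPriceAgent()` with no argument would take the Modal path instead, which requires the app to be deployed and your environment to be authenticated."
]
},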
{
"cell_type": "code",
"execution_count": null,
"id": "b80cd15a-e419-4c21-97e7-4a56ed4db680",
"metadata": {},
"outputs": [],
"source": [
"from agents.ft_price_agent import FTPriceAgent\n",
"\n",
"agent = FTPriceAgent()\n",
"agent.price(\"Apple AirPods Max wireless over-ear headphones with active noise cancellation and spatial audio\")"
]
},
{
"cell_type": "markdown",
"id": "65522b93-59c9-4d15-a12d-58e078b88545",
"metadata": {},
"source": [
"Now that we've seen how Modal agents work (connecting to remote services and running `.remote()`), we'll use the same pattern for the rest of our models.\n",
"\n",
"✅ For each model (**XGBoost**, **GPT-4o RAG**, and the **Ensemble**), we'll build a dedicated Agent."
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}