Merge branch 'ed-donner:main' into community-contributions-branch
@@ -0,0 +1,28 @@
Client: Hello I would like to order a pizza
Restaurant: Sure. What pizza would you like to order from our menu?
Client: Chicken Ranch
Restaurant: I am so sorry, but chicken ranch is currently unavailable on our menu
Client: AHHHHH. Do you have chicken BBQ?
Restaurant: Yes! Do you want it small, medium, or large?
Client: Medium
Restaurant: Ok. This will be 180 LE
Client: Thanks
Restaurant: Anytime.
Client: AHHHH I forgot. I want to add a new chicken BBQ pizza
Restaurant: No problem. Do you also want it medium?
Client: Yes
Restaurant: Okay, this will be 380 LE
Client: Okay Thanks
Client: Wait a minute. Isn't 180 * 2 = 360?
Restaurant: It seems there might be a misunderstanding. We add an extra 20 LE for every extra pizza ordered.
Client: NOBODY TOLD ME THAT.. AND WHY ON EARTH WOULD YOU DO SOMETHING LIKE THAT?
Restaurant: We are sorry, but this is our policy.
Client: Okay then I don't want your pizza.
Restaurant: We are so sorry to hear that. We can offer a 10% discount on the total price, so it would be 342 LE
Client: Fine
Restaurant: Thank you for ordering
Restaurant: Pizza is delivered. How is your experience?
Client: Your pizza doesn't taste good
Restaurant: We are so sorry to hear that. Do you have any suggestions you would like to make?
Client: Make good pizza
Restaurant: Thanks for your review. We will make sure to improve our pizza in the future. Your opinion really matters.
@@ -0,0 +1,5 @@
Client: Hello I would like to order a chicken ranch pizza
Restaurant: I am so sorry, but chicken ranch is currently unavailable on our menu
Client: Okay thanks
Restaurant: Would you like to order something else?
Client: No thank you
@@ -0,0 +1,19 @@
Client: Hello. What is the best-selling pizza on your menu?
Restaurant: Hello! Chicken Ranch pizza is our best-selling pizza. Also, our special pepperoni pizza got some amazing reviews
Client: Okay. I want to order a pepperoni pizza
Restaurant: Sure. Do you want it small, medium, or large?
Client: Large
Restaurant: Okay. This will be 210 LE. Would you like to order something else?
Client: Yes. Do you have onion rings?
Restaurant: Yes
Client: Okay I would like to add onion rings.
Restaurant: Sure. This will be 250 LE
Client: Thanks
Restaurant: Anytime
Client: I have been waiting for too long and the order hasn't arrived yet
Restaurant: Sorry to hear that. But it appears that the order is on its way to you.
Restaurant: The order should have arrived by now.
Client: Yes, it has arrived.
Restaurant: How is your experience?
Client: Your pizza tastes soooooo good. The order took too long to arrive but when I tasted the pizza, I was really enjoying it and forgot everything about the delay.
Restaurant: We are so glad to hear that
@@ -0,0 +1,15 @@
You are an assistant working for the customer service department in a pizza restaurant.
You will receive a chat between a client and the restaurant's customer service.
You should generate your responses based on the following criteria:
- What did the client order?
- How much did it cost?
- If the client changed their mind, keep only their final order and the final cost
- Mention the client's experience only if they ordered anything, classified as one of: Positive/Negative/Neutral/Unknown
- If the client did not order anything, do not mention their sentiment or experience
- Only if the client's experience is positive or negative, provide a brief summary of their sentiment
- Do not provide a brief summary of their sentiment if their experience was neutral or unknown
- Your answers should be clear and to the point; do not use long sentences
- Your answers should be displayed in bullet points
- Your answers should be displayed in markdown
- If the client did not order anything, provide a brief summary of why that might have happened
- Do not mention cost if the client did not order anything
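
Example output (illustrative only; the order, price, and sentiment below are hypothetical):
- **Order:** 1 Medium Chicken BBQ Pizza
- **Cost:** 180 LE
- **Experience:** Positive
  - **Summary:** The client was happy with the pizza and the service.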
@@ -0,0 +1,127 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "39e3e763-9b00-49eb-aead-034a2d0517a7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# imports\n",
|
||||
"\n",
|
||||
"import os\n",
|
||||
"import requests\n",
|
||||
"from dotenv import load_dotenv\n",
|
||||
"from bs4 import BeautifulSoup\n",
|
||||
"from IPython.display import Markdown, display\n",
|
||||
"from openai import OpenAI\n",
|
||||
"\n",
|
||||
"# If you get an error running this cell, then please head over to the troubleshooting notebook!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "f3bb5e2a-b70f-42ba-9f22-030a9c6bc9d1",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Load environment variables in a file called .env\n",
|
||||
"\n",
|
||||
"load_dotenv(override=True)\n",
|
||||
"api_key = os.getenv('OPENAI_API_KEY')\n",
|
||||
"\n",
|
||||
"# Check the key\n",
|
||||
"\n",
|
||||
"if not api_key:\n",
|
||||
" print(\"No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!\")\n",
|
||||
"elif not api_key.startswith(\"sk-proj-\"):\n",
|
||||
" print(\"An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook\")\n",
|
||||
"elif api_key.strip() != api_key:\n",
|
||||
" print(\"An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook\")\n",
|
||||
"else:\n",
|
||||
" print(\"API key found and looks good so far!\")\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "994f51fb-eab3-45a2-847f-87aebb92b17a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"openai = OpenAI()\n",
|
||||
"\n",
|
||||
"# If this doesn't work, try Kernel menu >> Restart Kernel and Clear Outputs Of All Cells, then run the cells from the top of this notebook down.\n",
|
||||
"# If it STILL doesn't work (horrors!) then please see the Troubleshooting notebook in this folder for full instructions"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "a8125c6d-c884-4f65-b477-cab155e29ce3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Step 1: Create your prompts\n",
|
||||
"\n",
|
||||
"system_prompt = \"You are an AI that suggests short and relevant subject lines for emails based on their content.\"\n",
|
||||
"user_prompt = \"\"\"\n",
|
||||
"Here is the content of an email:\n",
|
||||
"\n",
|
||||
"Dear Team,\n",
|
||||
"\n",
|
||||
"I hope you're all doing well. I wanted to remind you that our next project meeting is scheduled for this Friday at 3 PM. We will be discussing our progress and any blockers. Please make sure to review the latest updates before the meeting.\n",
|
||||
"\n",
|
||||
"Best, \n",
|
||||
"John\n",
|
||||
"\"\"\"\n",
|
||||
"\n",
|
||||
"# Step 2: Make the messages list\n",
|
||||
"\n",
|
||||
"messages = [ {\"role\": \"system\", \"content\": system_prompt},\n",
|
||||
" {\"role\": \"user\", \"content\": user_prompt}] # fill this in\n",
|
||||
"\n",
|
||||
"# Step 3: Call OpenAI\n",
|
||||
"\n",
|
||||
"response = openai.chat.completions.create(\n",
|
||||
" model = \"gpt-4o-mini\",\n",
|
||||
" messages=messages\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# Step 4: print the result\n",
|
||||
"\n",
|
||||
"print(\"Suggested Subject Line:\", response.choices[0].message.content)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "1010ac80-1ee8-432f-aa3f-12af419dc23a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.11"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
195
week1/community-contributions/day1-selenium-lama-mac.ipynb
Normal file
@@ -0,0 +1,195 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c97ad592-c8be-4583-a19c-ac813e56f410",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Mac Users\n",
|
||||
"\n",
|
||||
"I find some challenges while setting up this in MAC silicon M1 chip. Execute below commands in MAC terminal.\n",
|
||||
"\n",
|
||||
"1. Download chromedriver.\n",
|
||||
"2. Unzip and add it to the path.\n",
|
||||
"3. Set Extended attributes."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b635b345-b000-48cc-8a7f-7df279a489a3",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"cd ~/Downloads\n",
|
||||
"wget https://storage.googleapis.com/chrome-for-testing-public/133.0.6943.126/mac-arm64/chromedriver-mac-arm64.zip\n",
|
||||
"unzip chromedriver-mac-arm64.zip\n",
|
||||
"sudo mv chromedriver-mac-arm64/chromedriver /usr/local/bin/\n",
|
||||
"chmod +x /usr/local/bin/chromedriver\n",
|
||||
"cd /usr/local/bin/\n",
|
||||
"xattr -d com.apple.quarantine chromedriver\n",
|
||||
"cd \n",
|
||||
"chromedriver --version"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "17c7c79a-8ae0-4f5d-a7c8-c54aa7ba90fd",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!pip install selenium\n",
|
||||
"!pip install undetected-chromedriver\n",
|
||||
"!pip install beautifulsoup4"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "c10bd630-2dfd-4572-8c21-2dc4c6a372ab",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from selenium import webdriver\n",
|
||||
"from selenium.webdriver.chrome.service import Service\n",
|
||||
"from selenium.webdriver.common.by import By\n",
|
||||
"from selenium.webdriver.chrome.options import Options\n",
|
||||
"import os\n",
|
||||
"import requests\n",
|
||||
"from dotenv import load_dotenv\n",
|
||||
"from bs4 import BeautifulSoup\n",
|
||||
"from IPython.display import Markdown, display\n",
|
||||
"from openai import OpenAI"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "6fb3641d-e9f8-4f5b-bb9d-ee0e971cccdb",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"OLLAMA_API = \"http://localhost:11434/api/chat\"\n",
|
||||
"HEADERS = {\"Content-Type\": \"application/json\"}\n",
|
||||
"MODEL = \"llama3.2\"\n",
|
||||
"PATH_TO_CHROME_DRIVER = '/usr/local/bin/chromedriver'\n",
|
||||
"system_prompt = \"You are an assistant that analyzes the contents of a website \\\n",
|
||||
"and provides a short summary, ignoring text that might be navigation related. \\\n",
|
||||
"Respond in markdown. Highlight all the products this website offered and also find when website is created.\"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "5d57e958",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"class Website:\n",
|
||||
" url: str\n",
|
||||
" title: str\n",
|
||||
" text: str\n",
|
||||
"\n",
|
||||
" def __init__(self, url):\n",
|
||||
" self.url = url\n",
|
||||
"\n",
|
||||
" options = Options()\n",
|
||||
"\n",
|
||||
" options.add_argument(\"--no-sandbox\")\n",
|
||||
" options.add_argument(\"--disable-dev-shm-usage\")\n",
|
||||
"\n",
|
||||
" service = Service(PATH_TO_CHROME_DRIVER)\n",
|
||||
" driver = webdriver.Chrome(service=service, options=options)\n",
|
||||
" driver.get(url)\n",
|
||||
"\n",
|
||||
" # input(\"Please complete the verification in the browser and press Enter to continue...\")\n",
|
||||
" page_source = driver.page_source\n",
|
||||
" driver.quit()\n",
|
||||
"\n",
|
||||
" soup = BeautifulSoup(page_source, 'html.parser')\n",
|
||||
" self.title = soup.title.string if soup.title else \"No title found\"\n",
|
||||
" for irrelevant in soup([\"script\", \"style\", \"img\", \"input\"]):\n",
|
||||
" irrelevant.decompose()\n",
|
||||
" self.text = soup.get_text(separator=\"\\n\", strip=True)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "56df8cd2-2707-43f6-a066-3367846929b3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def user_prompt_for(website):\n",
|
||||
" user_prompt = f\"You are looking at a website titled {website.title}\"\n",
|
||||
" user_prompt += \"\\nThe contents of this website is as follows; \\\n",
|
||||
"please provide a short summary of this website in markdown. \\\n",
|
||||
"If it includes news or announcements, then summarize these too.\\n\\n\"\n",
|
||||
" user_prompt += website.text\n",
|
||||
" return user_prompt\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def messages_for(website):\n",
|
||||
" return [\n",
|
||||
" {\"role\": \"system\", \"content\": system_prompt},\n",
|
||||
" {\"role\": \"user\", \"content\": user_prompt_for(website)}\n",
|
||||
" ]\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def summarize(url):\n",
|
||||
" website = Website(url)\n",
|
||||
" ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')\n",
|
||||
" response = ollama_via_openai.chat.completions.create(\n",
|
||||
" model=MODEL,\n",
|
||||
" messages = messages_for(website)\n",
|
||||
" )\n",
|
||||
" return response.choices[0].message.content\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def display_summary(url):\n",
|
||||
" summary = summarize(url)\n",
|
||||
" display(Markdown(summary))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "f2eb9599",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"display_summary(\"https://ae.almosafer.com\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "31b66c0f-6b45-4986-b77c-758625945a91",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.11"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
@@ -0,0 +1,167 @@
|
||||
import os
|
||||
import time
|
||||
import pandas as pd
|
||||
import re
|
||||
from dotenv import load_dotenv
|
||||
from selenium import webdriver
|
||||
from selenium.webdriver.chrome.service import Service
|
||||
from selenium.webdriver.chrome.options import Options
|
||||
from selenium.webdriver.common.by import By
|
||||
from selenium.webdriver.support.ui import WebDriverWait
|
||||
from selenium.webdriver.support import expected_conditions as EC
|
||||
from openai import OpenAI
|
||||
from openpyxl import load_workbook
|
||||
from openpyxl.styles import Font, Alignment
|
||||
|
||||
# Load environment variables
|
||||
load_dotenv(override=True)
|
||||
api_key = os.getenv('OPENAI_API_KEY')
|
||||
|
||||
# Validate API Key
|
||||
if not api_key:
|
||||
raise ValueError("No API key was found - please check your .env file.")
|
||||
|
||||
# Initialize OpenAI client
|
||||
openai = OpenAI()
|
||||
|
||||
# Set up Selenium WebDriver
|
||||
chrome_options = Options()
|
||||
chrome_options.add_argument("--headless")
|
||||
chrome_options.add_argument("--disable-gpu")
|
||||
chrome_options.add_argument("--no-sandbox")
|
||||
chrome_options.add_argument("--disable-dev-shm-usage")
|
||||
|
||||
class Website:
|
||||
"""Scrapes and processes website content using Selenium."""
|
||||
|
||||
def __init__(self, url: str):
|
||||
self.url = url
|
||||
self.text = "No content extracted."
|
||||
|
||||
service = Service(executable_path="/opt/homebrew/bin/chromedriver")
|
||||
driver = webdriver.Chrome(service=service, options=chrome_options)
|
||||
|
||||
try:
|
||||
driver.get(url)
|
||||
WebDriverWait(driver, 10).until(
|
||||
EC.presence_of_element_located((By.TAG_NAME, "body"))
|
||||
)
|
||||
body_element = driver.find_element(By.TAG_NAME, "body")
|
||||
self.text = body_element.text.strip() if body_element else "No content extracted."
|
||||
except Exception as e:
|
||||
print(f"Error fetching website: {e}")
|
||||
finally:
|
||||
driver.quit()
|
||||
|
||||
def summarized_text(self, max_length=1500):
|
||||
return self.text[:max_length] + ("..." if len(self.text) > max_length else "")
|
||||
|
||||
def clean_text(text):
|
||||
"""
|
||||
Cleans extracted text by removing markdown-style formatting.
|
||||
"""
|
||||
text = re.sub(r"###*\s*", "", text)
|
||||
text = re.sub(r"\*\*(.*?)\*\*", r"\1", text)
|
||||
return text.strip()
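# Example (hypothetical input/output, for illustration only):
#   clean_text("### Top Features **Fast shipping**") -> "Top Features Fast shipping"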
|
||||
|
||||
# Aspect-specific prompts for concise output
|
||||
aspect_prompts = {
|
||||
"Marketing Strategies": "Summarize the core marketing strategies used on this website in in under 30 words. Do not include a title or introduction.",
|
||||
"SEO Keywords": "List only the most relevant SEO keywords from this website, separated by commas. Do not include a title or introduction.",
|
||||
"User Engagement Tactics": "List key engagement tactics used on this website (e.g., interactive features, user incentives, social proof). Keep responses to 3-5 bullet points. Do not include a title or introduction.",
|
||||
"Call-to-Action Phrases": "List only the most common Call-to-Action phrases used on this website, separated by commas. Do not include a title or introduction.",
|
||||
"Branding Elements": "Summarize the brand's tone, style, and positioning in under 30 words. Do not include a title or introduction.",
|
||||
"Competitor Comparison": "Briefly describe how this website differentiates itself from competitors in under 30 words. Do not include a title or introduction.",
|
||||
"Product Descriptions": "List the most important features or benefits of the products/services described on this website in under 30 words. Do not include a title or introduction.",
|
||||
"Customer Reviews Sentiment": "Summarize the overall sentiment of customer reviews in oin under 30 words, highlighting common themes. Do not include a title or introduction.",
|
||||
"Social Media Strategy": "List key social media strategies used on this website, separated by commas. Do not include a title or introduction."
|
||||
}
|
||||
|
||||
|
||||
def summarize(url: str) -> dict:
|
||||
"""
|
||||
Fetches a website, extracts relevant content, and generates a separate summary for each aspect.
|
||||
|
||||
:param url: The website URL to analyze.
|
||||
:return: A dictionary containing extracted information.
|
||||
"""
|
||||
website = Website(url)
|
||||
|
||||
if not website.text or website.text == "No content extracted.":
|
||||
return {"URL": url, "Error": "Failed to extract content"}
|
||||
|
||||
extracted_data = {"URL": url}
|
||||
|
||||
for aspect, prompt in aspect_prompts.items():
|
||||
try:
|
||||
formatted_prompt = f"{prompt} \n\nContent:\n{website.summarized_text()}"
|
||||
response = openai.chat.completions.create(
|
||||
model="gpt-4o-mini",
|
||||
messages=[
|
||||
{"role": "system", "content": "You are an expert at extracting structured information from website content."},
|
||||
{"role": "user", "content": formatted_prompt}
|
||||
]
|
||||
)
|
||||
|
||||
extracted_data[aspect] = clean_text(response.choices[0].message.content)
|
||||
|
||||
except Exception as e:
|
||||
extracted_data[aspect] = f"Error generating summary: {e}"
|
||||
|
||||
return extracted_data
|
||||
|
||||
def save_to_excel(data_list: list, filename="website_analysis.xlsx"):
|
||||
"""
|
||||
Saves extracted information to an Excel file with proper formatting.
|
||||
|
||||
:param data_list: A list of dictionaries containing extracted website details.
|
||||
:param filename: The name of the Excel file to save data.
|
||||
"""
|
||||
df = pd.DataFrame(data_list)
|
||||
|
||||
df.to_excel(filename, index=False)
|
||||
|
||||
wb = load_workbook(filename)
|
||||
ws = wb.active
|
||||
|
||||
# Auto-adjust column widths
|
||||
for col in ws.columns:
|
||||
max_length = 0
|
||||
col_letter = col[0].column_letter
|
||||
for cell in col:
|
||||
try:
|
||||
if cell.value:
|
||||
max_length = max(max_length, len(str(cell.value)))
|
||||
except:
|
||||
pass
|
||||
ws.column_dimensions[col_letter].width = min(max_length + 2, 50)
|
||||
|
||||
# Format headers
|
||||
for cell in ws[1]:
|
||||
cell.font = Font(bold=True)
|
||||
cell.alignment = Alignment(horizontal="center", vertical="center")
|
||||
|
||||
# Wrap text for extracted content
|
||||
for row in ws.iter_rows(min_row=2):
|
||||
for cell in row:
|
||||
cell.alignment = Alignment(wrap_text=True, vertical="top")
|
||||
|
||||
wb.save(filename)
|
||||
print(f"Data saved to {filename} with improved formatting.")
|
||||
|
||||
# 🔹 LIST OF WEBSITES TO PROCESS
|
||||
websites = [
|
||||
"https://www.gymshark.com/",
|
||||
]
|
||||
|
||||
if __name__ == "__main__":
|
||||
print("\nProcessing websites...\n")
|
||||
extracted_data_list = []
|
||||
|
||||
for site in websites:
|
||||
print(f"Extracting data from {site}...")
|
||||
extracted_data = summarize(site)
|
||||
extracted_data_list.append(extracted_data)
|
||||
|
||||
save_to_excel(extracted_data_list)
|
||||
print("\nAll websites processed successfully!")
|
||||
213
week1/community-contributions/day2 EXERCISE_deepseek-r1.ipynb
Normal file
@@ -0,0 +1,213 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "bc7d1de3-e2ac-46ff-a302-3b4ba38c4c90",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Also trying the amazing reasoning model DeepSeek\n",
|
||||
"\n",
|
||||
"Here we use the version of DeepSeek-reasoner that's been distilled to 1.5B. \n",
|
||||
"This is actually a 1.5B variant of Qwen that has been fine-tuned using synethic data generated by Deepseek R1.\n",
|
||||
"\n",
|
||||
"Other sizes of DeepSeek are [here](https://ollama.com/library/deepseek-r1) all the way up to the full 671B parameter version, which would use up 404GB of your drive and is far too large for most!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "cf9eb44e-fe5b-47aa-b719-0bb63669ab3d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!ollama pull deepseek-r1:1.5b"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "4bdcd35a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!ollama pull deepseek-r1:8b"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1622d9bb-5c68-4d4e-9ca4-b492c751f898",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# NOW the exercise for you\n",
|
||||
"\n",
|
||||
"Take the code from day1 and incorporate it here, to build a website summarizer that uses Llama 3.2 running locally instead of OpenAI; use either of the above approaches."
|
||||
]
|
||||
},
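{
"cell_type": "code",
"execution_count": null,
"id": "alt-approach-sketch",
"metadata": {},
"outputs": [],
"source": [
"# A minimal sketch of one alternative approach: calling the local Ollama server through its\n",
"# OpenAI-compatible endpoint (the cells below use the ollama Python package instead).\n",
"# The base_url and model name are assumptions based on a default local Ollama install.\n",
"from openai import OpenAI\n",
"\n",
"ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')\n",
"# response = ollama_via_openai.chat.completions.create(model='llama3.2', messages=[...])\n",
"# print(response.choices[0].message.content)"
]
},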
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "1c106420",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# imports\n",
|
||||
"\n",
|
||||
"import requests\n",
|
||||
"import ollama\n",
|
||||
"from bs4 import BeautifulSoup\n",
|
||||
"from IPython.display import Markdown, display"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "22d62f00",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Constants\n",
|
||||
"\n",
|
||||
"OLLAMA_API = \"http://localhost:11434/api/chat\"\n",
|
||||
"HEADERS = {\"Content-Type\": \"application/json\"}\n",
|
||||
"MODEL = \"deepseek-r1:8b\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "6de38216-6d1c-48c4-877b-86d403f4e0f8",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# A class to represent a Webpage\n",
|
||||
"# If you're not familiar with Classes, check out the \"Intermediate Python\" notebook\n",
|
||||
"\n",
|
||||
"# Some websites need you to use proper headers when fetching them:\n",
|
||||
"headers = {\n",
|
||||
" \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36\"\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"class Website:\n",
|
||||
"\n",
|
||||
" def __init__(self, url):\n",
|
||||
" \"\"\"\n",
|
||||
" Create this Website object from the given url using the BeautifulSoup library\n",
|
||||
" \"\"\"\n",
|
||||
" self.url = url\n",
|
||||
" response = requests.get(url, headers=headers)\n",
|
||||
" soup = BeautifulSoup(response.content, 'html.parser')\n",
|
||||
" self.title = soup.title.string if soup.title else \"No title found\"\n",
|
||||
" for irrelevant in soup.body([\"script\", \"style\", \"img\", \"input\"]):\n",
|
||||
" irrelevant.decompose()\n",
|
||||
" self.text = soup.body.get_text(separator=\"\\n\", strip=True)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "4449b7dc",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Define our system prompt - you can experiment with this later, changing the last sentence to 'Respond in markdown in Spanish.\"\n",
|
||||
"\n",
|
||||
"system_prompt = \"You are an assistant that analyzes the contents of a website \\\n",
|
||||
"and provides a short summary, ignoring text that might be navigation related. \\\n",
|
||||
"Respond in markdown.\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "daca9448",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def user_prompt_for(website):\n",
|
||||
" user_prompt = f\"You are looking at a website titled {website.title}\"\n",
|
||||
" user_prompt += \"\\nThe contents of this website is as follows; \\\n",
|
||||
"please provide a short summary of this website in markdown. \\\n",
|
||||
"If it includes news or announcements, then summarize these too.\\n\\n\"\n",
|
||||
" user_prompt += website.text\n",
|
||||
" return user_prompt"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "0ec9d5d2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# See how this function creates exactly the format above\n",
|
||||
"\n",
|
||||
"def messages_for(website):\n",
|
||||
" return [\n",
|
||||
" {\"role\": \"system\", \"content\": system_prompt},\n",
|
||||
" {\"role\": \"user\", \"content\": user_prompt_for(website)}\n",
|
||||
" ]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "6e1ab04a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# And now: call the OpenAI API. You will get very familiar with this!\n",
|
||||
"\n",
|
||||
"def summarize(url):\n",
|
||||
" website = Website(url)\n",
|
||||
" response = ollama.chat(\n",
|
||||
" model = MODEL,\n",
|
||||
" messages = messages_for(website)\n",
|
||||
" )\n",
|
||||
" return response['message']['content']"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "0d3b5628",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def display_summary(url):\n",
|
||||
" summary = summarize(url)\n",
|
||||
" display(Markdown(summary))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "938e5633",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"display_summary(\"https://edwarddonner.com\")"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "llms",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.11"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
81
week1/community-contributions/day5-disable-ssl.ipynb
Normal file
@@ -0,0 +1,81 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a98030af-fcd1-4d63-a36e-38ba053498fa",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# A Small Tweak to Week1-Day5\n",
|
||||
"\n",
|
||||
"If you have network restrictions (such as using a custom DNS provider, or firewall rules at work), you can disable SSL cert verification.\n",
|
||||
"Once you do that and start executing your code, the output will be riddled with warnings. Thankfully, you can suppress those warnings,too.\n",
|
||||
"\n",
|
||||
"See the 2 lines added to the init method, below."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 22,
|
||||
"id": "106dd65e-90af-4ca8-86b6-23a41840645b",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# A class to represent a Webpage\n",
|
||||
"\n",
|
||||
"# Some websites need you to use proper headers when fetching them:\n",
|
||||
"headers = {\n",
|
||||
" \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36\"\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"class Website:\n",
|
||||
" \"\"\"\n",
|
||||
" A utility class to represent a Website that we have scraped, now with links\n",
|
||||
" \"\"\"\n",
|
||||
"\n",
|
||||
" def __init__(self, url):\n",
|
||||
" self.url = url\n",
|
||||
"\n",
|
||||
" #\n",
|
||||
" # If you must disable SSL cert validation, and also suppress all the warning that will come with it,\n",
|
||||
" # add the 2 lines below. This comes in very handy if you have DNS/firewall restrictions; alas, use\n",
|
||||
" # with caution, especially if deploying this in a non-dev environment.\n",
|
||||
" requests.packages.urllib3.disable_warnings() \n",
|
||||
" response = requests.get(url, headers=headers, verify=False) \n",
|
||||
" # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n",
|
||||
" \n",
|
||||
" self.body = response.content\n",
|
||||
" soup = BeautifulSoup(self.body, 'html.parser')\n",
|
||||
" self.title = soup.title.string if soup.title else \"No title found\"\n",
|
||||
" if soup.body:\n",
|
||||
" for irrelevant in soup.body([\"script\", \"style\", \"img\", \"input\"]):\n",
|
||||
" irrelevant.decompose()\n",
|
||||
" self.text = soup.body.get_text(separator=\"\\n\", strip=True)\n",
|
||||
" else:\n",
|
||||
" self.text = \"\"\n",
|
||||
" links = [link.get('href') for link in soup.find_all('a')]\n",
|
||||
" self.links = [link for link in links if link]"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.11"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
202
week1/community-contributions/week1 EXERCISE_AI_techician.ipynb
Normal file
@@ -0,0 +1,202 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "fe12c203-e6a6-452c-a655-afb8a03a4ff5",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# End of week 1 exercise\n",
|
||||
"\n",
|
||||
"To demonstrate your familiarity with OpenAI API, and also Ollama, build a tool that takes a technical question, \n",
|
||||
"and responds with an explanation. This is a tool that you will be able to use yourself during the course!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"id": "c1070317-3ed9-4659-abe3-828943230e03",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# imports\n",
|
||||
"from IPython.display import Markdown, display, update_display\n",
|
||||
"import openai\n",
|
||||
"from openai import OpenAI\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"id": "4a456906-915a-4bfd-bb9d-57e505c5093f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# constants\n",
|
||||
"models = {\n",
|
||||
" 'MODEL_GPT': 'gpt-4o-mini',\n",
|
||||
" 'MODEL_LLAMA': 'llama3.2'\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"# To use ollama using openai API (ensure that ollama is running on localhost)\n",
|
||||
"ollama_via_openai = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')\n",
|
||||
"\n",
|
||||
"def model_choices(model):\n",
|
||||
" if model in models:\n",
|
||||
" return models[model]\n",
|
||||
" else:\n",
|
||||
" raise ValueError(f\"Model {model} not found in models dictionary\")\n",
|
||||
"\n",
|
||||
"def get_model_api(model='MODEL_GPT'):\n",
|
||||
" if model == 'MODEL_GPT':\n",
|
||||
" return openai, model_choices(model)\n",
|
||||
" elif model == 'MODEL_LLAMA':\n",
|
||||
" return ollama_via_openai, model_choices(model)\n",
|
||||
" else:\n",
|
||||
" raise ValueError(f\"Model {model} not found in models dictionary\")\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"id": "a8d7923c-5f28-4c30-8556-342d7c8497c1",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# set up environment\n",
|
||||
"\n",
|
||||
"system_prompt = \"\"\" You are an AI assistant helping a user find information about a product. \n",
|
||||
"The user asks you a technical question about code, and you provide a response with code snippets and explanations.\"\"\"\n",
|
||||
"\n",
|
||||
"def stream_brochure(question, model):\n",
|
||||
" api, model_name = get_model_api(model)\n",
|
||||
" stream = api.chat.completions.create(\n",
|
||||
" model=model_name,\n",
|
||||
" messages=[\n",
|
||||
" {\"role\": \"system\", \"content\": system_prompt},\n",
|
||||
" {\"role\": \"user\", \"content\": question}\n",
|
||||
" ],\n",
|
||||
" stream=True\n",
|
||||
" )\n",
|
||||
" \n",
|
||||
" response = \"\"\n",
|
||||
" display_handle = display(Markdown(\"\"), display_id=True)\n",
|
||||
" for chunk in stream:\n",
|
||||
" response += chunk.choices[0].delta.content or ''\n",
|
||||
" response = response.replace(\"```\",\"\").replace(\"markdown\", \"\")\n",
|
||||
" update_display(Markdown(response), display_id=display_handle.display_id)\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"id": "3f0d0137-52b0-47a8-81a8-11a90a010798",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Here is the question; type over this to ask something new\n",
|
||||
"\n",
|
||||
"question = \"\"\"\n",
|
||||
"Please explain what this code does and why:\n",
|
||||
"yield from {book.get(\"author\") for book in books if book.get(\"author\")}\n",
|
||||
"\"\"\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "60ce7000-a4a5-4cce-a261-e75ef45063b4",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/markdown": [
|
||||
"**Understanding the Code Snippet**\n",
|
||||
"\n",
|
||||
"This Python code snippet uses a combination of built-in functions, dictionary iteration, and generator expressions to extract and yield author names from a list of `Book` objects.\n",
|
||||
"\n",
|
||||
"Here's a breakdown:\n",
|
||||
"\n",
|
||||
"1. **Dictionary Iteration**: The expression `for book in books if book.get(\"author\")`\n",
|
||||
" - Iterates over each element (`book`) in the container `books`.\n",
|
||||
" - Filters out elements whose `'author'` key does not have a value (i.e., `None`, `False`, or an empty string). This leaves only dictionaries with author information.\n",
|
||||
"\n",
|
||||
"2. **Dictionary Access**: The expression `{book.get(\"author\") for book in books if book.get(\"author\")}`\n",
|
||||
" - Uses dictionary membership testing to access only the values associated with the `'author'` key.\n",
|
||||
" - If the value is not found or is considered false, it's skipped in this particular case.\n",
|
||||
"\n",
|
||||
"3. **Generator Expression**: This generates an iterator that iterates over the filtered author names.\n",
|
||||
" - Yields each author name (i.e., a single `'name'` from the book dictionary) on demand.\n",
|
||||
" - Since these are generator expressions, they use memory less than equivalent Python lists and also create results on-demand.\n",
|
||||
"\n",
|
||||
"4. **`yield from`**: This statement takes the generator expression as an argument and uses it to generate a nested iterator structure.\n",
|
||||
" - It essentially \"decompresses\" the single level of nested iterator created by `list(iter(x))`, allowing for simpler use cases and potentially significant efficiency improvements for more complex structures where every value must be iterated, while in the latter case just the first item per iterable in the outer expression's sequence needs to actually be yielded into result stream.\n",
|
||||
" - By \"yielding\" a nested iterator (the generator expression), we can simplify code by avoiding repetitive structure like `for book, book_author in zip(iterating over), ...` or list creation.\n",
|
||||
"\n",
|
||||
"**Example Use Case**\n",
|
||||
"\n",
|
||||
"In this hypothetical example:\n",
|
||||
"\n",
|
||||
"# Example Book objects\n",
|
||||
"class Book:\n",
|
||||
" def __init__(self, author, title):\n",
|
||||
" self.author = author # str\n",
|
||||
" self.title = title\n",
|
||||
"\n",
|
||||
"books = [\n",
|
||||
" {\"author\": \"John Doe\", \"title\": f\"Book 1 by John Doe\"},\n",
|
||||
" {\"author\": None, \"title\": f\"Book 2 without Author\"},\n",
|
||||
" {\"author\": \"Jane Smith\", \"title\": f\"Book 3 by Jane Smith\"}\n",
|
||||
"]\n",
|
||||
"\n",
|
||||
"# The given expression to extract and yield author names\n",
|
||||
"for author in yield from {book.get(\"author\") for book in books if book.get(\"author\")}:\n",
|
||||
"\n",
|
||||
" print(author) \n",
|
||||
"\n",
|
||||
"In this code snippet, printing the extracted authors would output `John Doe`, `Jane Smith` (since only dictionaries with author information pass the filtering test).\n",
|
||||
"\n",
|
||||
"Please modify it like as you wish and use `yield from` along with dictionary iteration, list comprehension or generator expression if needed, and explain what purpose your version has."
|
||||
],
|
||||
"text/plain": [
|
||||
"<IPython.core.display.Markdown object>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Get the model of your choice (choices appeared below) to answer, with streaming \n",
|
||||
"\n",
|
||||
"\"\"\"models = {\n",
|
||||
" 'MODEL_GPT': 'gpt-4o-mini',\n",
|
||||
" 'MODEL_LLAMA': 'llama3.2'\n",
|
||||
"}\"\"\"\n",
|
||||
"\n",
|
||||
"stream_brochure(question,'MODEL_LLAMA')"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "llms",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.11"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
217
week1/community-contributions/week1_day1_chat_summarizer.ipynb
Normal file
@@ -0,0 +1,217 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "2ce61bb5-1d5b-43b8-b5bb-6aeae91c7574",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"from dotenv import load_dotenv\n",
|
||||
"from openai import OpenAI\n",
|
||||
"from IPython.display import Markdown, display"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "3399686d-5f14-4fb2-8939-fd2401be3007",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"MODEL = \"gpt-4o-mini\"\n",
|
||||
"SYSTEM_PROMPT_PATH = \"Chat_Summary_Data/System_Prompt.txt\"\n",
|
||||
"CHATS_PATH = \"Chat_Summary_Data/Chat_Examples/\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "d97b8374-a161-435c-8317-1d0ecaaa9b71",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"API key found and looks good so far!\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Load environment variables in a file called .env\n",
|
||||
"\n",
|
||||
"load_dotenv(override=True)\n",
|
||||
"api_key = os.getenv('OPENAI_API_KEY')\n",
|
||||
"\n",
|
||||
"# Check the key\n",
|
||||
"\n",
|
||||
"if not api_key:\n",
|
||||
" print(\"No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!\")\n",
|
||||
"elif not api_key.startswith(\"sk-proj-\"):\n",
|
||||
" print(\"An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook\")\n",
|
||||
"elif api_key.strip() != api_key:\n",
|
||||
" print(\"An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook\")\n",
|
||||
"else:\n",
|
||||
" print(\"API key found and looks good so far!\")\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "b3f4afb4-2e4a-4971-915e-a8634a17eda8",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"class ChatAI:\n",
|
||||
" def __init__(self, system_prompt_path=SYSTEM_PROMPT_PATH, model=MODEL):\n",
|
||||
" with open(system_prompt_path, \"r\") as file:\n",
|
||||
" self.system_prompt = file.read()\n",
|
||||
"\n",
|
||||
" self.openai = OpenAI()\n",
|
||||
" self.model = model\n",
|
||||
" \n",
|
||||
" @staticmethod\n",
|
||||
" def _get_user_prompt(chat_txt):\n",
|
||||
" with open(chat_txt, \"r\") as file:\n",
|
||||
" user_prompt_str = file.read()\n",
|
||||
" return user_prompt_str\n",
|
||||
" \n",
|
||||
" def generate(self, chat_txt):\n",
|
||||
" messages = [\n",
|
||||
" {\"role\": \"system\", \"content\": self.system_prompt},\n",
|
||||
" {\"role\": \"user\", \"content\": self._get_user_prompt(chat_txt)}\n",
|
||||
" ]\n",
|
||||
"\n",
|
||||
" response = self.openai.chat.completions.create(model=self.model, messages=messages)\n",
|
||||
" return response.choices[0].message.content"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "d243b582-66af-49f9-bcd1-e05a63e61c34",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"chat_ai = ChatAI()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "c764ace6-5a0f-4dd0-9454-0b8a093b97fc",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/markdown": [
|
||||
"# Chat1"
|
||||
],
|
||||
"text/plain": [
|
||||
"<IPython.core.display.Markdown object>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/markdown": [
|
||||
"- **Order:** 2 Medium Chicken BBQ Pizzas\n",
|
||||
"- **Cost:** 342 LE\n",
|
||||
"- **Experience:** Negative\n",
|
||||
" - **Summary:** The client expressed dissatisfaction with the pizza taste."
|
||||
],
|
||||
"text/plain": [
|
||||
"<IPython.core.display.Markdown object>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/markdown": [
|
||||
"# Chat2"
|
||||
],
|
||||
"text/plain": [
|
||||
"<IPython.core.display.Markdown object>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/markdown": [
|
||||
"- The client ordered: Nothing \n",
|
||||
"- Summary: The client did not place an order because the chicken ranch pizza was unavailable."
|
||||
],
|
||||
"text/plain": [
|
||||
"<IPython.core.display.Markdown object>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/markdown": [
|
||||
"# Chat3"
|
||||
],
|
||||
"text/plain": [
|
||||
"<IPython.core.display.Markdown object>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/markdown": [
|
||||
"- **Order**: Large pepperoni pizza and onion rings \n",
|
||||
"- **Total Cost**: 250 LE \n",
|
||||
"- **Experience**: Positive \n",
|
||||
" - The client enjoyed the pizza despite the delay in delivery."
|
||||
],
|
||||
"text/plain": [
|
||||
"<IPython.core.display.Markdown object>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"chats_txt = os.listdir(CHATS_PATH)\n",
|
||||
"for chat_file in chats_txt:\n",
|
||||
" markdown_heading = f\"# {chat_file[:-4]}\"\n",
|
||||
" display(Markdown(markdown_heading))\n",
|
||||
" display(Markdown(chat_ai.generate(CHATS_PATH+chat_file)))"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.11"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
@@ -427,7 +427,12 @@
"with: \n",
"`import httpx` \n",
"`openai = OpenAI(http_client=httpx.Client(verify=False))` \n",
"And if that works, you're in good shape. You'll just have to change the labs in the same way any time you hit this cert error.\n",
"And also please replace: \n",
"`requests.get(url, headers=headers)` \n",
"with: \n",
"`requests.get(url, headers=headers, verify=False)` \n",
"And if that works, you're in good shape. You'll just have to change the labs in the same way any time you hit this cert error. \n",
"This approach isn't OK for production code, but it's fine for our experiments. You may need to contact IT support to understand whether there are restrictions in your environment.\n",
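"\n",
"For example, after both changes the relevant lab lines might look like this (a minimal sketch, assuming the usual day 1 imports are already in place):\n",
"\n",
"```python\n",
"import httpx\n",
"from openai import OpenAI\n",
"\n",
"# OpenAI client with SSL verification disabled (experiments only - not for production)\n",
"openai = OpenAI(http_client=httpx.Client(verify=False))\n",
"\n",
"# ...and the website fetch, also without certificate verification\n",
"response = requests.get(url, headers=headers, verify=False)\n",
"```\n",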
"\n",
|
||||
"## If all else fails:\n",
|
||||
"\n",
|
||||
|
||||