LLM_Engineering_OLD/week1/community-contributions/day2_Ollama_Solution.ipynb
2025-08-02 12:03:11 +05:30

{
"cells": [
{
"cell_type": "markdown",
"id": "663695bd-d1f2-4acf-8669-02d9f75f1bf4",
"metadata": {},
"source": [
"# Day 2: Ollama Solution for Website Summarization\n",
"### Building and Deploying Website Summarization Tools with Ollama"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "112ef04a-136e-4e65-b94e-8674a64606ed",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"**Summary of Website**\n",
"=========================\n",
"### Overview\n",
"\n",
"The \"Home - Edward Donner\" website is a personal blog hosted by Edward Donner, the co-founder and CTO of Nebula.io. The website shares insights on his professional experiences with AI, particularly in applying LLMs (Large Language Models) to help people discover their potential.\n",
"\n",
"### Recent Developments\n",
"------------------------\n",
"\n",
"* **Courses**: Upcoming online courses, including \"Connecting my courses become an LLM expert and leader\" and \"The Complete Agentic AI Engineering Course\"\n",
"* **Events**:\n",
"\t+ May 28, 2025: Launch connecting his courses (LLM Expert and Leader)\n",
"\t+ May 18, 2025: 2025 AI Executive Briefing\n",
"\t+ April 21, 2025: AI Executive Briefing\n",
"\t+ January 23, 2025: The Complete Agentic AI Engineering Course\n",
"\n",
"### About the Creator\n",
"-------------------------\n",
"\n",
"* Edward Donner is the co-founder and CTO of Nebula.io.\n",
"* He has experience as a founder and CEO of an AI startup that was acquired in 2021.\n",
"* His interests include DJing, amateur electronic music production, and reading Hacker News.\n",
"\n",
"### Contact Information\n",
"-------------------------\n",
"\n",
"* Email: [ed @ edwarddonner . com](mailto:ed@edwarddonner.com)\n",
"* Website: www.edwarddonner.com\n",
"* Social media links:\n",
"\t+ LinkedIn\n",
"\t+ Twitter\n",
"\t+ Facebook"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
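"import requests\n",
"from bs4 import BeautifulSoup\n",
"from IPython.display import Markdown, display\n",
"from openai import OpenAI\n",
"\n",
"# Ollama serves an OpenAI-compatible API under /v1; the client requires an api_key, but Ollama ignores its value\n",
"OLLAMA_BASE_URL = \"http://localhost:11434/v1\"\n",
"MODEL = \"llama3.2\"\n",
"ollama_via_openai = OpenAI(base_url=OLLAMA_BASE_URL, api_key=\"ollama\")\n",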
"\n",
"headers = {\n",
"    \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36\"\n",
"}\n",
"\n",
"class Website:\n",
"\n",
"    def __init__(self, url):\n",
"        \"\"\"\n",
"        Create this Website object from the given url using the BeautifulSoup library\n",
"        \"\"\"\n",
"        self.url = url\n",
"        response = requests.get(url, headers=headers)\n",
"        soup = BeautifulSoup(response.content, 'html.parser')\n",
"        self.title = soup.title.string if soup.title else \"No title found\"\n",
"        for irrelevant in soup.body([\"script\", \"style\", \"img\", \"input\"]):\n",
"            irrelevant.decompose()\n",
"        self.text = soup.body.get_text(separator=\"\\n\", strip=True)\n",
"\n",
"\n",
"system_prompt = \"You are an assistant that analyzes the contents of a website \\\n",
"and provides a short summary, ignoring text that might be navigation related. \\\n",
"Respond in markdown.\"\n",
"\n",
"def user_prompt_for(website):\n",
"    user_prompt = f\"You are looking at a website titled {website.title}\"\n",
"    user_prompt += \"\\nThe contents of this website are as follows; \\\n",
"please provide a short summary of this website in markdown. \\\n",
"If it includes news or announcements, then summarize these too.\\n\\n\"\n",
"    user_prompt += website.text\n",
"    return user_prompt\n",
"\n",
"def messages_for(website):\n",
"    return [\n",
"        {\"role\": \"system\", \"content\": system_prompt},\n",
"        {\"role\": \"user\", \"content\": user_prompt_for(website)}\n",
"    ]\n",
"\n",
"url = \"https://sitemakerlab.com/\" \n",
"site = Website(url)\n",
"messages = messages_for(site)\n",
"\n",
"def summarize(url):\n",
"    website = Website(url)\n",
"    response = ollama_via_openai.chat.completions.create(\n",
"        model=MODEL,\n",
"        messages=messages_for(website)\n",
"    )\n",
"    return response.choices[0].message.content\n",
"\n",
"def display_summary(url):\n",
"    summary = summarize(url)\n",
"    display(Markdown(summary))\n",
"\n",
"display_summary(\"https://edwarddonner.com\")\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "566373e7-8612-4c39-a432-7795cb7d2e6c",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}