{
"cells": [
{
"cell_type": "markdown",
"id": "d12b9c22",
"metadata": {},
"source": [
"# Song Lyrics → One-Sentence Summary\n",
"Get the lyrics of a song and summarize its main idea in about one sentence.\n",
"\n",
"## Setup\n",
"Import the required libraries: `os` for environment variables, the IPython display helpers, the OpenAI client, BeautifulSoup, and requests."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d94bbd61",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from IPython.display import Markdown, display\n",
"from openai import OpenAI\n",
"from bs4 import BeautifulSoup\n",
"import requests"
]
},
{
"cell_type": "markdown",
"id": "92dc1bde",
"metadata": {},
"source": [
"## Function: Get Lyrics from Genius\n",
"Fetch and extract the lyrics from a Genius.com song page using BeautifulSoup."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2b43fa98",
"metadata": {},
"outputs": [],
"source": [
"def get_lyrics_from_genius(url: str) -> str:\n",
"    \"\"\"\n",
"    Extracts song lyrics from a Genius.com song URL using BeautifulSoup.\n",
"    Example URL: https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
"    \"\"\"\n",
"    # Send a browser-like User-Agent header with the request\n",
"    headers = {\n",
"        \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36\"\n",
"    }\n",
"\n",
"    # The timeout keeps the call from hanging indefinitely on a stalled connection\n",
"    response = requests.get(url, headers=headers, timeout=10)\n",
"    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses\n",
"\n",
"    soup = BeautifulSoup(response.text, \"html.parser\")\n",
"\n",
"    # Genius stores lyrics inside <div data-lyrics-container=\"true\">\n",
"    lyrics_blocks = soup.find_all(\"div\", {\"data-lyrics-container\": \"true\"})\n",
"\n",
"    if not lyrics_blocks:\n",
"        return \"Lyrics not found.\"\n",
"\n",
"    # Join the text of all blocks and trim leading/trailing whitespace\n",
"    lyrics = \"\\n\".join(block.get_text(separator=\"\\n\") for block in lyrics_blocks)\n",
"    return lyrics.strip()"
]
},
{
"cell_type": "markdown",
"id": "fc4f0590",
"metadata": {},
"source": [
"## Function: Create Genius URL\n",
"Build a Genius.com lyrics URL automatically from the given artist and song name."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e018c623",
"metadata": {},
"outputs": [],
"source": [
"def create_genius_url(artist: str, song: str) -> str:\n",
"    \"\"\"\n",
"    Creates a Genius.com lyrics URL from artist and song name.\n",
"    Example:\n",
"        create_genius_url(\"Ed sheeran\", \"shape of you\")\n",
"        → https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
"    \"\"\"\n",
"    artist = artist.strip().replace(\" \", \"-\")\n",
"    song = song.strip().replace(\" \", \"-\")\n",
"    return f\"https://genius.com/{artist}-{song}-lyrics\"\n"
]
},
{
"cell_type": "markdown",
"id": "62f50f02",
"metadata": {},
"source": [
"## Generate URL and Fetch Lyrics\n",
"Create the Genius URL from the artist and song name, then fetch and display the lyrics."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ed51d48d",
"metadata": {},
"outputs": [],
"source": [
"artist = \"Ed sheeran\"\n",
"song = \"shape of you\"\n",
"\n",
"url = create_genius_url(artist, song)\n",
"print(url)\n",
"# Output: https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
"\n",
"# The fetched lyrics are used as the user prompt; preview the first 5000 characters\n",
"user_prompt = get_lyrics_from_genius(url)\n",
"print(user_prompt[:5000])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fca4203a",
"metadata": {},
"outputs": [],
"source": [
"system_prompt = \"\"\"\n",
"You are a **helpful assistant** that specializes in analyzing **song lyrics**.\n",
"\n",
"## Task\n",
"Your goal is to **summarize the main idea or theme of a song** in **about one sentence**.\n",
"\n",
"## Instructions\n",
"1. Read the given song lyrics carefully.\n",
"2. Identify the **core message**, **emotion**, or **story** of the song.\n",
"3. Respond with **one concise sentence** only.\n",
"4. The tone of your summary should reflect the song's mood (e.g., joyful, melancholic, romantic, rebellious).\n",
"\n",
"## Edge Cases\n",
"- **Very short lyrics:** Summarize the implied meaning.\n",
"- **Repetitive lyrics:** Focus on the message or emotion being emphasized.\n",
"- **Abstract or nonsensical lyrics:** Describe the overall feeling or imagery they create.\n",
"- **No lyrics or only a title provided:** Reply with \n",
" `No lyrics provided — unable to summarize meaningfully.`\n",
"- **Non-English lyrics:** Summarize in English unless otherwise instructed.\n",
"\n",
"## Output Format\n",
"Plain text — a single, coherent sentence summarizing the main idea of the song.\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"id": "11784d62",
"metadata": {},
"source": [
"## Create Chat Messages\n",
"Prepare the system and user messages, then send them to the OpenAI model for summarization."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f1205658",
"metadata": {},
"outputs": [],
"source": [
"messages = [\n",
"    {\"role\": \"system\", \"content\": system_prompt},\n",
"    {\"role\": \"user\", \"content\": user_prompt}\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5c8d61aa",
"metadata": {},
"outputs": [],
"source": [
"openai = OpenAI()\n",
"response = openai.chat.completions.create(\n",
"    model=\"gpt-4.1-mini\",\n",
"    messages=messages\n",
")"
]
},
{
"cell_type": "markdown",
"id": "4ad95820",
"metadata": {},
"source": [
"## Display Summary\n",
"Show the model's one-sentence summary of the song lyrics as formatted Markdown output."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4f09a642",
"metadata": {},
"outputs": [],
"source": [
"display(Markdown(response.choices[0].message.content))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}