Merge pull request #831 from KfirTayar/kfir/song-meaning-summarizer

Week 1: Add song meaning summarizer notebook (cleared outputs)
This commit is contained in:
Ed Donner
2025-10-25 14:11:51 -04:00
committed by GitHub

View File

@@ -0,0 +1,235 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d12b9c22",
"metadata": {},
"source": [
"# Song Lyrics → One-Sentence Summary\n",
"Get the lyrics of a song and summarize its main idea in about one sentence.\n",
"\n",
"## Setup\n",
"Import required libraries: environment vars, display helper, OpenAI client, BeautifulSoup, and requests."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d94bbd61",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from IPython.display import Markdown, display\n",
"from openai import OpenAI\n",
"from bs4 import BeautifulSoup\n",
"import requests"
]
},
{
"cell_type": "markdown",
"id": "92dc1bde",
"metadata": {},
"source": [
"## Function: Get Lyrics from Genius\n",
"Fetch and extract the lyrics from a Genius.com song page using BeautifulSoup."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2b43fa98",
"metadata": {},
"outputs": [],
"source": [
"def get_lyrics_from_genius(url: str) -> str:\n",
" \"\"\"\n",
" Extracts song lyrics from a Genius.com song URL using BeautifulSoup.\n",
" Example URL: https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
" \"\"\"\n",
" # Standard headers to fetch a website\n",
" headers = {\n",
" \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36\"\n",
" }\n",
"\n",
" response = requests.get(url, headers=headers)\n",
" response.raise_for_status() # raises error if page not found\n",
"\n",
" soup = BeautifulSoup(response.text, \"html.parser\")\n",
"\n",
" # Genius stores lyrics inside <div data-lyrics-container=\"true\">\n",
" lyrics_blocks = soup.find_all(\"div\", {\"data-lyrics-container\": \"true\"})\n",
"\n",
" if not lyrics_blocks:\n",
" return \"Lyrics not found.\"\n",
"\n",
" # Join all text blocks and clean up spacing\n",
" lyrics = \"\\n\".join(block.get_text(separator=\"\\n\") for block in lyrics_blocks)\n",
" return lyrics.strip()"
]
},
{
"cell_type": "markdown",
"id": "fc4f0590",
"metadata": {},
"source": [
"## Function: Create Genius URL\n",
"Build a Genius.com lyrics URL automatically from the given artist and song name."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e018c623",
"metadata": {},
"outputs": [],
"source": [
"def create_genius_url(artist: str, song: str) -> str:\n",
" \"\"\"\n",
" Creates a Genius.com lyrics URL from artist and song name.\n",
" Example:\n",
" create_genius_url(\"Ed sheeran\", \"shape of you\")\n",
" → https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
" \"\"\"\n",
" artist = artist.strip().replace(\" \", \"-\")\n",
" song = song.strip().replace(\" \", \"-\")\n",
" return f\"https://genius.com/{artist}-{song}-lyrics\"\n"
]
},
{
"cell_type": "markdown",
"id": "62f50f02",
"metadata": {},
"source": [
"## Generate URL and Fetch Lyrics\n",
"Create the Genius URL from the artist and song name, then fetch and display the lyrics."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ed51d48d",
"metadata": {},
"outputs": [],
"source": [
"artist = \"Ed sheeran\"\n",
"song = \"shape of you\"\n",
"\n",
"url = create_genius_url(artist, song)\n",
"print(url)\n",
"# Output: https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
"\n",
"user_prompt = get_lyrics_from_genius(url)\n",
"print(user_prompt[:5000]) "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fca4203a",
"metadata": {},
"outputs": [],
"source": [
"system_prompt = \"\"\"\n",
"You are a **helpful assistant** that specializes in analyzing **song lyrics**.\n",
"\n",
"## Task\n",
"Your goal is to **summarize the main idea or theme of a song** in **about one sentence**.\n",
"\n",
"## Instructions\n",
"1. Read the given song lyrics carefully.\n",
"2. Identify the **core message**, **emotion**, or **story** of the song.\n",
"3. Respond with **one concise sentence** only.\n",
"4. The tone of your summary should reflect the songs mood (e.g., joyful, melancholic, romantic, rebellious).\n",
"\n",
"## Edge Cases\n",
"- **Very short lyrics:** Summarize the implied meaning.\n",
"- **Repetitive lyrics:** Focus on the message or emotion being emphasized.\n",
"- **Abstract or nonsensical lyrics:** Describe the overall feeling or imagery they create.\n",
"- **No lyrics or only a title provided:** Reply with \n",
" `No lyrics provided — unable to summarize meaningfully.`\n",
"- **Non-English lyrics:** Summarize in English unless otherwise instructed.\n",
"\n",
"## Output Format\n",
"Plain text — a single, coherent sentence summarizing the main idea of the song.\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"id": "11784d62",
"metadata": {},
"source": [
"## Create Chat Messages\n",
"Prepare the system and user messages, then send them to the OpenAI model for summarization."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f1205658",
"metadata": {},
"outputs": [],
"source": [
"messages = [\n",
" {\"role\": \"system\", \"content\": system_prompt},\n",
" {\"role\": \"user\", \"content\": user_prompt}\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5c8d61aa",
"metadata": {},
"outputs": [],
"source": [
"openai = OpenAI()\n",
"response = openai.chat.completions.create(\n",
" model = \"gpt-4.1-mini\",\n",
" messages = messages\n",
")"
]
},
{
"cell_type": "markdown",
"id": "4ad95820",
"metadata": {},
"source": [
"## Display Summary\n",
"Show the models one-sentence summary of the song lyrics in a formatted Markdown output."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4f09a642",
"metadata": {},
"outputs": [],
"source": [
"display(Markdown(response.choices[0].message.content))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}