Merge pull request #831 from KfirTayar/kfir/song-meaning-summarizer
Week 1: Add song meaning summarizer notebook (cleared outputs)
@@ -0,0 +1,235 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d12b9c22",
   "metadata": {},
   "source": [
    "# Song Lyrics → One-Sentence Summary\n",
    "Get the lyrics of a song and summarize its main idea in about one sentence.\n",
    "\n",
    "## Setup\n",
    "Import required libraries: environment vars, display helper, OpenAI client, BeautifulSoup, and requests."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d94bbd61",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from IPython.display import Markdown, display\n",
    "from openai import OpenAI\n",
    "from bs4 import BeautifulSoup\n",
    "import requests"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "92dc1bde",
   "metadata": {},
   "source": [
    "## Function: Get Lyrics from Genius\n",
    "Fetch and extract the lyrics from a Genius.com song page using BeautifulSoup."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2b43fa98",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_lyrics_from_genius(url: str) -> str:\n",
    "    \"\"\"\n",
    "    Extracts song lyrics from a Genius.com song URL using BeautifulSoup.\n",
    "    Example URL: https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
    "    \"\"\"\n",
    "    # Send a browser-like User-Agent header with the request\n",
    "    headers = {\n",
    "        \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36\"\n",
    "    }\n",
    "\n",
    "    response = requests.get(url, headers=headers, timeout=10)\n",
    "    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses\n",
    "\n",
    "    soup = BeautifulSoup(response.text, \"html.parser\")\n",
    "\n",
    "    # Genius stores lyrics inside <div data-lyrics-container=\"true\">\n",
    "    lyrics_blocks = soup.find_all(\"div\", {\"data-lyrics-container\": \"true\"})\n",
    "\n",
    "    if not lyrics_blocks:\n",
    "        return \"Lyrics not found.\"\n",
    "\n",
    "    # Join the text of all blocks and trim surrounding whitespace\n",
    "    lyrics = \"\\n\".join(block.get_text(separator=\"\\n\") for block in lyrics_blocks)\n",
    "    return lyrics.strip()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc4f0590",
   "metadata": {},
   "source": [
    "## Function: Create Genius URL\n",
    "Build a Genius.com lyrics URL automatically from the given artist and song name."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e018c623",
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_genius_url(artist: str, song: str) -> str:\n",
    "    \"\"\"\n",
    "    Creates a Genius.com lyrics URL from artist and song name.\n",
    "    Example:\n",
    "        create_genius_url(\"Ed sheeran\", \"shape of you\")\n",
    "        → https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
    "    \"\"\"\n",
    "    artist = artist.strip().replace(\" \", \"-\")\n",
    "    song = song.strip().replace(\" \", \"-\")\n",
    "    return f\"https://genius.com/{artist}-{song}-lyrics\"\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "62f50f02",
   "metadata": {},
   "source": [
    "## Generate URL and Fetch Lyrics\n",
    "Create the Genius URL from the artist and song name, then fetch and display the lyrics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ed51d48d",
   "metadata": {},
   "outputs": [],
   "source": [
    "artist = \"Ed sheeran\"\n",
    "song = \"shape of you\"\n",
    "\n",
    "url = create_genius_url(artist, song)\n",
    "print(url)\n",
    "# Expected output: https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
    "\n",
    "user_prompt = get_lyrics_from_genius(url)\n",
    "print(user_prompt[:5000])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fca4203a",
   "metadata": {},
   "outputs": [],
   "source": [
    "system_prompt = \"\"\"\n",
    "You are a **helpful assistant** that specializes in analyzing **song lyrics**.\n",
    "\n",
    "## Task\n",
    "Your goal is to **summarize the main idea or theme of a song** in **about one sentence**.\n",
    "\n",
    "## Instructions\n",
    "1. Read the given song lyrics carefully.\n",
    "2. Identify the **core message**, **emotion**, or **story** of the song.\n",
    "3. Respond with **one concise sentence** only.\n",
    "4. The tone of your summary should reflect the song’s mood (e.g., joyful, melancholic, romantic, rebellious).\n",
    "\n",
    "## Edge Cases\n",
    "- **Very short lyrics:** Summarize the implied meaning.\n",
    "- **Repetitive lyrics:** Focus on the message or emotion being emphasized.\n",
    "- **Abstract or nonsensical lyrics:** Describe the overall feeling or imagery they create.\n",
    "- **No lyrics or only a title provided:** Reply with \n",
" `No lyrics provided — unable to summarize meaningfully.`\n",
    "- **Non-English lyrics:** Summarize in English unless otherwise instructed.\n",
    "\n",
    "## Output Format\n",
    "Plain text: a single, coherent sentence summarizing the main idea of the song.\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11784d62",
   "metadata": {},
   "source": [
    "## Create Chat Messages\n",
    "Prepare the system and user messages, then send them to the OpenAI model for summarization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f1205658",
   "metadata": {},
   "outputs": [],
   "source": [
    "messages = [\n",
    "    {\"role\": \"system\", \"content\": system_prompt},\n",
    "    {\"role\": \"user\", \"content\": user_prompt}\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c8d61aa",
   "metadata": {},
   "outputs": [],
   "source": [
    "openai = OpenAI()\n",
    "response = openai.chat.completions.create(\n",
    "    model=\"gpt-4.1-mini\",\n",
    "    messages=messages\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ad95820",
   "metadata": {},
   "source": [
    "## Display Summary\n",
    "Show the model’s one-sentence summary of the song lyrics in a formatted Markdown output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f09a642",
   "metadata": {},
   "outputs": [],
   "source": [
    "display(Markdown(response.choices[0].message.content))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
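
A note on `create_genius_url` in this notebook: it only replaces spaces with hyphens, so an artist or song name containing apostrophes, accents, or other punctuation ends up in the URL unchanged. The sketch below shows one possible hardening and is not part of this PR. `slugify` and `create_genius_url_safe` are hypothetical names, and the assumption that a Genius slug consists only of ASCII letters, digits, and hyphens has not been checked against the site.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Hypothetical helper: reduce text to ASCII letters and digits joined by hyphens."""
    # Decompose accented characters and drop anything that is not ASCII
    # (this also removes typographic apostrophes).
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Drop straight apostrophes so "Don't" becomes "Dont" rather than "Don-t".
    ascii_text = ascii_text.replace("'", "")
    # Collapse every remaining run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^A-Za-z0-9]+", "-", ascii_text).strip("-")


def create_genius_url_safe(artist: str, song: str) -> str:
    """Variant of create_genius_url that slugifies both parts first (assumed slug rules)."""
    return f"https://genius.com/{slugify(artist)}-{slugify(song)}-lyrics"
```

For the notebook's own example the result is unchanged: `create_genius_url_safe("Ed sheeran", "shape of you")` returns `https://genius.com/Ed-sheeran-shape-of-you-lyrics`.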