{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d12b9c22",
   "metadata": {},
   "source": [
    "# Song Lyrics → One-Sentence Summary\n",
    "Fetch the lyrics of a song from Genius.com and summarize its main idea in about one sentence.\n",
    "\n",
    "## Setup\n",
    "Import the required libraries: `os` for environment variables, the IPython display helpers, the OpenAI client, BeautifulSoup, and `requests`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d94bbd61",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from IPython.display import Markdown, display\n",
    "from openai import OpenAI\n",
    "from bs4 import BeautifulSoup\n",
    "import requests"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "92dc1bde",
   "metadata": {},
   "source": [
    "## Function: Get Lyrics from Genius\n",
    "Fetch and extract the lyrics from a Genius.com song page using BeautifulSoup."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2b43fa98",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_lyrics_from_genius(url: str) -> str:\n",
    "    \"\"\"\n",
    "    Extracts song lyrics from a Genius.com song URL using BeautifulSoup.\n",
    "    Example URL: https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
    "    \"\"\"\n",
    "    # Send a browser-like User-Agent header with the request\n",
    "    headers = {\n",
    "        \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36\"\n",
    "    }\n",
    "\n",
    "    # The timeout keeps the call from hanging indefinitely on an unresponsive server\n",
    "    response = requests.get(url, headers=headers, timeout=10)\n",
    "    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses (e.g. page not found)\n",
    "\n",
    "    soup = BeautifulSoup(response.text, \"html.parser\")\n",
    "\n",
    "    # Genius stores lyrics inside <div data-lyrics-container=\"true\">\n",
    "    lyrics_blocks = soup.find_all(\"div\", {\"data-lyrics-container\": \"true\"})\n",
    "\n",
    "    if not lyrics_blocks:\n",
    "        return \"Lyrics not found.\"\n",
    "\n",
    "    # Join the text of all lyric blocks, one line per text node, and trim surrounding whitespace\n",
    "    lyrics = \"\\n\".join(block.get_text(separator=\"\\n\") for block in lyrics_blocks)\n",
    "    return lyrics.strip()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc4f0590",
   "metadata": {},
   "source": [
    "## Function: Create Genius URL\n",
    "Build a Genius.com lyrics URL automatically from the given artist and song name."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e018c623",
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_genius_url(artist: str, song: str) -> str:\n",
    "    \"\"\"\n",
    "    Creates a Genius.com lyrics URL from artist and song name.\n",
    "    Example:\n",
    "        create_genius_url(\"Ed sheeran\", \"shape of you\")\n",
    "        → https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
    "    \"\"\"\n",
    "    artist = artist.strip().replace(\" \", \"-\")\n",
    "    song = song.strip().replace(\" \", \"-\")\n",
    "    return f\"https://genius.com/{artist}-{song}-lyrics\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "62f50f02",
   "metadata": {},
   "source": [
    "## Generate URL and Fetch Lyrics\n",
    "Create the Genius URL from the artist and song name, then fetch and display the lyrics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ed51d48d",
   "metadata": {},
   "outputs": [],
   "source": [
    "artist = \"Ed sheeran\"\n",
    "song = \"shape of you\"\n",
    "\n",
    "url = create_genius_url(artist, song)\n",
    "print(url)\n",
    "# Output: https://genius.com/Ed-sheeran-shape-of-you-lyrics\n",
    "\n",
    "user_prompt = get_lyrics_from_genius(url)\n",
    "print(user_prompt[:5000])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fca4203a",
   "metadata": {},
   "outputs": [],
   "source": [
    "system_prompt = \"\"\"\n",
    "You are a **helpful assistant** that specializes in analyzing **song lyrics**.\n",
    "\n",
    "## Task\n",
    "Your goal is to **summarize the main idea or theme of a song** in **about one sentence**.\n",
    "\n",
    "## Instructions\n",
    "1. Read the given song lyrics carefully.\n",
    "2. Identify the **core message**, **emotion**, or **story** of the song.\n",
    "3. Respond with **one concise sentence** only.\n",
    "4. The tone of your summary should reflect the song’s mood (e.g., joyful, melancholic, romantic, rebellious).\n",
    "\n",
    "## Edge Cases\n",
    "- **Very short lyrics:** Summarize the implied meaning.\n",
    "- **Repetitive lyrics:** Focus on the message or emotion being emphasized.\n",
    "- **Abstract or nonsensical lyrics:** Describe the overall feeling or imagery they create.\n",
    "- **No lyrics or only a title provided:** Reply with \n",
    "  `No lyrics provided — unable to summarize meaningfully.`\n",
    "- **Non-English lyrics:** Summarize in English unless otherwise instructed.\n",
    "\n",
    "## Output Format\n",
    "Plain text — a single, coherent sentence summarizing the main idea of the song.\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11784d62",
   "metadata": {},
   "source": [
    "## Create Chat Messages\n",
    "Prepare the system and user messages, then send them to the OpenAI model for summarization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f1205658",
   "metadata": {},
   "outputs": [],
   "source": [
    "messages = [\n",
    "    {\"role\": \"system\", \"content\": system_prompt},\n",
    "    {\"role\": \"user\", \"content\": user_prompt}\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c8d61aa",
   "metadata": {},
   "outputs": [],
   "source": [
    "# OpenAI() reads the API key from the OPENAI_API_KEY environment variable by default\n",
    "client = OpenAI()\n",
    "response = client.chat.completions.create(\n",
    "    model=\"gpt-4.1-mini\",\n",
    "    messages=messages\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ad95820",
   "metadata": {},
   "source": [
    "## Display Summary\n",
    "Show the model’s one-sentence summary of the song lyrics in a formatted Markdown output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f09a642",
   "metadata": {},
   "outputs": [],
   "source": [
    "display(Markdown(response.choices[0].message.content))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}