Updates including more alternatives for PC audio generation

Edward Donner
2024-11-02 12:51:29 -04:00
parent 284336b3b9
commit bad513321d
4 changed files with 161 additions and 9 deletions

@@ -300,6 +300,8 @@
"\n",
"**For PC Users**\n",
"\n",
"Detailed instructions are [here](https://chatgpt.com/share/6724efee-6b0c-8012-ac5e-72e2e3885905), and summary instructions follow:\n",
"\n",
"1. Download FFmpeg from the official website: https://ffmpeg.org/download.html\n",
"\n",
"2. Extract the downloaded files to a location on your computer (e.g., `C:\\ffmpeg`)\n",
@@ -327,6 +329,30 @@
"Message me or email me at ed@edwarddonner.com with any problems!"
]
},
{
"cell_type": "markdown",
"id": "4cc90e80-c96e-4dd4-b9d6-386fe2b7e797",
"metadata": {},
"source": [
"## Check that ffmpeg is installed and accessible from this notebook\n",
"\n",
"Execute the next cell to see if you get a version number. (Putting an exclamation mark before something in Jupyter Lab tells it to run it as a terminal command rather than Python code.)\n",
"\n",
"If this doesn't work, you may need to save your work, shut down Jupyter Lab, and start it again from a new Terminal window (Mac) or Anaconda prompt (PC), remembering to activate the llms environment. This ensures that the new session picks up ffmpeg.\n",
"\n",
"And if that doesn't work, please contact me!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7b3be0fb-1d34-4693-ab6f-dbff190afcd7",
"metadata": {},
"outputs": [],
"source": [
"!ffmpeg -version"
]
},
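If you'd rather run the same check from Python, here is a minimal sketch using only the standard library. It looks for `ffmpeg` and `ffplay` (the player used in PC Variation 1) on your PATH and prints the first line of each tool's version output. The helper names `find_tool` and `first_output_line` are made up for this illustration and are not part of the notebook.

```python
import shutil
import subprocess

def find_tool(name):
    """Return the full path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)

def first_output_line(command):
    """Run a command and return the first line it prints (stdout, else stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    output = result.stdout or result.stderr
    lines = output.splitlines()
    return lines[0] if lines else ""

# Report on the two tools that the audio cells rely on
for tool in ("ffmpeg", "ffplay"):
    path = find_tool(tool)
    if path is None:
        print(f"{tool}: not found on PATH")
    else:
        print(f"{tool}: {path}")
        print("  " + first_output_line([path, "-version"]))
```

If either tool is reported as not found, restarting Jupyter Lab from a fresh terminal (as described above) is the first thing to try.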
{
"cell_type": "markdown",
"id": "d91d3f8f-e505-4e3c-a87c-9e42ed823db6",
@@ -336,7 +362,7 @@
"\n",
"This version should work fine for you. It might work for Windows users too, but you might get a Permissions error writing to a temp file. If so, see the next section!\n",
"\n",
"As always, if you have problems, please contact me! (You could also comment out the audio talker() in the later code if you're less interested in audio generation)"
"As always, if you have problems, please contact me! (You could also comment out the audio talker() in the later code if you're less interested in audio generation - "
]
},
{
@@ -380,7 +406,11 @@
"\n",
"## if you get a permissions error writing to a temp file, then this code should work instead.\n",
"\n",
"A collaboration between student Mark M. and Claude got this resolved!"
"A collaboration between students Mark M. and Patrick H. and Claude got this resolved!\n",
"\n",
"Below are 3 variations - hopefully one of them will work on your PC. If not, message me please!\n",
"\n",
"## PC Variation 1"
]
},
{
@@ -392,12 +422,16 @@
"source": [
"import os\n",
"import tempfile\n",
"import subprocess\n",
"from io import BytesIO\n",
"from pydub import AudioSegment\n",
"import time\n",
"\n",
"def play_audio(audio_segment):\n",
"    temp_dir = tempfile.gettempdir()\n",
"    temp_path = os.path.join(temp_dir, \"temp_audio.wav\")\n",
"    try:\n",
"        audio_segment.export(temp_path, format=\"wav\")\n",
"        time.sleep(3)  # Dominic found that this pause was needed. Try commenting it out to see whether your PC needs it\n",
"        subprocess.call([\n",
"            \"ffplay\",\n",
"            \"-nodisp\",\n",
@@ -419,19 +453,127 @@
"    )\n",
"    audio_stream = BytesIO(response.content)\n",
"    audio = AudioSegment.from_file(audio_stream, format=\"mp3\")\n",
"    play_audio(audio)"
"    play_audio(audio)\n",
"\n",
"talker(\"Well hi there\")"
]
},
{
"cell_type": "markdown",
"id": "96f90e35-f71e-468e-afea-07b98f74dbcf",
"metadata": {},
"source": [
"## PC Variation 2"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0dfb8ee9-e7dd-4615-8d69-2deb3fd44473",
"id": "8597c7f8-7b50-44ad-9b31-db12375cd57b",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from pydub import AudioSegment\n",
"from pydub.playback import play\n",
"from io import BytesIO\n",
"\n",
"def talker(message):\n",
"    # Set a custom directory for temporary files on Windows\n",
"    custom_temp_dir = os.path.expanduser(\"~/Documents/temp_audio\")\n",
"    os.environ['TEMP'] = custom_temp_dir  # You can also use 'TMP' if necessary\n",
"\n",
"    # Create the folder if it doesn't exist\n",
"    os.makedirs(custom_temp_dir, exist_ok=True)\n",
"\n",
"    response = openai.audio.speech.create(\n",
"        model=\"tts-1\",\n",
"        voice=\"onyx\",  # Also, try replacing onyx with alloy\n",
"        input=message\n",
"    )\n",
"\n",
"    audio_stream = BytesIO(response.content)\n",
"    audio = AudioSegment.from_file(audio_stream, format=\"mp3\")\n",
"\n",
"    play(audio)\n",
"\n",
"talker(\"Well hi there\")"
]
},
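One thing to watch with Variation 2: Python's `tempfile` module works out its default directory (from `TMPDIR`, `TEMP` or `TMP`) the first time it needs one, and then caches the answer in `tempfile.tempdir`. If something earlier in your session already used `tempfile`, changing `TEMP` afterwards may have no effect. The sketch below sets `tempfile.tempdir` directly instead. It assumes that the playback code creates its temporary WAV file through `tempfile`, and `use_temp_dir` is a hypothetical helper name, not something from the notebook.

```python
import os
import tempfile

def use_temp_dir(path):
    """Create `path` if needed and make it the default directory for the tempfile module."""
    os.makedirs(path, exist_ok=True)
    # Assigning tempfile.tempdir replaces any value cached from TMPDIR / TEMP / TMP
    tempfile.tempdir = path
    return tempfile.gettempdir()

# On a PC you might call this once before talker(), for example:
# use_temp_dir(os.path.expanduser("~/Documents/temp_audio"))
```

After calling it, any file created with `tempfile.NamedTemporaryFile()` and no explicit `dir` argument should land in the chosen folder.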
{
"cell_type": "markdown",
"id": "e821224c-b069-4f9b-9535-c15fdb0e411c",
"metadata": {},
"source": [
"## PC Variation 3\n",
"\n",
"### Let's try a completely different sound library\n",
"\n",
"First run the next cell to install a new library, then try the cell below it."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "69d3c0d9-afcc-49e3-b829-9c9869d8b472",
"metadata": {},
"outputs": [],
"source": [
"!pip install simpleaudio"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "28f9cc99-36b7-4554-b3f4-f2012f614a13",
"metadata": {},
"outputs": [],
"source": [
"from pydub import AudioSegment\n",
"from io import BytesIO\n",
"import tempfile\n",
"import os\n",
"import simpleaudio as sa\n",
"\n",
"def talker(message):\n",
"    response = openai.audio.speech.create(\n",
"        model=\"tts-1\",\n",
"        voice=\"onyx\",  # Also, try replacing onyx with alloy\n",
"        input=message\n",
"    )\n",
"\n",
"    audio_stream = BytesIO(response.content)\n",
"    audio = AudioSegment.from_file(audio_stream, format=\"mp3\")\n",
"\n",
"    # Create a temporary file in a folder where you have write permissions\n",
"    with tempfile.NamedTemporaryFile(suffix=\".wav\", delete=False, dir=os.path.expanduser(\"~/Documents\")) as temp_audio_file:\n",
"        temp_file_name = temp_audio_file.name\n",
"        audio.export(temp_file_name, format=\"wav\")\n",
"\n",
"    # Load and play audio using simpleaudio\n",
"    wave_obj = sa.WaveObject.from_wave_file(temp_file_name)\n",
"    play_obj = wave_obj.play()\n",
"    play_obj.wait_done()  # Wait for playback to finish\n",
"\n",
"    # Clean up the temporary file afterward\n",
"    os.remove(temp_file_name)\n",
" \n",
"talker(\"Well hi there\")"
]
},
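In Variation 3, the temporary WAV file is only removed if playback finishes without an error. If you find stray `.wav` files piling up in your Documents folder, a small context manager can guarantee the cleanup. This is a sketch using only the standard library; `temp_wav_path` is a hypothetical helper name, not part of the notebook.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def temp_wav_path(directory=None):
    """Yield the path of a fresh, empty .wav file and delete it on exit, even after an error."""
    fd, path = tempfile.mkstemp(suffix=".wav", dir=directory)
    os.close(fd)  # Close our own handle so other code can reopen the file by name
    try:
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)
```

Inside `talker`, you could then write `with temp_wav_path(os.path.expanduser("~/Documents")) as path:` and do the export and playback within that block, dropping the explicit `os.remove` call.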
{
"cell_type": "markdown",
"id": "7986176b-cd04-495f-a47f-e057b0e462ed",
"metadata": {},
"source": [
"## PC Users - if none of those 3 variations worked!\n",
"\n",
"Please get in touch with me. I'm sorry this is causing problems! We'll figure it out.\n",
"\n",
"Alternatively: playing audio from your PC isn't super-critical for this course, and you can feel free to focus on image generation and skip audio for now, or come back to it later."
]
},
{
"cell_type": "markdown",
"id": "1d48876d-c4fa-46a8-a04f-f9fadf61fb0d",
@@ -472,7 +614,10 @@
"    \n",
"    reply = response.choices[0].message.content\n",
"    history += [{\"role\":\"assistant\", \"content\":reply}]\n",
"\n",
"    # Comment out or delete the next line if you'd rather skip audio for now\n",
"    talker(reply)\n",
"    \n",
"    return history, image"
]
},