Updates including more alternatives for PC audio generation

2024-11-02 12:51:29 -04:00
parent 284336b3b9
commit bad513321d
4 changed files with 161 additions and 9 deletions
--- a/week2/day5.ipynb
+++ b/week2/day5.ipynb
@@ -300,6 +300,8 @@
    "\n",
    "**For PC Users**\n",
    "\n",
+    "Detailed instructions are [here](https://chatgpt.com/share/6724efee-6b0c-8012-ac5e-72e2e3885905); a summary follows:\n",
+    "\n",
    "1. Download FFmpeg from the official website: https://ffmpeg.org/download.html\n",
    "\n",
    "2. Extract the downloaded files to a location on your computer (e.g., `C:\\ffmpeg`)\n",
@@ -327,6 +329,30 @@
    "Message me or email me at ed@edwarddonner.com with any problems!"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "4cc90e80-c96e-4dd4-b9d6-386fe2b7e797",
+   "metadata": {},
+   "source": [
+    "## Check that ffmpeg is installed and accessible from this notebook\n",
+    "\n",
+    "Execute the next cell to see if you get a version number. (Putting an exclamation mark before a command in Jupyter Lab tells it to run that command as a terminal command rather than as Python code.)\n",
+    "\n",
+    "If this doesn't work, you may need to save your work, shut down Jupyter Lab, and start it again from a new Terminal window (Mac) or Anaconda prompt (PC), remembering to activate the llms environment. This ensures you pick up ffmpeg.\n",
+    "\n",
+    "And if that doesn't work, please contact me!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7b3be0fb-1d34-4693-ab6f-dbff190afcd7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!ffmpeg -version"
+   ]
+  },
  {
   "cell_type": "markdown",
   "id": "d91d3f8f-e505-4e3c-a87c-9e42ed823db6",
@@ -336,7 +362,7 @@
    "\n",
    "This version should work fine for you. It might work for Windows users too, but you might get a Permissions error writing to a temp file. If so, see the next section!\n",
    "\n",
-    "As always, if you have problems, please contact me! (You could also comment out the audio talker() in the later code if you're less interested in audio generation)"
+    "As always, if you have problems, please contact me! (You could also comment out the audio talker() in the later code if you're less interested in audio generation - "
   ]
  },
  {
@@ -380,7 +406,11 @@
    "\n",
    "## if you get a permissions error writing to a temp file, then this code should work instead.\n",
    "\n",
-    "A collaboration between student Mark M. and Claude got this resolved!"
+    "A collaboration between students Mark M. and Patrick H. and Claude got this resolved!\n",
+    "\n",
+    "Below are 3 variations - hopefully one of them will work on your PC. If not, message me please!\n",
+    "\n",
+    "## PC Variation 1"
   ]
  },
  {
@@ -392,12 +422,16 @@
   "source": [
    "import tempfile\n",
    "import subprocess\n",
+    "from io import BytesIO\n",
+    "from pydub import AudioSegment\n",
+    "import time\n",
    "\n",
    "def play_audio(audio_segment):\n",
    "    temp_dir = tempfile.gettempdir()\n",
    "    temp_path = os.path.join(temp_dir, \"temp_audio.wav\")\n",
    "    try:\n",
    "        audio_segment.export(temp_path, format=\"wav\")\n",
+    "        time.sleep(3) # Dominic found that this pause was needed; try commenting it out to see whether your PC needs it\n",
    "        subprocess.call([\n",
    "            \"ffplay\",\n",
    "            \"-nodisp\",\n",
@@ -419,19 +453,127 @@
    "    )\n",
    "    audio_stream = BytesIO(response.content)\n",
    "    audio = AudioSegment.from_file(audio_stream, format=\"mp3\")\n",
-    "    play_audio(audio)"
+    "    play_audio(audio)\n",
+    "\n",
+    "talker(\"Well hi there\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "96f90e35-f71e-468e-afea-07b98f74dbcf",
+   "metadata": {},
+   "source": [
+    "## PC Variation 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
-   "id": "0dfb8ee9-e7dd-4615-8d69-2deb3fd44473",
+   "id": "8597c7f8-7b50-44ad-9b31-db12375cd57b",
   "metadata": {},
   "outputs": [],
   "source": [
+    "import os\n",
+    "from pydub import AudioSegment\n",
+    "from pydub.playback import play\n",
+    "from io import BytesIO\n",
+    "\n",
+    "def talker(message):\n",
+    "    # Set a custom directory for temporary files on Windows\n",
+    "    custom_temp_dir = os.path.expanduser(\"~/Documents/temp_audio\")\n",
+    "    os.environ['TEMP'] = custom_temp_dir  # You can also use 'TMP' if necessary\n",
+    "    \n",
+    "    # Create the folder if it doesn't exist\n",
+    "    if not os.path.exists(custom_temp_dir):\n",
+    "        os.makedirs(custom_temp_dir)\n",
+    "    \n",
+    "    response = openai.audio.speech.create(\n",
+    "        model=\"tts-1\",\n",
+    "        voice=\"onyx\",  # Also, try replacing onyx with alloy\n",
+    "        input=message\n",
+    "    )\n",
+    "    \n",
+    "    audio_stream = BytesIO(response.content)\n",
+    "    audio = AudioSegment.from_file(audio_stream, format=\"mp3\")\n",
+    "\n",
+    "    play(audio)\n",
+    "\n",
    "talker(\"Well hi there\")"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "e821224c-b069-4f9b-9535-c15fdb0e411c",
+   "metadata": {},
+   "source": [
+    "## PC Variation 3\n",
+    "\n",
+    "### Let's try a completely different sound library\n",
+    "\n",
+    "First run the next cell to install a new library, then try the cell below it."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "69d3c0d9-afcc-49e3-b829-9c9869d8b472",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!pip install simpleaudio"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "28f9cc99-36b7-4554-b3f4-f2012f614a13",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from pydub import AudioSegment\n",
+    "from io import BytesIO\n",
+    "import tempfile\n",
+    "import os\n",
+    "import simpleaudio as sa\n",
+    "\n",
+    "def talker(message):\n",
+    "    response = openai.audio.speech.create(\n",
+    "        model=\"tts-1\",\n",
+    "        voice=\"onyx\",  # Also, try replacing onyx with alloy\n",
+    "        input=message\n",
+    "    )\n",
+    "    \n",
+    "    audio_stream = BytesIO(response.content)\n",
+    "    audio = AudioSegment.from_file(audio_stream, format=\"mp3\")\n",
+    "\n",
+    "    # Create a temporary file in a folder where you have write permissions\n",
+    "    with tempfile.NamedTemporaryFile(suffix=\".wav\", delete=False, dir=os.path.expanduser(\"~/Documents\")) as temp_audio_file:\n",
+    "        temp_file_name = temp_audio_file.name\n",
+    "        audio.export(temp_file_name, format=\"wav\")\n",
+    "    \n",
+    "    # Load and play audio using simpleaudio\n",
+    "    wave_obj = sa.WaveObject.from_wave_file(temp_file_name)\n",
+    "    play_obj = wave_obj.play()\n",
+    "    play_obj.wait_done()  # Wait for playback to finish\n",
+    "\n",
+    "    # Clean up the temporary file afterward\n",
+    "    os.remove(temp_file_name)\n",
+    "    \n",
+    "talker(\"Well hi there\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7986176b-cd04-495f-a47f-e057b0e462ed",
+   "metadata": {},
+   "source": [
+    "## PC Users - if none of those 3 variations worked!\n",
+    "\n",
+    "Please get in touch with me. I'm sorry this is causing problems! We'll figure it out.\n",
+    "\n",
+    "Alternatively: playing audio from your PC isn't super-critical for this course, and you can feel free to focus on image generation and skip audio for now, or come back to it later."
+   ]
+  },
  {
   "cell_type": "markdown",
   "id": "1d48876d-c4fa-46a8-a04f-f9fadf61fb0d",
@@ -472,7 +614,10 @@
    "        \n",
    "    reply = response.choices[0].message.content\n",
    "    history += [{\"role\":\"assistant\", \"content\":reply}]\n",
+    "\n",
+    "    # Comment out or delete the next line if you'd rather skip audio for now\n",
    "    talker(reply)\n",
+    "    \n",
    "    return history, image"
   ]
  },
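
Note on the ffmpeg check added in this commit: the `!ffmpeg -version` cell could also be done from plain Python, which makes it easier to see whether the tool is missing from the PATH or present but failing. The sketch below is only an illustration and is not part of the notebook; the helper name `tool_version` is hypothetical, and it assumes that both `ffmpeg` and `ffplay` (used by PC Variation 1) accept the `-version` flag.

```python
import shutil
import subprocess


def tool_version(name):
    """Return the first line of `<name> -version`, or None if the tool is not on PATH.

    `tool_version` is a hypothetical helper for illustration only.
    """
    path = shutil.which(name)  # full path to the executable, or None if not found
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, timeout=10
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


# ffplay is checked too, since PC Variation 1 calls it through subprocess
for tool in ("ffmpeg", "ffplay"):
    version = tool_version(tool)
    print(f"{tool}: {version if version is not None else 'not found on PATH'}")
```

If a tool is reported as not found right after installing it, restarting Jupyter Lab from a new terminal, as the notebook text suggests, is the first thing to try.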