Added DeepSeek to weeks 1, 2 and 8

Edward Donner
2025-01-28 12:23:46 -05:00
parent 8cb97665af
commit 7d6d9959df
9 changed files with 298 additions and 9 deletions

@@ -203,6 +203,46 @@
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"id": "bc7d1de3-e2ac-46ff-a302-3b4ba38c4c90",
"metadata": {},
"source": [
"## Also trying the amazing reasoning model DeepSeek\n",
"\n",
"Here we use the version of DeepSeek-reasoner that's been distilled to 1.5B. \n",
"This is actually a 1.5B variant of Qwen that has been fine-tuned using synthetic data generated by DeepSeek R1.\n",
"\n",
"Other sizes of DeepSeek are listed [here](https://ollama.com/library/deepseek-r1), all the way up to the full 671B parameter version, which would use up 404GB of your drive and is far too large for most machines!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cf9eb44e-fe5b-47aa-b719-0bb63669ab3d",
"metadata": {},
"outputs": [],
"source": [
"!ollama pull deepseek-r1:1.5b"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1d3d554b-e00d-4c08-9300-45e073950a76",
"metadata": {},
"outputs": [],
"source": [
"# This may take a few minutes to run! You should then see a fascinating \"thinking\" trace inside <think> tags, followed by some decent definitions\n",
"\n",
"response = ollama_via_openai.chat.completions.create(\n",
"    model=\"deepseek-r1:1.5b\",\n",
"    messages=[{\"role\": \"user\", \"content\": \"Please give definitions of some core concepts behind LLMs: a neural network, attention and the transformer\"}]\n",
")\n",
"\n",
"print(response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"id": "1622d9bb-5c68-4d4e-9ca4-b492c751f898",