add markdown
@@ -2,16 +2,53 @@
|
|||||||
"cells": [
|
"cells": [
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": 9,
|
||||||
"id": "a767b6bc-65fe-42b2-988f-efd54125114f",
|
"id": "a767b6bc-65fe-42b2-988f-efd54125114f",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/markdown": [
|
||||||
|
"```markdown\n",
|
||||||
|
"# Summary of \"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning\"\n",
|
||||||
|
"\n",
|
||||||
|
"## Overview\n",
|
||||||
|
"The paper introduces **DeepSeek-R1**, a first-generation reasoning model developed by DeepSeek-AI. The model is designed to enhance reasoning capabilities in large language models (LLMs) using reinforcement learning (RL). Two versions are presented:\n",
|
||||||
|
"- **DeepSeek-R1-Zero**: A model trained via large-scale RL without supervised fine-tuning (SFT), showcasing strong reasoning abilities but facing challenges like poor readability and language mixing.\n",
|
||||||
|
"- **DeepSeek-R1**: An improved version incorporating multi-stage training and cold-start data before RL, achieving performance comparable to OpenAI's models on reasoning tasks.\n",
|
||||||
|
"\n",
|
||||||
|
"## Key Contributions\n",
|
||||||
|
"- Open-sourcing of **DeepSeek-R1-Zero**, **DeepSeek-R1**, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama architectures.\n",
|
||||||
|
"- The models are made available to support the research community.\n",
|
||||||
|
"\n",
|
||||||
|
"## Community Engagement\n",
|
||||||
|
"- The paper has been widely discussed and recommended, with 216 upvotes and 45 models citing it.\n",
|
||||||
|
"- Additional resources, including a video review and articles, are available through external links provided by the community.\n",
|
||||||
|
"\n",
|
||||||
|
"## Related Research\n",
|
||||||
|
"The paper is part of a broader trend in enhancing LLMs' reasoning abilities, with related works such as:\n",
|
||||||
|
"- **Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization (2024)**\n",
|
||||||
|
"- **Offline Reinforcement Learning for LLM Multi-Step Reasoning (2024)**\n",
|
||||||
|
"- **Reasoning Language Models: A Blueprint (2025)**\n",
|
||||||
|
"\n",
|
||||||
|
"## Availability\n",
|
||||||
|
"- The paper and models are accessible on [GitHub](https://github.com/deepseek-ai/DeepSeek-R1) and the [arXiv page](https://arxiv.org/abs/2501.12948).\n",
|
||||||
|
"```"
|
||||||
|
],
|
||||||
|
"text/plain": [
|
||||||
|
"<IPython.core.display.Markdown object>"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "display_data"
|
||||||
|
}
|
||||||
|
],
|
||||||
"source": [
|
"source": [
|
||||||
"import os\n",
|
"import os\n",
|
||||||
"import requests\n",
|
"import requests\n",
|
||||||
"from dotenv import load_dotenv\n",
|
"from dotenv import load_dotenv\n",
|
||||||
"from bs4 import BeautifulSoup\n",
|
"from bs4 import BeautifulSoup\n",
|
||||||
"from IPython.display import Markdown, display\n",
|
"from IPython.display import Markdown, display, clear_output\n",
|
||||||
"from openai import OpenAI\n",
|
"from openai import OpenAI\n",
|
||||||
"import time\n",
|
"import time\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -83,9 +120,11 @@
|
|||||||
" for chunk in response:\n",
|
" for chunk in response:\n",
|
||||||
" if chunk.choices[0].delta.content: # Check if there's content in the chunk\n",
|
" if chunk.choices[0].delta.content: # Check if there's content in the chunk\n",
|
||||||
" accumulated_content += chunk.choices[0].delta.content # Append the chunk to the accumulated content\n",
|
" accumulated_content += chunk.choices[0].delta.content # Append the chunk to the accumulated content\n",
|
||||||
|
" clear_output(wait=True) # Clear the previous output\n",
|
||||||
|
" display(Markdown(accumulated_content)) # Display the updated content\n",
|
||||||
" \n",
|
" \n",
|
||||||
" # Display the accumulated content as a single Markdown block\n",
|
" # # Final display (optional, as the loop already displays the content)\n",
|
||||||
" display(Markdown(accumulated_content))\n",
|
" # display(Markdown(accumulated_content))\n",
|
||||||
"\n",
|
"\n",
|
||||||
"def display_summary():\n",
|
"def display_summary():\n",
|
||||||
" url = str(input(\"Enter the URL of the website you want to summarize: \"))\n",
|
" url = str(input(\"Enter the URL of the website you want to summarize: \"))\n",
|
||||||
|
|||||||
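Note on the second hunk: the change moves rendering inside the streaming loop, so the notebook redraws the accumulated Markdown after every chunk instead of once at the end. A minimal sketch of that accumulate-and-redraw pattern follows, kept independent of IPython and the OpenAI client so it can run anywhere. The names `stream_to_display`, `pieces`, and `render` are hypothetical and do not appear in the notebook.

```python
from typing import Callable, Iterable, List, Optional


def stream_to_display(pieces: Iterable[Optional[str]],
                      render: Callable[[str], None]) -> str:
    """Accumulate streamed text and re-render the full text after each piece.

    `pieces` stands in for the per-chunk `delta.content` values (which may be
    None or empty); `render` stands in for "clear the output, then display".
    Both are assumptions made for illustration.
    """
    accumulated = ""
    for piece in pieces:
        if piece:  # skip None / empty deltas, as the notebook's `if` check does
            accumulated += piece
            render(accumulated)  # redraw everything received so far
    return accumulated


# Usage: record each rendered frame in a list instead of drawing it.
frames: List[str] = []
final = stream_to_display(["# Sum", None, "mary\n", "", "- point"], frames.append)
```

In a notebook, `render` could be a small function that calls `clear_output(wait=True)` and then `display(Markdown(text))`, which is what the added lines in the hunk do inline. Redrawing the whole accumulated string on each chunk keeps partially received Markdown renderable at the cost of repeated rendering work.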