add markdown
@@ -2,16 +2,53 @@
"cells": [
{
"cell_type": "code",
"execution_count": null,
"execution_count": 9,
"id": "a767b6bc-65fe-42b2-988f-efd54125114f",
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/markdown": [
"```markdown\n",
"# Summary of \"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning\"\n",
"\n",
"## Overview\n",
"The paper introduces **DeepSeek-R1**, a first-generation reasoning model developed by DeepSeek-AI. The model is designed to enhance reasoning capabilities in large language models (LLMs) using reinforcement learning (RL). Two versions are presented:\n",
"- **DeepSeek-R1-Zero**: A model trained via large-scale RL without supervised fine-tuning (SFT), showcasing strong reasoning abilities but facing challenges like poor readability and language mixing.\n",
"- **DeepSeek-R1**: An improved version incorporating multi-stage training and cold-start data before RL, achieving performance comparable to OpenAI's models on reasoning tasks.\n",
"\n",
"## Key Contributions\n",
"- Open-sourcing of **DeepSeek-R1-Zero**, **DeepSeek-R1**, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama architectures.\n",
"- The models are made available to support the research community.\n",
"\n",
"## Community Engagement\n",
"- The paper has been widely discussed and recommended, with 216 upvotes and 45 models citing it.\n",
"- Additional resources, including a video review and articles, are available through external links provided by the community.\n",
"\n",
"## Related Research\n",
"The paper is part of a broader trend in enhancing LLMs' reasoning abilities, with related works such as:\n",
"- **Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization (2024)**\n",
"- **Offline Reinforcement Learning for LLM Multi-Step Reasoning (2024)**\n",
"- **Reasoning Language Models: A Blueprint (2025)**\n",
"\n",
"## Availability\n",
"- The paper and models are accessible on [GitHub](https://github.com/deepseek-ai/DeepSeek-R1) and the [arXiv page](https://arxiv.org/abs/2501.12948).\n",
"```"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import os\n",
"import requests\n",
"from dotenv import load_dotenv\n",
"from bs4 import BeautifulSoup\n",
"from IPython.display import Markdown, display\n",
"from IPython.display import Markdown, display, clear_output\n",
"from openai import OpenAI\n",
"import time\n",
"\n",
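The new `outputs` block is just the serialized result of rendering Markdown in a notebook: IPython's `display(Markdown(...))` emits a `display_data` output carrying a `text/markdown` payload plus the `text/plain` fallback (`<IPython.core.display.Markdown object>`), and nbformat stores that in the cell as shown above. A minimal sketch of the call that produces such an entry, with an illustrative heading string:

```python
# Minimal sketch: rendering Markdown in a notebook cell. When the notebook is
# saved, this call is recorded as a "display_data" output whose data dict holds
# the raw text under "text/markdown" and the repr under "text/plain".
from IPython.display import Markdown, display

display(Markdown("# Summary of \"DeepSeek-R1: ...\""))  # heading text is a placeholder
```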
@@ -83,9 +120,11 @@
" for chunk in response:\n",
" if chunk.choices[0].delta.content: # Check if there's content in the chunk\n",
" accumulated_content += chunk.choices[0].delta.content # Append the chunk to the accumulated content\n",
" clear_output(wait=True) # Clear the previous output\n",
" display(Markdown(accumulated_content)) # Display the updated content\n",
" \n",
" # Display the accumulated content as a single Markdown block\n",
" display(Markdown(accumulated_content))\n",
" # # Final display (optional, as the loop already displays the content)\n",
" # display(Markdown(accumulated_content))\n",
"\n",
"def display_summary():\n",
" url = str(input(\"Enter the URL of the website you want to summarize: \"))\n",