# 🥊 Summarization Battle: Ollama vs. OpenAI Judge

This mini-project pits multiple **local LLMs** (via [Ollama](https://ollama.ai)) against each other in a **web summarization contest**, with an **OpenAI model** serving as the impartial judge.

It automatically fetches web articles, summarizes them with several models, and evaluates the results on **coverage, faithfulness, clarity, and conciseness**.

---

## 🚀 Features

- **Fetch Articles** – Download and clean the text content of given URLs.
- **Summarize with Ollama** – Run multiple local models (e.g., `llama3.2`, `phi3`, `deepseek-r1`) via the Ollama API.
- **Judge with OpenAI** – Use `gpt-4o-mini` (or any other OpenAI model) to score the summaries.
- **Battle Results** – Collect JSON results with per-model scores, rationales, and winners.
- **Timeout Handling & Warmup** – Keep models loaded with `keep_alive` to avoid cold-start delays.
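The fetch-and-clean step can be sketched with `requests` and BeautifulSoup; the function names below are illustrative, not the actual `main.py` API:

```python
import requests
from bs4 import BeautifulSoup


def clean_html(html: str) -> str:
    """Strip markup and keep only paragraph text."""
    soup = BeautifulSoup(html, "html.parser")
    # Drop non-content elements before extracting text.
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    return "\n\n".join(p for p in paragraphs if p)


def fetch_article(url: str, timeout: int = 30) -> str:
    """Download a page and return its cleaned text."""
    resp = requests.get(url, timeout=timeout)
    resp.raise_for_status()
    return clean_html(resp.text)
```

Restricting extraction to `<p>` tags works well for Wikipedia pages, where the article body lives in paragraphs.
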

---

## 📂 Project Structure

```
.
├── urls.txt              # Dictionary of categories → URLs
├── battle_results.json   # Summarization + judging results
├── main.py               # Main script
├── requirements.txt      # Dependencies
└── README.md             # You are here
```

---
## ⚙️ Installation

1. **Clone the repo**:

   ```bash
   git clone https://github.com/khashayarbayati1/wikipedia-summarization-battle.git
   cd wikipedia-summarization-battle
   ```

2. **Install dependencies**:

   ```bash
   pip install -r requirements.txt
   ```

   Minimal `requirements.txt`:

   ```txt
   requests
   beautifulsoup4
   python-dotenv
   openai>=1.0.0
   httpx
   ```

3. **Install Ollama & models**:

   - [Install Ollama](https://ollama.ai/download) if not already installed.
   - Pull the models you want:

     ```bash
     ollama pull llama3.2:latest
     ollama pull deepseek-r1:1.5b
     ollama pull phi3:latest
     ```

4. **Set up the OpenAI API key**:

   Create a `.env` file with:

   ```env
   OPENAI_API_KEY=sk-proj-xxxx...
   ```

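As a reference for how the battle talks to the local models, here is a rough sketch of a non-streaming call to Ollama's `/api/generate` endpoint with `keep_alive` set; the prompt wording and helper names are illustrative:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, article: str, keep_alive: str = "10m") -> dict:
    """Build a non-streaming generate request; keep_alive keeps the model warm."""
    prompt = (
        "Summarize the following article in 4-6 sentences. "
        "Be faithful to the source and concise.\n\n" + article
    )
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": keep_alive,  # avoid cold-start delays between articles
    }


def summarize(model: str, article: str, timeout: int = 300) -> str:
    """Ask a local Ollama model for a summary and return its text."""
    resp = requests.post(OLLAMA_URL, json=build_payload(model, article), timeout=timeout)
    resp.raise_for_status()
    return resp.json()["response"]
```

A generous `timeout` matters here: small models answer in seconds, but a cold model may need much longer on its first request.
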
---
## ▶️ Usage

1. Put your URL dictionary in `urls.txt`, e.g.:

   ```python
   {
       "sports": "https://en.wikipedia.org/wiki/Sabermetrics",
       "politics": "https://en.wikipedia.org/wiki/Separation_of_powers",
       "history": "https://en.wikipedia.org/wiki/Industrial_Revolution"
   }
   ```

2. Run the script:

   ```bash
   python main.py
   ```

3. Results are:

   - written to `battle_results.json`
   - printed in the terminal
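The judging step might be sketched as follows; the prompt text, JSON shape, and helper names are illustrative, while the client call uses the `openai>=1.0` SDK:

```python
import json


def build_judge_prompt(article: str, summaries: dict) -> str:
    """Ask the judge to return strict JSON with one entry per model."""
    blocks = "\n\n".join(f"[{model}]\n{text}" for model, text in summaries.items())
    return (
        "Score each summary of the article below from 1-5 on coverage, "
        "faithfulness, clarity, and conciseness. Reply with JSON only: "
        '{"scores": {"<model>": {"score": <int>, "rationale": "<str>"}}, '
        '"winner": "<model>"}\n\n'
        f"ARTICLE:\n{article}\n\nSUMMARIES:\n{blocks}"
    )


def parse_judgment(raw: str) -> dict:
    """Parse the judge's reply, tolerating a ```json code fence."""
    raw = raw.strip().removeprefix("```json").removesuffix("```").strip()
    return json.loads(raw)


def judge(article: str, summaries: dict, model: str = "gpt-4o-mini") -> dict:
    # Imported here so the prompt/parse helpers work without the SDK installed.
    from dotenv import load_dotenv
    from openai import OpenAI

    load_dotenv()  # reads OPENAI_API_KEY from .env
    client = OpenAI()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": build_judge_prompt(article, summaries)}],
    )
    return parse_judgment(resp.choices[0].message.content)
```

Asking for JSON only and stripping stray code fences keeps the results machine-readable across different judge models.
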
---
## 🏆 Example Results

Sample output (excerpt):

```json
{
  "category": "sports",
  "url": "https://en.wikipedia.org/wiki/Sabermetrics",
  "scores": {
    "llama3.2:latest": { "score": 4, "rationale": "Covers the main points..." },
    "deepseek-r1:1.5b": { "score": 3, "rationale": "Some inaccuracies..." },
    "phi3:latest": { "score": 5, "rationale": "Concise, accurate, well-organized." }
  },
  "winner": "phi3:latest"
}
```

From the full run:

- 🥇 **`phi3:latest`** won in *Sports, History, Productivity*
- 🥇 **`deepseek-r1:1.5b`** won in *Politics, Technology*
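The `winner` field can be recomputed from the per-model scores with a simple max over the JSON shape shown above (a sketch; `main.py` may instead take the winner directly from the judge's reply):

```python
def pick_winner(scores: dict) -> str:
    """Return the model with the highest judge score (first listed wins ties)."""
    return max(scores, key=lambda model: scores[model]["score"])
```
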
---
## 💡 Ideas for Extension

- Add more Ollama models (e.g., `mistral`, `gemma`)
- Try different evaluation criteria (e.g., readability, length control)
- Visualize results with charts
- Benchmark runtime and token usage
---
## 📜 License

MIT License – free to use, modify, and share.