# 🥊 Summarization Battle: Ollama vs. OpenAI Judge

This mini-project pits multiple **local LLMs** (via [Ollama](https://ollama.ai)) against each other in a **web summarization contest**, with an **OpenAI model** serving as the impartial judge.  
It automatically fetches web articles, summarizes them with several models, and evaluates the results on **coverage, faithfulness, clarity, and conciseness**.

---

## 🚀 Features
- **Fetch Articles** – Download and clean text content from given URLs.
- **Summarize with Ollama** – Run multiple local models (e.g., `llama3.2`, `phi3`, `deepseek-r1`) via the Ollama API.
- **Judge with OpenAI** – Use `gpt-4o-mini` (or any other OpenAI model) to score summaries.
- **Battle Results** – Collect JSON results with per-model scores, rationales, and winners.
- **Timeout Handling & Warmup** – Warm up models and keep them loaded with `keep_alive` to avoid cold-start delays and timeouts.
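
As an illustration of the summarization and `keep_alive` features above, here is a minimal sketch of a non-streaming call to a local Ollama server. It is not taken from `main.py`: the function names, the prompt wording, the `10m` keep-alive value, and the 120-second timeout are assumptions, and it uses the standard library's `urllib` rather than `requests` or `httpx` so it runs without extra dependencies. The endpoint is Ollama's default local address.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_summary_request(model: str, article_text: str, keep_alive: str = "10m") -> dict:
    """Build the JSON body for a non-streaming Ollama generate call.

    The prompt wording and the default keep_alive value are illustrative
    assumptions, not the project's actual settings.
    """
    return {
        "model": model,
        "prompt": f"Summarize the following article in one short paragraph:\n\n{article_text}",
        "stream": False,           # return one JSON object instead of a stream
        "keep_alive": keep_alive,  # keep the model loaded between requests
    }


def summarize(model: str, article_text: str, timeout: float = 120.0) -> str:
    """Send the request to Ollama and return the generated summary text."""
    payload = json.dumps(build_summary_request(model, article_text)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `summarize("phi3:latest", text)` requires the Ollama server to be running and the model to be pulled; a slow first response usually means the model is still loading, which is what `keep_alive` helps avoid on later calls.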

---

## 📂 Project Structure
```
.
├── urls.txt              # Dictionary of categories → URLs
├── battle_results.json   # Summarization + judging results
├── main.py               # Main script
├── requirements.txt      # Dependencies
└── README.md             # You are here
```

---

## ⚙️ Installation

1. **Clone the repo**:
   ```bash
   git clone https://github.com/khashayarbayati1/wikipedia-summarization-battle.git
   cd wikipedia-summarization-battle
   ```

2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

   Minimal requirements:
   ```txt
   requests
   beautifulsoup4
   python-dotenv
   openai>=1.0.0
   httpx
   ```

3. **Install Ollama & models**:
   - [Install Ollama](https://ollama.ai/download) if not already installed.
   - Pull the models you want:
     ```bash
     ollama pull llama3.2:latest
     ollama pull deepseek-r1:1.5b
     ollama pull phi3:latest
     ```

4. **Set up OpenAI API key**:
   Create a `.env` file with:
   ```env
   OPENAI_API_KEY=sk-proj-xxxx...
   ```

---

## ▶️ Usage

1. Put your URL dictionary in `urls.txt`, e.g.:
   ```python
   {
     "sports": "https://en.wikipedia.org/wiki/Sabermetrics",
     "Politics": "https://en.wikipedia.org/wiki/Separation_of_powers",
     "History": "https://en.wikipedia.org/wiki/Industrial_Revolution"
   }
   ```

2. Run the script:
   ```bash
   python main.py
   ```

3. Results are printed in the terminal and written to `battle_results.json`.
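
For reference, the dictionary in `urls.txt` can be read with a few lines of standard-library Python. This is a sketch under assumptions rather than the loader in `main.py`: `parse_url_dict` and `load_urls` are hypothetical names, and `ast.literal_eval` is used because it accepts both Python-style and JSON-style string dictionaries without executing arbitrary code.

```python
import ast
from pathlib import Path


def parse_url_dict(text: str) -> dict:
    """Parse a category -> URL dictionary literal and validate its shape."""
    urls = ast.literal_eval(text)  # safe: evaluates literals only, never code
    if not isinstance(urls, dict):
        raise ValueError("urls.txt must contain a single dictionary")
    for category, url in urls.items():
        if not isinstance(category, str) or not isinstance(url, str):
            raise ValueError(f"category and URL must both be strings: {category!r}")
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an HTTP(S) URL for {category!r}: {url!r}")
    return urls


def load_urls(path: str = "urls.txt") -> dict:
    """Read the file at `path` and return the validated dictionary."""
    return parse_url_dict(Path(path).read_text(encoding="utf-8"))
```

Validating up front means a malformed entry fails before any model is called, instead of partway through a long run.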

---

## 🏆 Example Results

Sample output (excerpt):

```json
{
  "category": "sports",
  "url": "https://en.wikipedia.org/wiki/Sabermetrics",
  "scores": {
    "llama3.2:latest": { "score": 4, "rationale": "Covers the main points..." },
    "deepseek-r1:1.5b": { "score": 3, "rationale": "Some inaccuracies..." },
    "phi3:latest": { "score": 5, "rationale": "Concise, accurate, well-organized." }
  },
  "winner": "phi3:latest"
}
```
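
A record shaped like the excerpt above can be checked with a short helper. The sketch below assumes only the fields shown in the excerpt (`scores`, `score`, `winner`); `pick_winner` and `winner_is_consistent` are hypothetical names, and the tie-breaking rule (first model listed wins) is an assumption, not necessarily what the judge or `main.py` does.

```python
def pick_winner(scores: dict) -> str:
    """Return the model with the highest score.

    On a tie, max() keeps the first model in dictionary order; this
    tie-breaking rule is an assumption made for the sketch.
    """
    return max(scores, key=lambda model: scores[model]["score"])


def winner_is_consistent(record: dict) -> bool:
    """Check that the recorded winner matches the highest-scoring model."""
    return record["winner"] == pick_winner(record["scores"])
```

Running `winner_is_consistent` over every record in `battle_results.json` is a cheap way to catch cases where the judge's stated winner disagrees with its own scores.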

From the full run:
- 🥇 **`phi3:latest`** won in *Sports, History, Productivity*
- 🥇 **`deepseek-r1:1.5b`** won in *Politics, Technology*

---

## 💡 Ideas for Extension
- Add more Ollama models (e.g., `mistral`, `gemma`)
- Try different evaluation criteria (e.g., readability, length control)
- Visualize results with charts
- Benchmark runtime and token usage

---

## 📜 License
MIT License – free to use, modify, and share.