From 028419628e3914d0b2755d3d19ce881bd1eebeb8 Mon Sep 17 00:00:00 2001
From: Hope Ogbons
Date: Mon, 27 Oct 2025 06:17:55 +0100
Subject: [PATCH] Add README for Multi-Language Code Complexity Annotator,
 detailing features, installation, usage, and troubleshooting
---
 .../hopeogbons/README.md | 255 ++++++++++++++++++
 1 file changed, 255 insertions(+)
 create mode 100644 week4/community-contributions/hopeogbons/README.md

diff --git a/week4/community-contributions/hopeogbons/README.md b/week4/community-contributions/hopeogbons/README.md
new file mode 100644
index 0000000..8620c79
--- /dev/null
+++ b/week4/community-contributions/hopeogbons/README.md
@@ -0,0 +1,255 @@
# 🔶 Multi-Language Code Complexity Annotator

An automated tool that analyzes source code and annotates it with Big-O complexity estimates, complete with syntax highlighting and optional AI-powered code reviews.

## 🎯 What It Does

Understanding time complexity (Big-O notation) is crucial for writing efficient algorithms, identifying bottlenecks, making informed optimization decisions, and passing technical interviews.

Analyzing complexity manually is tedious and error-prone. This tool **automates** the entire process: it detects loops, recursion, and functions, then annotates the code with Big-O estimates and explanations.

### Core Features

- 📊 **Automatic Detection** - Identifies loops, recursion, and functions across 13 programming languages
- 🧮 **Complexity Estimation** - Estimates Big-O complexity (O(1), O(n), O(n²), O(log n), etc.)
- 💬 **Inline Annotations** - Inserts explanatory comments directly into your code
- 🎨 **Syntax Highlighting** - Generates clean HTML previews with orange-colored complexity comments
- 🤖 **AI Code Review** - Optional LLaMA-powered analysis with optimization suggestions
- 💾 **Export Options** - Download annotated source code and Markdown previews

## 🌐 Supported Languages

Python • JavaScript • TypeScript • Java • C • C++ • C# • Go • PHP • Swift • Ruby • Kotlin • Rust

## 🛠️ Tech Stack

- **HuggingFace Transformers** - LLM loading and inference
- **LLaMA 3.2** - AI-powered code review
- **Gradio** - Interactive web interface
- **Pygments** - Syntax highlighting
- **PyTorch** - Deep learning framework
- **Regex Analysis** - Heuristic complexity detection

## 📋 Prerequisites

- Python 3.12+
- `uv` package manager (or `pip`)
- 4GB+ RAM (for basic use without AI)
- 14GB+ RAM (for AI code review with 7B LLaMA models)
- Optional: NVIDIA GPU with CUDA (for model quantization)

## 🚀 Installation

### 1. Navigate to the `week4` Directory

```bash
cd week4
```

### 2. Install Dependencies

```bash
uv pip install -U pip
uv pip install transformers accelerate gradio torch --extra-index-url https://download.pytorch.org/whl/cpu
uv pip install bitsandbytes pygments python-dotenv
```

> **Note:** This installs the CPU-only build of PyTorch. For GPU support, remove the `--extra-index-url` flag.

### 3. Set Up HuggingFace Token (Optional - for AI Features)

Create a `.env` file in the `week4` directory:

```env
HF_TOKEN=hf_your_token_here
```

Get your token at: https://huggingface.co/settings/tokens

> **Required for:** LLaMA models (requires accepting Meta's license agreement)

## 💡 Usage

### Option 1: Jupyter Notebook

Open and run `week4 EXERCISE_hopeogbons.ipynb`:

```bash
jupyter notebook "week4 EXERCISE_hopeogbons.ipynb"
```

Run all cells in order. The Gradio interface will launch at `http://127.0.0.1:7861`.

### Option 2: Web Interface

Once the Gradio app is running:

#### **Without AI Review (No Model Needed)**

1. Upload a code file (.py, .js, .java, etc.)
2. Uncheck "Generate AI Code Review"
3. Click "🚀 Process & Annotate"
4. View syntax-highlighted code with Big-O annotations
5. Download the annotated source + Markdown

#### **With AI Review (Requires Model)**

1. Click "🔄 Load Model" (wait 2-5 minutes for the first download)
2. Upload your code file
3. Check "Generate AI Code Review"
4. Adjust temperature/tokens if needed
5. Click "🚀 Process & Annotate"
6. Read the AI-generated optimization suggestions

## 📊 How It Works

### Complexity Detection Algorithm

The tool uses **heuristic pattern matching** to estimate Big-O complexity (a simplified sketch follows this list):

1. **Detect Blocks** - Regex patterns find functions, loops, and recursion
2. **Analyze Loops** - Count nesting depth:
   - 1 loop = O(n)
   - 2 nested loops = O(n²)
   - 3 nested loops = O(n³)
3. **Analyze Recursion** - Pattern detection:
   - Divide-and-conquer (e.g., binary search) = O(log n)
   - Single recursive call = O(n)
   - Multiple recursive calls = O(2^n)
4. **Aggregate** - Functions inherit the worst-case complexity of their inner operations
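To make steps 1 and 2 concrete, here is a minimal sketch of the nesting-depth heuristic in Python. It is an illustration, not the notebook's actual implementation: the name `estimate_loop_complexity`, the single loop regex, and the indentation-based nesting logic (which only suits Python-style sources) are all simplifying assumptions.

```python
import re

# Illustrative pattern; the real tool uses per-language patterns.
LOOP_RE = re.compile(r"^\s*(for|while)\b")

def estimate_loop_complexity(source: str) -> str:
    """Estimate Big-O from the deepest loop nesting in Python-style code."""
    max_depth = 0
    open_loops = []  # indentation levels of loops we are currently inside
    for line in source.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        # Dedenting to or past a loop's indentation closes that loop.
        while open_loops and indent <= open_loops[-1]:
            open_loops.pop()
        if LOOP_RE.match(line):
            open_loops.append(indent)
            max_depth = max(max_depth, len(open_loops))
    if max_depth == 0:
        return "O(1)"
    return "O(n)" if max_depth == 1 else f"O(n^{max_depth})"

snippet = """\
def bubble_sort(arr):
    for i in range(len(arr)):
        for j in range(len(arr) - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
"""
print(estimate_loop_complexity(snippet))  # -> O(n^2)
```

The real annotator layers language-specific patterns (brace blocks, `end` keywords, recursion detection) on top of the same idea, per steps 1 and 3.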
### Example Output

**Input (Python):**

```python
def bubble_sort(arr):
    for i in range(len(arr)):
        for j in range(len(arr) - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
```

**Output (Annotated):**

```python
def bubble_sort(arr):
# Big-O: O(n^2)
# Explanation: Nested loops indicate quadratic time.
    for i in range(len(arr)):
        for j in range(len(arr) - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
```

## 🧠 AI Model Options

### CPU/Mac (No GPU)

- `meta-llama/Llama-3.2-1B` (default, ~1GB, requires HF approval)
- `gpt2` (no approval needed, ~500MB)
- `microsoft/DialoGPT-medium` (~1GB)

### GPU Users

- Any model with 8-bit or 4-bit quantization enabled
- `meta-llama/Llama-2-7b-chat-hf` (requires approval)

### Memory Requirements

- **Without quantization:** ~14GB RAM (7B models) or ~26GB (13B models)
- **With 8-bit quantization:** ~50% reduction (GPU required)
- **With 4-bit quantization:** ~75% reduction (GPU required)

## ⚙️ Configuration

### File Limits

- Max file size: **2 MB**
- Supported extensions: `.py`, `.js`, `.ts`, `.java`, `.c`, `.cpp`, `.cs`, `.go`, `.php`, `.swift`, `.rb`, `.kt`, `.rs`

### Model Parameters

- **Temperature** (0.0 to 1.5): Controls randomness
  - Lower = more deterministic
  - Higher = more creative
- **Max Tokens** (16 to 1024): Maximum length of the AI review

## 📁 Project Structure

```
week4/
├── week4 EXERCISE_hopeogbons.ipynb    # Main application notebook
├── .env                               # HuggingFace token (create this)
└── community-contributions/
    └── hopeogbons/
        └── README.md                  # This file
```

## 🐛 Troubleshooting

### Model Loading Issues

**Error:** "Model not found" or "Access denied"

- **Solution:** Accept Meta's license at https://huggingface.co/meta-llama/Llama-3.2-1B
- Ensure your `.env` file contains a valid `HF_TOKEN`

### Memory Issues

**Error:** "Out of memory" during model loading

- **Solution:** Use a smaller model such as `gpt2` or `microsoft/DialoGPT-medium`
- Try 8-bit or 4-bit quantization (GPU required)

### Quantization Requires GPU

**Error:** "Quantization requires CUDA"

- **Solution:** Disable both the 4-bit and 8-bit quantization checkboxes
- Run on CPU with smaller models (the sketch below shows why CUDA is needed)
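To make the CUDA requirement concrete, here is a minimal sketch of how 8-bit/4-bit loading is typically wired up with `transformers` and `bitsandbytes`, with a CPU fallback. This is illustrative rather than the notebook's exact code, and the model ID is simply the default listed earlier:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_ID = "meta-llama/Llama-3.2-1B"  # assumed default; any causal LM works

if torch.cuda.is_available():
    # bitsandbytes quantization kernels run only on CUDA devices,
    # which is why the checkboxes fail on CPU-only machines.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # or load_in_8bit=True
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, quantization_config=quant_config, device_map="auto"
    )
else:
    # CPU fallback: full precision, so prefer a small model such as gpt2.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
```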
### File Upload Issues

**Error:** "Unsupported file extension"

- **Solution:** Ensure your file has one of the supported extensions
- Check that the file size is under 2 MB

## 🎓 Use Cases

- **Code Review** - Automated complexity analysis for pull requests
- **Interview Prep** - Understand algorithm efficiency before coding interviews
- **Performance Optimization** - Identify bottlenecks in existing code
- **Education** - Learn Big-O notation through practical examples
- **Documentation** - Auto-generate complexity documentation

## 📝 Notes

- The first model load downloads weights (~1-14GB, depending on the model)
- Subsequent runs load from cache (much faster)
- Complexity estimates are heuristic-based, not formally verified
- For production use, consider manually verifying critical algorithms

## 🤝 Contributing

This is a learning project from the Andela LLM Engineering course (Week 4). Feel free to extend it with:

- Additional language support
- More sophisticated complexity detection
- Integration with CI/CD pipelines
- Support for space complexity analysis

## 📄 License

Educational project - use as a reference for learning purposes.

## 🙏 Acknowledgments

- **OpenAI Whisper** for inspiration on model integration
- **HuggingFace** for the Transformers library
- **Meta** for the LLaMA models
- **Gradio** for the excellent UI framework
- **Andela** for the LLM Engineering curriculum

---

**Built with ❤️ as part of Week 4 LLM Engineering coursework**