🔶 Multi-Language Code Complexity Annotator
An automated tool that analyzes source code and annotates it with Big-O complexity estimates, complete with syntax highlighting and optional AI-powered code reviews.
🎯 What It Does
Understanding time complexity (Big-O notation) is crucial for writing efficient algorithms, identifying bottlenecks, making informed optimization decisions, and passing technical interviews.
Analyzing complexity manually is tedious and error-prone. This tool automates the entire process—detecting loops, recursion, and functions, then annotating code with Big-O estimates and explanations.
Core Features
- 📊 Automatic Detection - Identifies loops, recursion, and functions across 13+ programming languages
- 🧮 Complexity Estimation - Calculates Big-O complexity (O(1), O(n), O(n²), O(log n), etc.)
- 💬 Inline Annotations - Inserts explanatory comments directly into your code
- 🎨 Syntax Highlighting - Generates beautiful HTML previews with orange-colored complexity comments
- 🤖 AI Code Review - Optional LLaMA-powered analysis for optimization suggestions
- 💾 Export Options - Download annotated source code and Markdown previews
🌐 Supported Languages
Python • JavaScript • TypeScript • Java • C • C++ • C# • Go • PHP • Swift • Ruby • Kotlin • Rust
🛠️ Tech Stack
- HuggingFace Transformers - LLM model loading and inference
- LLaMA 3.2 - AI-powered code review
- Gradio - Interactive web interface
- Pygments - Syntax highlighting
- PyTorch - Deep learning framework
- Regex Analysis - Heuristic complexity detection
📋 Prerequisites
- Python 3.12+
- `uv` package manager (or `pip`)
- 4GB+ RAM (for basic use without AI)
- 14GB+ RAM (for AI code review with LLaMA models)
- Optional: NVIDIA GPU with CUDA (for model quantization)
🚀 Installation
1. Clone the Repository
```bash
cd week4
```
2. Install Dependencies
```bash
uv pip install -U pip
uv pip install transformers accelerate gradio torch --extra-index-url https://download.pytorch.org/whl/cpu
uv pip install bitsandbytes pygments python-dotenv
```
Note: This installs the CPU-only version of PyTorch. For GPU support, remove the `--extra-index-url` flag.
3. Set Up HuggingFace Token (Optional - for AI Features)
Create a `.env` file in the `week4` directory:
```
HF_TOKEN=hf_your_token_here
```
Get your token at: https://huggingface.co/settings/tokens
Required for: LLaMA models (you must first accept Meta's license agreement)
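For reference, a minimal sketch of how the token can be read with `python-dotenv` (the variable name `hf_token` is illustrative):

```python
import os
from dotenv import load_dotenv

load_dotenv()  # reads HF_TOKEN from the .env file in the working directory
hf_token = os.getenv("HF_TOKEN")
if not hf_token:
    print("HF_TOKEN is not set; AI review features will be unavailable.")
```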
💡 Usage
Option 1: Jupyter Notebook
Open and run `week4 EXERCISE_hopeogbons.ipynb`:
```bash
jupyter notebook "week4 EXERCISE_hopeogbons.ipynb"
```
Run all cells in order. The Gradio interface will launch at http://127.0.0.1:7861
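For context, a stripped-down sketch of how a Gradio app binds to that address (the real notebook wires up file upload, checkboxes, and sliders; `annotate` here is a placeholder):

```python
import gradio as gr

def annotate(file_path: str) -> str:
    # Placeholder: the real app annotates and syntax-highlights the code
    with open(file_path, "r", encoding="utf-8") as f:
        return f"<pre>{f.read()}</pre>"

demo = gr.Interface(fn=annotate, inputs=gr.File(type="filepath"), outputs=gr.HTML())
demo.launch(server_port=7861)  # serves at http://127.0.0.1:7861
```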
Option 2: Web Interface
Once the Gradio app is running:
Without AI Review (No Model Needed)
1. Upload a code file (`.py`, `.js`, `.java`, etc.)
2. Uncheck "Generate AI Code Review"
3. Click "🚀 Process & Annotate"
4. View syntax-highlighted code with Big-O annotations
5. Download the annotated source + Markdown
With AI Review (Requires Model)
- Click "🔄 Load Model" (wait 2-5 minutes for first download)
- Upload your code file
- Check "Generate AI Code Review"
- Adjust temperature/tokens if needed
- Click "🚀 Process & Annotate"
- Read AI-generated optimization suggestions
📊 How It Works
Complexity Detection Algorithm
The tool uses heuristic pattern matching to estimate Big-O complexity (a simplified sketch follows the steps below):

1. Detect Blocks - Regex patterns find functions, loops, and recursion
2. Analyze Loops - Count nesting depth:
   - 1 loop = O(n)
   - 2 nested loops = O(n²)
   - 3 nested loops = O(n³)
3. Analyze Recursion - Pattern detection:
   - Divide-and-conquer (binary search) = O(log n)
   - Single recursive call = O(n)
   - Multiple recursive calls = O(2^n)
4. Aggregate - Functions inherit the worst-case complexity of their inner operations
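A simplified sketch of the loop-nesting heuristic (illustrative only; the notebook's actual regexes cover all 13 languages and also detect recursion):

```python
import re

# Match a Python-style loop header and capture its indentation.
LOOP_RE = re.compile(r"^(\s*)(for|while)\b")

def estimate_loop_complexity(source: str) -> str:
    open_loops = []   # indentation levels of loops still in scope
    max_nesting = 0
    for line in source.splitlines():
        m = LOOP_RE.match(line)
        if not m:
            continue
        indent = len(m.group(1))
        # A loop at the same or shallower indentation closes deeper loops
        while open_loops and open_loops[-1] >= indent:
            open_loops.pop()
        open_loops.append(indent)
        max_nesting = max(max_nesting, len(open_loops))
    if max_nesting == 0:
        return "O(1)"
    if max_nesting == 1:
        return "O(n)"
    return f"O(n^{max_nesting})"
```

Applied to the bubble sort below, this returns `O(n^2)`.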
Example Output
Input (Python):
```python
def bubble_sort(arr):
    for i in range(len(arr)):
        for j in range(len(arr) - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
```
Output (Annotated):
```python
def bubble_sort(arr):
    # Big-O: O(n^2)
    # Explanation: Nested loops indicate quadratic time.
    for i in range(len(arr)):
        for j in range(len(arr) - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
```
🧠 AI Model Options
CPU/Mac (No GPU)
- `meta-llama/Llama-3.2-1B` (Default, ~1GB, requires HF approval)
- `gpt2` (no approval needed, ~500MB)
- `microsoft/DialoGPT-medium` (~1GB)
GPU Users
- Any model with 8-bit or 4-bit quantization enabled
- `meta-llama/Llama-2-7b-chat-hf` (requires approval)
Memory Requirements
- Without quantization: ~14GB RAM (7B models) or ~26GB (13B models)
- With 8-bit quantization: ~50% reduction (GPU required)
- With 4-bit quantization: ~75% reduction (GPU required)
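As an illustration, 4-bit loading through Transformers and `bitsandbytes` generally looks like the sketch below (the model ID and options are examples, not necessarily the notebook's exact settings):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.2-1B"  # example model; requires HF approval

# 4-bit quantization config (needs a CUDA GPU with bitsandbytes installed)
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```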
⚙️ Configuration
File Limits
- Max file size: 2 MB
- Supported extensions:
`.py`, `.js`, `.ts`, `.java`, `.c`, `.cpp`, `.cs`, `.go`, `.php`, `.swift`, `.rb`, `.kt`, `.rs`
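A hypothetical validation helper enforcing both limits (the function and constant names are illustrative, not the notebook's actual code):

```python
from pathlib import Path

SUPPORTED_EXTS = {".py", ".js", ".ts", ".java", ".c", ".cpp", ".cs",
                  ".go", ".php", ".swift", ".rb", ".kt", ".rs"}
MAX_BYTES = 2 * 1024 * 1024  # 2 MB limit

def validate_upload(path: str) -> None:
    p = Path(path)
    if p.suffix.lower() not in SUPPORTED_EXTS:
        raise ValueError(f"Unsupported file extension: {p.suffix}")
    if p.stat().st_size > MAX_BYTES:
        raise ValueError("File exceeds the 2 MB limit")
```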
Model Parameters
- Temperature (0.0 - 1.5): Controls randomness
  - Lower = more deterministic
  - Higher = more creative
- Max Tokens (16 - 1024): Maximum length of the AI review
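These sliders map onto standard Transformers generation arguments; a minimal sketch assuming `model` and `tokenizer` are already loaded (the prompt text is illustrative):

```python
prompt = ("Review this Python function for optimization opportunities:\n"
          "def f(xs):\n    return sorted(xs)")

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,   # the Max Tokens slider
    temperature=0.7,      # the Temperature slider
    do_sample=True,       # sampling must be enabled for temperature to apply
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```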
📁 Project Structure
```
week4/
├── week4 EXERCISE_hopeogbons.ipynb   # Main application notebook
├── README.md                         # This file
└── .env                              # HuggingFace token (create this)
```
🐛 Troubleshooting
Model Loading Issues
Error: "Model not found" or "Access denied"
- Solution: Accept Meta's license at https://huggingface.co/meta-llama/Llama-3.2-1B
- Ensure your `.env` file contains a valid `HF_TOKEN`
Memory Issues
Error: "Out of memory" during model loading
- Solution: Use a smaller model like `gpt2` or `microsoft/DialoGPT-medium`
- Try 8-bit or 4-bit quantization (GPU required)
Quantization Requires GPU
Error: "Quantization requires CUDA"
- Solution: Disable both 4-bit and 8-bit quantization checkboxes
- Run on CPU with smaller models
File Upload Issues
Error: "Unsupported file extension"
- Solution: Ensure your file has one of the supported extensions
- Check that the file size is under 2MB
🎓 Use Cases
- Code Review - Automated complexity analysis for pull requests
- Interview Prep - Understand algorithm efficiency before coding interviews
- Performance Optimization - Identify bottlenecks in existing code
- Education - Learn Big-O notation through practical examples
- Documentation - Auto-generate complexity documentation
📝 Notes
- First model load downloads weights (~1-14GB depending on model)
- Subsequent runs load from cache (much faster)
- Complexity estimates are heuristic-based, not formally verified
- For production use, consider manual verification of critical algorithms
🤝 Contributing
This is a learning project from the Andela LLM Engineering course (Week 4). Feel free to extend it with:
- Additional language support
- More sophisticated complexity detection
- Integration with CI/CD pipelines
- Support for space complexity analysis
📄 License
Educational project - use as reference for learning purposes.
🙏 Acknowledgments
- OpenAI Whisper for inspiration on model integration
- HuggingFace for providing the Transformers library
- Meta for LLaMA models
- Gradio for the excellent UI framework
- Andela for the LLM Engineering curriculum
Built with ❤️ as part of Week 4 LLM Engineering coursework