{
"cells": [
{
"cell_type": "markdown",
"id": "07bb451d-2b91-425f-b8ea-6f35ced780b0",
"metadata": {},
"source": [
"# AI Code Commenting Assistant \n",
"\n",
"## Project Summary \n",
"\n",
"**Purpose**: \n",
"An AI-powered assistant that automatically generates **clear, concise code comments** to improve code readability and maintainability. \n",
"\n",
"**Key Features**: \n",
"- **Language-Agnostic**: Auto-detects programming languages or allows manual specification \n",
"- **Smart Commenting**: Focuses on explaining **complex logic, algorithms, and edge cases** (not obvious syntax) \n",
"- **Customizable**: Optional focus areas let users prioritize specific parts (e.g., database queries, recursion) \n",
"- **Efficient Workflow**: Processes code in chunks and preserves original formatting \n",
"\n",
"**Benefits**: \n",
"✔ Saves time writing documentation \n",
"✔ Helps developers understand unfamiliar code \n",
"✔ Supports multiple languages (Python, JavaScript, C++, SQL, etc.) \n",
"✔ Avoids redundant comments on trivial operations \n",
"\n",
"**Example Use Case**: \n",
"```python \n",
"# Before AI: \n",
"def fib(n): \n",
" if n <= 1: return n \n",
" else: return fib(n-1) + fib(n-2) \n",
"\n",
"# After AI: \n",
"def fib(n): \n",
" # Recursively computes nth Fibonacci number (O(2^n) time) \n",
" if n <= 1: return n # Base case \n",
" else: return fib(n-1) + fib(n-2) # Recursive case "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a0413ae1-0348-4884-ba95-384c4c8f841c",
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade huggingface_hub"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b22da766-042b-402f-9e05-78aa8f45ddd4",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import io\n",
"from dotenv import load_dotenv\n",
"from google import genai\n",
"from google.genai import types\n",
"from openai import OpenAI\n",
"from anthropic import Anthropic\n",
"from huggingface_hub import InferenceClient\n",
"import gradio as gr"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5af6d3de-bab6-475e-b2f3-7b788bb2e529",
"metadata": {},
"outputs": [],
"source": [
"# load environments\n",
"load_dotenv(override=True)\n",
"os.environ['ANTHROPIC_API_KEY'] = os.getenv(\"CLAUDE_API_KEY\")\n",
"os.environ[\"HF_TOKEN\"] = os.getenv(\"HF_TOKEN\")\n",
"gemini_api_key= os.getenv(\"GEMINI_API_KEY\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cad0755e-4174-4fbc-84e6-15cc54bc609a",
"metadata": {},
"outputs": [],
"source": [
"#initialize remote models\n",
"claude= Anthropic()\n",
"gemini = genai.Client(api_key=gemini_api_key)\n",
"\n",
"#opensource models\n",
"qwen = InferenceClient(provider=\"featherless-ai\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "31d75812-1cd3-4512-8446-022c3357c354",
"metadata": {},
"outputs": [],
"source": [
"#initialize local model\n",
"llama = OpenAI(base_url=\"http://localhost:11434/v1\", api_key=\"ollama\")"
]
},
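{
"cell_type": "markdown",
"id": "7b2f4c1e-0a3d-4e8b-9c5f-2d6a8b0c4e1f",
"metadata": {},
"source": [
"Optional sanity check (a small sketch, not part of the original workflow): it assumes an Ollama server is already running on `localhost:11434` and simply lists the models its OpenAI-compatible endpoint exposes, so a missing server fails early instead of inside the UI."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9d1e3f5a-7b2c-4d6e-8f0a-1c3e5b7d9f2a",
"metadata": {},
"outputs": [],
"source": [
"# Optional: verify the local Ollama server is reachable (assumes it is running locally).\n",
"try:\n",
"    local_models = [m.id for m in llama.models.list().data]\n",
"    print(\"Models available from Ollama:\", local_models)\n",
"except Exception as e:\n",
"    print(\"Could not reach the local Ollama server:\", e)"
]
},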
{
"cell_type": "code",
"execution_count": null,
"id": "31316379-2a56-4707-b207-ea60b490f536",
"metadata": {},
"outputs": [],
"source": [
"#models\n",
"claude_model = \"claude-3-5-haiku-latest\"\n",
"gemini_model = \"gemini-2.5-pro\"\n",
"qwen_model= \"Qwen/Qwen2.5-Coder-32B-Instruct\"\n",
"llama_model = \"llama3:8b\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b7d9c4bf-0955-4406-8717-ffa7bdd0bec9",
"metadata": {},
"outputs": [],
"source": [
"system_message=\"\"\"\n",
"You are an expert AI specialized in code documentation. Your task is to generate concise, meaningful comments that explain the purpose and logic of provided code. Follow these rules:\n",
"\n",
"1. **Infer language**: Auto-detect programming language and use appropriate comment syntax\n",
"2. **Explain why, not what**: Focus on purpose, edge cases, and non-obvious logic\n",
"3. **Be concise**: Maximum 1-2 sentences per comment block\n",
"4. **Prioritize key sections**: Only comment complex logic, algorithms, or critical operations\n",
"5. **Maintain structure**: Preserve original code formatting and indentation\n",
"6. **Output format**: Return ONLY commented code with no additional text\n",
"\n",
"Commenting guidelines by language:\n",
"- Python: `# Inline comments` and `\"\"Docstrings\"\"`\n",
"- JavaScript/Java: `// Line comments` and `/* Block comments */`\n",
"- C/C++: `//` and `/* */`\n",
"- SQL: `-- Line comments`\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "79dfe110-1523-40c7-ad90-2787ed22fd8d",
"metadata": {},
"outputs": [],
"source": [
"def user_prompt(code):\n",
" prompt = f\"\"\"\n",
" i want to document my code for better understanding. Please generate meaningful necessary comments\n",
" here is my code:\n",
" {code}\n",
"\n",
" Return ONLY commented code with no additional text\n",
" \"\"\"\n",
"\n",
" return prompt"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c7bcf29e-ec78-4cfd-9b41-f2dc86400435",
"metadata": {},
"outputs": [],
"source": [
"def conversation_template(code):\n",
" messages = [\n",
" {\"role\":\"system\", \"content\":system_message},\n",
" {\"role\":\"user\",\"content\":user_prompt(code)}\n",
" ]\n",
" return messages"
]
},
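{
"cell_type": "markdown",
"id": "4e6a8c0b-2d4f-4a6c-8e0b-3f5a7c9e1b3d",
"metadata": {},
"source": [
"A quick illustrative check (not in the original notebook): build the message list for a tiny snippet and print the roles, just to show the structure handed to the chat-completion style APIs."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6c8e0a2b-4d6f-4b8a-9c1d-5e7f9a1b3c5d",
"metadata": {},
"outputs": [],
"source": [
"# Illustrative only: inspect the message structure for a small sample snippet\n",
"sample_code = \"def add(a, b):\\n    return a + b\"\n",
"for message in conversation_template(sample_code):\n",
"    print(message[\"role\"], \"->\", message[\"content\"][:60].replace(\"\\n\", \" \"), \"...\")"
]
},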
{
"cell_type": "code",
"execution_count": null,
"id": "a36fec0f-7eba-4ccd-8fc4-cbf5ade76fa2",
"metadata": {},
"outputs": [],
"source": [
"def stream_gemini(code):\n",
" message = user_prompt(code)\n",
" response = gemini.models.generate_content_stream(\n",
" model=gemini_model,\n",
" config= types.GenerateContentConfig(\n",
" system_instruction = system_message,\n",
" temperature = 0.8,\n",
" ),\n",
" contents = [message]\n",
" )\n",
"\n",
" result = \"\"\n",
" for chunk in response:\n",
" result += chunk.text or \"\"\n",
" yield result"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5d1e0c0-dc88-43ee-8698-82ad9ce7c51b",
"metadata": {},
"outputs": [],
"source": [
"def stream_claude(code):\n",
" messages = [{\"role\":\"user\",\"content\":user_prompt(code)}]\n",
" response = claude.messages.stream(\n",
" model= claude_model,\n",
" temperature=0.8,\n",
" messages = messages,\n",
" max_tokens=5000\n",
" )\n",
"\n",
" result = \"\"\n",
" with response as stream:\n",
" for text in stream.text_stream:\n",
" result += text or \"\"\n",
" yield result\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "903c97e5-9170-449e-8a0f-9f906351ec45",
"metadata": {},
"outputs": [],
"source": [
"def stream_opensource(code,model):\n",
" model = model.lower()\n",
" client = globals()[model]\n",
" model = globals()[f\"{model}_model\"]\n",
" stream = client.chat.completions.create(\n",
" model = model,\n",
" messages= conversation_template(code),\n",
" temperature = 0.7,\n",
" stream = True\n",
" )\n",
"\n",
" result = \"\"\n",
" for chunk in stream:\n",
" result += chunk.choices[0].delta.content or \"\"\n",
" yield result"
]
},
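{
"cell_type": "markdown",
"id": "8a0c2e4d-6f1b-4c8e-a2d4-7b9d1f3a5c7e",
"metadata": {},
"source": [
"Note on `stream_opensource`: the `model` string must match the names of the client and `*_model` variables defined above, so only `\"qwen\"` and `\"llama\"` work here. The cell below is an optional sketch of a more explicit registry that avoids the `globals()` lookup; it is an alternative, not what the UI uses."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2c4e6a8b-0d2f-4e6a-8c0b-9d1f3b5d7f9a",
"metadata": {},
"outputs": [],
"source": [
"# Optional alternative sketch: an explicit registry instead of globals() lookups,\n",
"# which keeps the same behaviour but makes the valid model names obvious.\n",
"OPEN_SOURCE_MODELS = {\n",
"    \"qwen\": (qwen, qwen_model),\n",
"    \"llama\": (llama, llama_model),\n",
"}\n",
"\n",
"def stream_opensource_explicit(code, model):\n",
"    client, model_name = OPEN_SOURCE_MODELS[model.lower()]\n",
"    stream = client.chat.completions.create(\n",
"        model=model_name,\n",
"        messages=conversation_template(code),\n",
"        temperature=0.7,\n",
"        stream=True,\n",
"    )\n",
"    result = \"\"\n",
"    for chunk in stream:\n",
"        result += chunk.choices[0].delta.content or \"\"\n",
"        yield result"
]
},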
{
"cell_type": "code",
"execution_count": null,
"id": "ff051c22-a2f8-4153-b970-f8a466a4cf5a",
"metadata": {},
"outputs": [],
"source": [
"def commentor(code, model):\n",
" model =model.lower()\n",
" if model == \"claude\":\n",
" result = stream_claude(code)\n",
" elif model == \"gemini\":\n",
" result = stream_gemini(code)\n",
" elif model == \"qwen\" or model == \"llama\":\n",
" result = stream_opensource(code, model)\n",
"\n",
"\n",
" for code in result:\n",
" yield code.replace(\"```cpp\\n\",\"\").replace(\"```python\\n\",\"\").replace(\"```javascript\\n\",\"\").replace(\"```typescript\\n\",\"\").replace(\"```\",\"\")"
]
},
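{
"cell_type": "markdown",
"id": "5b7d9f1a-3c5e-4f7a-9b1d-2e4f6a8c0e2b",
"metadata": {},
"source": [
"A quick way to try the generator without the UI (illustrative only; assumes `GEMINI_API_KEY` is set). Each yield is the full text produced so far, so only the last value needs to be kept."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7d9f1b3c-5e7a-4b9c-a1d3-4f6a8c0e2b4d",
"metadata": {},
"outputs": [],
"source": [
"# Illustrative usage outside Gradio; assumes the Gemini key is configured.\n",
"sample = \"def fib(n):\\n    if n <= 1: return n\\n    return fib(n-1) + fib(n-2)\"\n",
"final = \"\"\n",
"for partial in commentor(sample, \"Gemini\"):\n",
"    final = partial  # each yield carries the complete text so far\n",
"print(final)"
]
},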
{
"cell_type": "code",
"execution_count": null,
"id": "10daf070-3546-4073-a2a0-3f5f8fc156f0",
"metadata": {},
"outputs": [],
"source": [
"with gr.Blocks() as ui:\n",
" gr.Markdown(\"# Genarate comment\")\n",
" with gr.Row():\n",
" raw_code = gr.Textbox(label=\"Raw Code:\", lines=10)\n",
" commented_code = gr.Textbox(label=\"Commented_code\",lines=10)\n",
" with gr.Row():\n",
" models = gr.Dropdown([\"Gemini\",\"Claude\",\"Llama\",\"Qwen\"], value=\"Gemini\")\n",
" with gr.Row():\n",
" generate_comment = gr.Button(\"Generate Comment\")\n",
"\n",
" generate_comment.click(commentor, inputs=[raw_code, models], outputs=[commented_code])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "afb87f32-f25e-40c5-844a-d2b7af748192",
"metadata": {},
"outputs": [],
"source": [
"ui.launch(inbrowser=True,debug=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "96bc48ad-10ad-4821-b58e-ea1b22cdcdc9",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}