{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Python to C++ Code Translator using LLMs\n",
"\n",
"This notebook translates Python code to compilable C++ using GPT, Gemini, or Claude.\n",
"\n",
"## Features:\n",
"- 🤖 Multiple LLM support (GPT, Gemini, Claude)\n",
"- ✅ Automatic compilation testing with g++\n",
"- 🔄 Comparison mode to test all LLMs\n",
"- 💬 Interactive translation mode"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 1: Install Required Packages\n",
"\n",
"Run this cell first to install all dependencies:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[2mResolved \u001b[1m267 packages\u001b[0m \u001b[2min 10ms\u001b[0m\u001b[0m\n",
"\u001b[2mAudited \u001b[1m243 packages\u001b[0m \u001b[2min 467ms\u001b[0m\u001b[0m\n"
]
}
],
"source": [
"!uv add openai anthropic python-dotenv google-generativeai\n",
"#!pip install openai anthropic python-dotenv google-generativeai"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 2: Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import subprocess\n",
"import tempfile\n",
"from pathlib import Path\n",
"from dotenv import load_dotenv\n",
"import openai\n",
"from anthropic import Anthropic\n",
"import google.generativeai as genai"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 3: Load API Keys\n",
"\n",
"Make sure you have a `.env` file with:\n",
"```\n",
"OPENAI_API_KEY=your_key_here\n",
"GEMINI_API_KEY=your_key_here\n",
"ANTHROPIC_API_KEY=your_key_here\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Load API keys from the .env file\n",
"load_dotenv()\n",
"\n",
"# Warn early if a key is missing rather than failing mid-translation later\n",
"for key in ('OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'GEMINI_API_KEY'):\n",
"    if not os.getenv(key):\n",
"        print(f\"⚠ {key} is not set; translations with that provider will fail\")\n",
"\n",
"# Initialize API clients\n",
"openai_client = openai.OpenAI(api_key=os.getenv('OPENAI_API_KEY'))\n",
"anthropic_client = Anthropic(api_key=os.getenv('ANTHROPIC_API_KEY'))\n",
"genai.configure(api_key=os.getenv('GEMINI_API_KEY'))\n",
"\n",
"print(\"✓ API clients initialized\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 4: Define System Prompt"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"SYSTEM_PROMPT = \"\"\"You are an expert programmer that translates Python code to C++.\n",
"Translate the given Python code to efficient, compilable C++ code.\n",
"\n",
"Requirements:\n",
"- The C++ code must compile without errors\n",
"- Include all necessary headers\n",
"- Use modern C++ (C++11 or later) features where appropriate\n",
"- Add proper error handling\n",
"- Maintain the same functionality as the Python code\n",
"- Include a main() function if the Python code has executable statements\n",
"\n",
"Only return the C++ code, no explanations unless there are important notes about compilation.\"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 5: LLM Translation Functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def translate_with_gpt(python_code, model=\"gpt-4o\"):\n",
"    \"\"\"Translate Python to C++ using OpenAI's GPT models\"\"\"\n",
"    try:\n",
"        response = openai_client.chat.completions.create(\n",
"            model=model,\n",
"            messages=[\n",
"                {\"role\": \"system\", \"content\": SYSTEM_PROMPT},\n",
"                {\"role\": \"user\", \"content\": f\"Translate this Python code to C++:\\n\\n{python_code}\"}\n",
"            ],\n",
"            temperature=0.2\n",
"        )\n",
"        return response.choices[0].message.content\n",
"    except Exception as e:\n",
"        return f\"Error with GPT: {str(e)}\"\n",
"\n",
"def translate_with_gemini(python_code, model=\"gemini-2.0-flash-exp\"):\n",
"    \"\"\"Translate Python to C++ using Google's Gemini\"\"\"\n",
"    try:\n",
"        model_instance = genai.GenerativeModel(model)\n",
"        prompt = f\"{SYSTEM_PROMPT}\\n\\nTranslate this Python code to C++:\\n\\n{python_code}\"\n",
"        response = model_instance.generate_content(prompt)\n",
"        return response.text\n",
"    except Exception as e:\n",
"        return f\"Error with Gemini: {str(e)}\"\n",
"\n",
"def translate_with_claude(python_code, model=\"claude-sonnet-4-20250514\"):\n",
"    \"\"\"Translate Python to C++ using Anthropic's Claude\"\"\"\n",
"    try:\n",
"        response = anthropic_client.messages.create(\n",
"            model=model,\n",
"            max_tokens=4096,\n",
"            temperature=0.2,\n",
"            system=SYSTEM_PROMPT,\n",
"            messages=[\n",
"                {\"role\": \"user\", \"content\": f\"Translate this Python code to C++:\\n\\n{python_code}\"}\n",
"            ]\n",
"        )\n",
"        return response.content[0].text\n",
"    except Exception as e:\n",
"        return f\"Error with Claude: {str(e)}\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 6: Main Translation Function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def translate_python_to_cpp(python_code, llm=\"gpt\", model=None):\n",
"    \"\"\"\n",
"    Translate Python code to C++ using the specified LLM\n",
"\n",
"    Args:\n",
"        python_code (str): Python code to translate\n",
"        llm (str): LLM to use ('gpt', 'gemini', or 'claude')\n",
"        model (str): Specific model version (optional)\n",
"\n",
"    Returns:\n",
"        str: Translated C++ code\n",
"    \"\"\"\n",
"    print(f\"🔄 Translating with {llm.upper()}...\")\n",
"\n",
"    if llm.lower() == \"gpt\":\n",
"        model = model or \"gpt-4o\"\n",
"        cpp_code = translate_with_gpt(python_code, model)\n",
"    elif llm.lower() == \"gemini\":\n",
"        model = model or \"gemini-2.0-flash-exp\"\n",
"        cpp_code = translate_with_gemini(python_code, model)\n",
"    elif llm.lower() == \"claude\":\n",
"        model = model or \"claude-sonnet-4-20250514\"\n",
"        cpp_code = translate_with_claude(python_code, model)\n",
"    else:\n",
"        return \"Error: Invalid LLM. Choose 'gpt', 'gemini', or 'claude'\"\n",
"\n",
"    return cpp_code"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 7: Compilation Testing Functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def extract_cpp_code(text):\n",
"    \"\"\"Extract C++ code from a markdown code block, if present\"\"\"\n",
"    for fence in (\"```cpp\", \"```c++\", \"```\"):\n",
"        if fence in text:\n",
"            start = text.find(fence) + len(fence)\n",
"            end = text.find(\"```\", start)\n",
"            if end == -1:  # unterminated fence: take everything after it\n",
"                end = len(text)\n",
"            return text[start:end].strip()\n",
"    return text.strip()\n",
"\n",
"def compile_cpp_code(cpp_code, output_name=\"translated_program\"):\n",
"    \"\"\"\n",
"    Compile C++ code and return the compilation status\n",
"\n",
"    Args:\n",
"        cpp_code (str): C++ code to compile\n",
"        output_name (str): Name of the output executable\n",
"\n",
"    Returns:\n",
"        dict: Compilation result with status and messages\n",
"    \"\"\"\n",
"    # Extract code from markdown if present\n",
"    cpp_code = extract_cpp_code(cpp_code)\n",
"\n",
"    # Create a temporary directory; it is deleted when this block exits,\n",
"    # so the 'executable' path in the result is only valid inside this function\n",
"    with tempfile.TemporaryDirectory() as tmpdir:\n",
"        cpp_file = Path(tmpdir) / \"program.cpp\"\n",
"        exe_file = Path(tmpdir) / output_name\n",
"\n",
"        # Write the C++ code to a file\n",
"        with open(cpp_file, 'w') as f:\n",
"            f.write(cpp_code)\n",
"\n",
"        # Try to compile\n",
"        try:\n",
"            result = subprocess.run(\n",
"                ['g++', '-std=c++17', str(cpp_file), '-o', str(exe_file)],\n",
"                capture_output=True,\n",
"                text=True,\n",
"                timeout=10\n",
"            )\n",
"\n",
"            if result.returncode == 0:\n",
"                return {\n",
"                    'success': True,\n",
"                    'message': '✓ Compilation successful!',\n",
"                    'executable': str(exe_file),\n",
"                    'stdout': result.stdout,\n",
"                    'stderr': result.stderr\n",
"                }\n",
"            else:\n",
"                return {\n",
"                    'success': False,\n",
"                    'message': '✗ Compilation failed',\n",
"                    'stdout': result.stdout,\n",
"                    'stderr': result.stderr\n",
"                }\n",
"        except subprocess.TimeoutExpired:\n",
"            return {\n",
"                'success': False,\n",
"                'message': '✗ Compilation timed out'\n",
"            }\n",
"        except FileNotFoundError:\n",
"            return {\n",
"                'success': False,\n",
"                'message': '✗ g++ compiler not found. Please install g++ to compile C++ code.'\n",
"            }\n",
"        except Exception as e:\n",
"            return {\n",
"                'success': False,\n",
"                'message': f'✗ Compilation error: {str(e)}'\n",
"            }"
]
},
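{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick sanity check for `extract_cpp_code` (the response string below is an illustrative stand-in for an LLM reply, not real model output): it should strip the markdown fence and return only the code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative LLM-style reply wrapping the code in a ```cpp fence\n",
"sample_response = \"Here is the translation:\\n```cpp\\n#include <iostream>\\nint main() { std::cout << 1; }\\n```\"\n",
"print(extract_cpp_code(sample_response))"
]
},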
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Step 8: Complete Pipeline"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def translate_and_compile(python_code, llm=\"gpt\", model=None, verbose=True):\n",
"    \"\"\"\n",
"    Translate Python to C++ and attempt compilation\n",
"\n",
"    Args:\n",
"        python_code (str): Python code to translate\n",
"        llm (str): LLM to use\n",
"        model (str): Specific model version\n",
"        verbose (bool): Print detailed output\n",
"\n",
"    Returns:\n",
"        dict: Results including the translated code and compilation status\n",
"    \"\"\"\n",
"    # Translate\n",
"    cpp_code = translate_python_to_cpp(python_code, llm, model)\n",
"\n",
"    if verbose:\n",
"        print(\"\\n\" + \"=\"*60)\n",
"        print(\"TRANSLATED C++ CODE:\")\n",
"        print(\"=\"*60)\n",
"        print(cpp_code)\n",
"        print(\"=\"*60 + \"\\n\")\n",
"\n",
"    # Compile\n",
"    print(\"🔨 Attempting to compile...\")\n",
"    compilation_result = compile_cpp_code(cpp_code)\n",
"\n",
"    if verbose:\n",
"        print(compilation_result['message'])\n",
"        if not compilation_result['success'] and 'stderr' in compilation_result:\n",
"            print(\"\\nCompilation errors:\")\n",
"            print(compilation_result['stderr'])\n",
"\n",
"    return {\n",
"        'cpp_code': cpp_code,\n",
"        'compilation': compilation_result\n",
"    }"
]
},
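{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optionally, you can also *run* the compiled program. `compile_cpp_code` builds inside a `tempfile.TemporaryDirectory()`, which is deleted when the function returns, so the `executable` path in its result is stale by the time you see it. The sketch below (a hypothetical `compile_and_run` helper, not part of the pipeline above) compiles and executes within the same temporary directory, assuming `g++` is installed:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def compile_and_run(cpp_code, run_timeout=10):\n",
"    \"\"\"Compile C++ code with g++ and run it while the temp directory still exists\"\"\"\n",
"    cpp_code = extract_cpp_code(cpp_code)\n",
"    with tempfile.TemporaryDirectory() as tmpdir:\n",
"        cpp_file = Path(tmpdir) / \"program.cpp\"\n",
"        exe_file = Path(tmpdir) / \"program\"\n",
"        cpp_file.write_text(cpp_code)\n",
"        build = subprocess.run(\n",
"            ['g++', '-std=c++17', str(cpp_file), '-o', str(exe_file)],\n",
"            capture_output=True, text=True, timeout=10\n",
"        )\n",
"        if build.returncode != 0:\n",
"            return {'success': False, 'stage': 'compile', 'stderr': build.stderr}\n",
"        # Run the freshly built executable inside the same with-block\n",
"        run = subprocess.run([str(exe_file)], capture_output=True, text=True, timeout=run_timeout)\n",
"        return {'success': run.returncode == 0, 'stage': 'run',\n",
"                'stdout': run.stdout, 'stderr': run.stderr}"
]
},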
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 1: Factorial Function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"python_code_1 = \"\"\"\n",
"def factorial(n):\n",
"    if n <= 1:\n",
"        return 1\n",
"    return n * factorial(n - 1)\n",
"\n",
"# Test the function\n",
"print(factorial(5))\n",
"\"\"\"\n",
"\n",
"print(\"Example 1: Factorial Function\")\n",
"print(\"=\"*60)\n",
"result1 = translate_and_compile(python_code_1, llm=\"gpt\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 2: Sum of Squares"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"python_code_2 = \"\"\"\n",
"def sum_of_squares(numbers):\n",
"    return sum(x**2 for x in numbers)\n",
"\n",
"numbers = [1, 2, 3, 4, 5]\n",
"result = sum_of_squares(numbers)\n",
"print(f\"Sum of squares: {result}\")\n",
"\"\"\"\n",
"\n",
"print(\"Example 2: Sum of Squares\")\n",
"print(\"=\"*60)\n",
"result2 = translate_and_compile(python_code_2, llm=\"claude\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 3: Fibonacci with Gemini"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"python_code_3 = \"\"\"\n",
"def fibonacci(n):\n",
"    if n <= 1:\n",
"        return n\n",
"    a, b = 0, 1\n",
"    for _ in range(2, n + 1):\n",
"        a, b = b, a + b\n",
"    return b\n",
"\n",
"print(f\"Fibonacci(10) = {fibonacci(10)}\")\n",
"\"\"\"\n",
"\n",
"print(\"Example 3: Fibonacci with Gemini\")\n",
"print(\"=\"*60)\n",
"result3 = translate_and_compile(python_code_3, llm=\"gemini\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Compare All LLMs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def compare_llms(python_code):\n",
"    \"\"\"Compare all three LLMs on the same Python code\"\"\"\n",
"    llms = [\"gpt\", \"gemini\", \"claude\"]\n",
"    results = {}\n",
"\n",
"    for llm in llms:\n",
"        print(f\"\\n{'='*60}\")\n",
"        print(f\"Testing with {llm.upper()}\")\n",
"        print('='*60)\n",
"        results[llm] = translate_and_compile(python_code, llm=llm, verbose=False)\n",
"        print(results[llm]['compilation']['message'])\n",
"\n",
"    return results"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test code for the comparison\n",
"python_code_compare = \"\"\"\n",
"def is_prime(n):\n",
"    if n < 2:\n",
"        return False\n",
"    for i in range(2, int(n**0.5) + 1):\n",
"        if n % i == 0:\n",
"            return False\n",
"    return True\n",
"\n",
"primes = [x for x in range(2, 20) if is_prime(x)]\n",
"print(f\"Primes under 20: {primes}\")\n",
"\"\"\"\n",
"\n",
"print(\"COMPARING ALL LLMs\")\n",
"print(\"=\"*60)\n",
"comparison_results = compare_llms(python_code_compare)"
]
},
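{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small follow-up (illustrative, assumes the `comparison_results` dict from the cell above), tabulate which LLMs produced compilable code:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Summarize compilation success per LLM\n",
"print(f\"{'LLM':<10}{'Compiled':<10}\")\n",
"for llm, res in comparison_results.items():\n",
"    status = 'yes' if res['compilation']['success'] else 'no'\n",
"    print(f\"{llm:<10}{status}\")"
]
},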
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Interactive Translation Mode\n",
"\n",
"Use this cell to translate your own Python code interactively:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Your custom Python code here\n",
"your_python_code = \"\"\"\n",
"# Paste your Python code here\n",
"def hello_world():\n",
"    print(\"Hello, World!\")\n",
"\n",
"hello_world()\n",
"\"\"\"\n",
"\n",
"# Choose your LLM: \"gpt\", \"gemini\", or \"claude\"\n",
"chosen_llm = \"gpt\"\n",
"\n",
"result = translate_and_compile(your_python_code, llm=chosen_llm)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"You now have a complete Python to C++ translator! \n",
"\n",
"### Main Functions:\n",
"- `translate_python_to_cpp(code, llm, model)` - Translate only\n",
"- `translate_and_compile(code, llm, model)` - Translate and compile\n",
"- `compare_llms(code)` - Compare all three LLMs\n",
"\n",
"### Supported LLMs:\n",
"- **gpt** - OpenAI GPT-4o\n",
"- **gemini** - Google Gemini 2.0 Flash\n",
"- **claude** - Anthropic Claude Sonnet 4\n",
"\n",
"Happy translating! 🚀"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.12"
}
},
"nbformat": 4,
"nbformat_minor": 4
}