Added Resume Analyzer and Sample version

subh273
2025-04-26 14:17:11 +10:00
parent b8b2f766e5
commit c279e9eaa7


@@ -0,0 +1,234 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "4e2a9393-7767-488e-a8bf-27c12dca35bd",
"metadata": {},
"outputs": [],
"source": [
"# imports\n",
"\n",
"import os\n",
"import requests\n",
"from dotenv import load_dotenv\n",
"from bs4 import BeautifulSoup\n",
"from IPython.display import Markdown, display\n",
"from openai import OpenAI\n",
"\n",
"# If you get an error running this cell, then please head over to the troubleshooting notebook!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7b87cadb-d513-4303-baee-a37b6f938e4d",
"metadata": {},
"outputs": [],
"source": [
"# Load environment variables in a file called .env\n",
"\n",
"load_dotenv(override=True)\n",
"api_key = os.getenv('OPENAI_API_KEY')\n",
"\n",
"# Check the key\n",
"\n",
"if not api_key:\n",
" print(\"No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!\")\n",
"elif not api_key.startswith(\"sk-proj-\"):\n",
" print(\"An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook\")\n",
"elif api_key.strip() != api_key:\n",
" print(\"An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook\")\n",
"else:\n",
" print(\"API key found and looks good so far!\")\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "019974d9-f3ad-4a8a-b5f9-0a3719aea2d3",
"metadata": {},
"outputs": [],
"source": [
"openai = OpenAI()\n",
"\n",
"# If this doesn't work, try Kernel menu >> Restart Kernel and Clear Outputs Of All Cells, then run the cells from the top of this notebook down.\n",
"# If it STILL doesn't work (horrors!) then please see the Troubleshooting notebook in this folder for full instructions"
]
},
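{
"cell_type": "code",
"execution_count": null,
"id": "5c1d2e3f-4a5b-4c7d-8e9f-0a1b2c3d4e5f",
"metadata": {},
"outputs": [],
"source": [
"# Optional fallback -- a minimal sketch, only needed if the cell above can't\n",
"# find your key. The OpenAI client also accepts the key explicitly; api_key\n",
"# comes from the environment-check cell earlier in this notebook.\n",
"\n",
"openai = OpenAI(api_key=api_key)"
]
},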
{
"cell_type": "code",
"execution_count": null,
"id": "abdb8417-c5dc-44bc-9bee-2e059d162699",
"metadata": {},
"outputs": [],
"source": [
"# Define our system prompt - you can experiment with this later, changing the last sentence to 'Respond in markdown in Spanish.\"\n",
"\n",
"system_prompt = \"You are a high profile professional resume analyst and assist users with highlighting gaps in a very formed resume and provide direction to make the resume eye catching to the recruiters \\\n",
"and employers.\"\n",
"\n",
"user_prompt = \"\"\"Analyze the resume details to do the following: \\\n",
"1. Assess the resume to highlight areas of improvement. \\ \n",
"2. Create a well formed resume.\n",
"\n",
"Name: Sam Burns\n",
"\n",
"PROFESSIONAL SUMMARY\n",
"Experienced Data and AI Architect with over 10 years of expertise designing scalable data platforms, integrating cloud-native solutions, and deploying AI/ML systems across enterprise environments. Proven track record of aligning data architecture with business strategy, leading cross-functional teams, and delivering high-impact AI-driven insights.\n",
"\n",
"CORE SKILLS\n",
"\n",
"Data Architecture: Lakehouse, Data Mesh, Delta Lake, Data Vault\n",
"\n",
"Cloud Platforms: Azure (Data Factory, Synapse, ML Studio), AWS (S3, Glue, SageMaker), Databricks\n",
"\n",
"Big Data & Streaming: Spark, Kafka, Hive, Hadoop\n",
"\n",
"ML/AI Tooling: MLflow, TensorFlow, Scikit-learn, Hugging Face Transformers\n",
"\n",
"Programming: Python, SQL, PySpark, Scala, Terraform\n",
"\n",
"DevOps: CI/CD (GitHub Actions, Azure DevOps), Docker, Kubernetes\n",
"\n",
"Governance: Data Lineage, Cataloging, RBAC, GDPR, Responsible AI\n",
"\n",
"PROFESSIONAL EXPERIENCE\n",
"\n",
"Senior Data & AI Architect\n",
"ABC Tech Solutions — New York, NY\n",
"Jan 2021 Present\n",
"\n",
"Designed and implemented a company-wide lakehouse architecture on Databricks, integrating AWS S3, Redshift, and real-time ingestion from Kafka.\n",
"\n",
"Led architecture for a predictive maintenance platform using sensor data (IoT), Spark streaming, and MLflow-managed experiments.\n",
"\n",
"Developed enterprise ML governance framework ensuring reproducibility, fairness, and compliance with GDPR.\n",
"\n",
"Mentored 6 data engineers and ML engineers; led architectural reviews and technical roadmap planning.\n",
"\n",
"Data Architect / AI Specialist\n",
"Global Insights Inc. — Boston, MA\n",
"Jun 2017 Dec 2020\n",
"\n",
"Modernized legacy data warehouse to Azure Synapse-based analytics platform, reducing ETL latency by 40%.\n",
"\n",
"Built MLOps pipelines for customer churn prediction models using Azure ML and ADF.\n",
"\n",
"Collaborated with business units to define semantic layers for self-service analytics in Power BI.\n",
"\n",
"Data Engineer\n",
"NextGen Analytics — Remote\n",
"Jul 2013 May 2017\n",
"\n",
"Developed ETL pipelines in PySpark to transform raw web traffic into structured analytics dashboards.\n",
"\n",
"Integrated NLP models into customer support workflows using spaCy and early versions of Hugging Face.\n",
"\n",
"Contributed to open-source tools for Jupyter-based analytics and data catalog integration.\n",
"\n",
"EDUCATION\n",
"M.S. in Computer Science Carnegie Mellon University\n",
"B.S. in Information Systems Rutgers University\n",
"\n",
"CERTIFICATIONS\n",
"\n",
"Databricks Certified Data Engineer Professional\n",
"\n",
"Azure Solutions Architect Expert\n",
"\n",
"AWS Certified Machine Learning Specialty\n",
"\n",
"PROJECTS & CONTRIBUTIONS\n",
"\n",
"llm_engineering (GitHub): Developed and maintained hands-on LLM course materials and community contributions framework.\n",
"\n",
"Real-time AI PoC: Designed Kafka-Spark pipeline with Azure OpenAI Service for anomaly detection on IoT streams.\n",
"\n",
"Contributor to Hugging Face Transformers integration examples for inference pipelines\n",
"\"\"\""
]
},
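{
"cell_type": "code",
"execution_count": null,
"id": "6d2e3f4a-5b6c-4d8e-9f0a-1b2c3d4e5f6a",
"metadata": {},
"outputs": [],
"source": [
"# Optional: analyze your own resume instead of the sample above. A minimal\n",
"# sketch -- the filename 'resume.txt' is an assumption; point it at a\n",
"# plain-text copy of your resume saved next to this notebook.\n",
"\n",
"from pathlib import Path\n",
"\n",
"resume_file = Path(\"resume.txt\")\n",
"if resume_file.exists():\n",
"    user_prompt = \"\"\"Analyze the resume details to do the following:\n",
"1. Assess the resume to highlight areas of improvement.\n",
"2. Create a well-formed resume.\n",
"\n",
"\"\"\" + resume_file.read_text(encoding=\"utf-8\")"
]
},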
{
"cell_type": "code",
"execution_count": null,
"id": "4e6a8730-c3ad-4243-a045-0acba2b5ebcf",
"metadata": {},
"outputs": [],
"source": [
"messages = [\n",
" {\"role\": \"system\", \"content\": system_prompt},\n",
" {\"role\": \"user\", \"content\": user_prompt}\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "21ed95c5-7001-47de-a36d-1d6673b403ce",
"metadata": {},
"outputs": [],
"source": [
"# To give you a preview -- calling OpenAI with system and user messages:\n",
"\n",
"response = openai.chat.completions.create(model=\"gpt-4o-mini\", messages=messages)"
]
},
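{
"cell_type": "code",
"execution_count": null,
"id": "7e3f4a5b-6c7d-4e9f-0a1b-2c3d4e5f6a7b",
"metadata": {},
"outputs": [],
"source": [
"# Optional: the same call with streaming, so the analysis arrives token by\n",
"# token. A sketch using the standard stream=True flag -- the non-streaming\n",
"# cell above is all you actually need.\n",
"\n",
"stream = openai.chat.completions.create(model=\"gpt-4o-mini\", messages=messages, stream=True)\n",
"reply = \"\"\n",
"for chunk in stream:\n",
"    reply += chunk.choices[0].delta.content or \"\"\n",
"display(Markdown(reply))"
]
},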
{
"cell_type": "code",
"execution_count": null,
"id": "3d926d59-450e-4609-92ba-2d6f244f1342",
"metadata": {},
"outputs": [],
"source": [
"# A function to display this nicely in the Jupyter output, using markdown\n",
"\n",
"display(Markdown(response.choices[0].message.content))"
]
},
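{
"cell_type": "code",
"execution_count": null,
"id": "8f4a5b6c-7d8e-4f0a-1b2c-3d4e5f6a7b8c",
"metadata": {},
"outputs": [],
"source": [
"# A small convenience wrapper -- a sketch, and the name analyze_resume is our\n",
"# own. It reuses system_prompt from above so you can rerun the analysis on\n",
"# any resume text in a single call.\n",
"\n",
"def analyze_resume(resume_text):\n",
"    # One system + user exchange, rendered as markdown in the notebook\n",
"    response = openai.chat.completions.create(\n",
"        model=\"gpt-4o-mini\",\n",
"        messages=[\n",
"            {\"role\": \"system\", \"content\": system_prompt},\n",
"            {\"role\": \"user\", \"content\": \"Analyze this resume and suggest improvements:\\n\\n\" + resume_text},\n",
"        ],\n",
"    )\n",
"    display(Markdown(response.choices[0].message.content))"
]
},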
{
"cell_type": "markdown",
"id": "eeab24dc-5f90-4570-b542-b0585aca3eb6",
"metadata": {},
"source": [
"# Sharing your code\n",
"\n",
"I'd love it if you share your code afterwards so I can share it with others! You'll notice that some students have already made changes (including a Selenium implementation) which you will find in the community-contributions folder. If you'd like add your changes to that folder, submit a Pull Request with your new versions in that folder and I'll merge your changes.\n",
"\n",
"If you're not an expert with git (and I am not!) then GPT has given some nice instructions on how to submit a Pull Request. It's a bit of an involved process, but once you've done it once it's pretty clear. As a pro-tip: it's best if you clear the outputs of your Jupyter notebooks (Edit >> Clean outputs of all cells, and then Save) for clean notebooks.\n",
"\n",
"Here are good instructions courtesy of an AI friend: \n",
"https://chatgpt.com/share/677a9cb5-c64c-8012-99e0-e06e88afd293"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4484fcf-8b39-4c3f-9674-37970ed71988",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}