Merge pull request #594 from ZnDream/D2-property-rental-assistant

Add D2-property-rental-assistant
This commit is contained in:
Ed Donner
2025-08-16 09:07:35 +01:00
committed by GitHub
2 changed files with 406 additions and 0 deletions

View File

@@ -0,0 +1,189 @@
# AI Property Rental Assistant
An intelligent property rental assistant Jupyter notebook that scrapes real estate listings from OnTheMarket and uses a local LLM (DeepSeek R1) to analyze and recommend properties based on user requirements.
## Features
- **Web Scraping**: Automatically fetches property listings from OnTheMarket
- **AI-Powered Analysis**: Uses DeepSeek R1 model via Ollama for intelligent recommendations
- **Personalized Recommendations**: Filters and ranks properties based on:
- Budget constraints
- Number of bedrooms
- Tenant type (student, family, professional)
- Location preferences
- **Clean Output**: Returns formatted markdown with top 3-5 property recommendations
- **Smart Filtering**: Handles cases where no suitable properties are found with helpful suggestions
## Prerequisites
- Python 3.7+
- Ollama installed and running locally
- DeepSeek R1 14B model pulled in Ollama
## Installation
1. **Clone the repository**
```bash
git clone <your-repo-url>
cd property-rental-assistant
```
2. **Install required Python packages**
```bash
pip install requests beautifulsoup4 ollama ipython jupyter
```
3. **Install and setup Ollama**
```bash
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.ai/install.sh | sh
# For Windows, download from: https://ollama.ai/download
```
4. **Pull the DeepSeek R1 model**
```bash
ollama pull deepseek-r1:14b
```
5. **Start Ollama server**
```bash
ollama serve
```
## Usage
### Running the Notebook
1. **Start Jupyter Notebook**
```bash
jupyter notebook
```
2. **Open the notebook**
Navigate to `property_rental_assistant.ipynb` in the Jupyter interface
3. **Run all cells**
Click `Cell``Run All` or use `Shift + Enter` to run cells individually
### Customizing Search Parameters
Modify the `user_needs` variable in the notebook:
```python
user_needs = "I'm a student looking for a 2-bedroom house in Durham under £2,000/month"
```
Other examples:
- `"Family of 4 looking for 3-bedroom house with garden in Durham, budget £2,500/month"`
- `"Professional couple seeking modern 1-bed apartment near city center, max £1,500/month"`
- `"Student group needs 4-bedroom house near Durham University, £600/month per person"`
### Changing the Property Website
Update the `website_url` variable in the notebook:
```python
website_url = "https://www.onthemarket.com/to-rent/property/durham/"
```
## Architecture
```
┌─────────────────┐ ┌──────────────┐ ┌─────────────┐
│ OnTheMarket │────▶│ Web Scraper │────▶│ Ollama │
│ Website │ │ (BeautifulSoup)│ │ (DeepSeek R1)│
└─────────────────┘ └──────────────┘ └─────────────┘
┌─────────────────────────────────┐
│ AI-Generated Recommendations │
│ • Top 5 matching properties │
│ • Filtered by requirements │
│ • Markdown formatted output │
└─────────────────────────────────┘
```
## Project Structure
```
property-rental-assistant/
├── property_rental_assistant.ipynb # Main Jupyter notebook
└── README.md # This file
```
## 🔧 Configuration
### Ollama API Settings
```python
OLLAMA_API = "http://localhost:11434/api/chat" # Default Ollama endpoint
MODEL = "deepseek-r1:14b" # Model to use
```
### Web Scraping Settings
```python
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
}
timeout = 10 # Request timeout in seconds
```
### Content Limits
```python
website.text[:4000] # Truncate content to 4000 chars for token limits
```
## How It Works
1. **Web Scraping**: The `Website` class fetches and parses HTML content from the property listing URL
2. **Content Cleaning**: Removes scripts, styles, and images to extract clean text
3. **Prompt Engineering**: Combines system prompt with user requirements and scraped data
4. **LLM Analysis**: Sends the prompt to DeepSeek R1 via Ollama API
5. **Recommendation Generation**: The AI analyzes listings and returns top matches in markdown format
## 🛠️ Troubleshooting
### Ollama Connection Error
```
Error communicating with Ollama: [Errno 111] Connection refused
```
**Solution**: Ensure Ollama is running with `ollama serve`
### Model Not Found
```
Error: model 'deepseek-r1:14b' not found
```
**Solution**: Pull the model with `ollama pull deepseek-r1:14b`
### Web Scraping Blocked
```
Error fetching website: 403 Forbidden
```
**Solution**: The website may be blocking automated requests. Try:
- Updating the User-Agent string
- Adding delays between requests
- Using a proxy or VPN
### Insufficient Property Data
If recommendations are poor quality, the scraper may not be capturing listing details properly. Check:
- The website structure hasn't changed
- The content truncation limit (4000 chars) isn't too restrictive
## Future Enhancements
- [ ] Support multiple property websites (Rightmove, Zoopla, SpareRoom)
- [ ] Interactive CLI for dynamic user input
- [ ] Property image analysis
- [ ] Save search history and favorite properties
- [ ] Email notifications for new matching properties
- [ ] Price trend analysis
- [ ] Commute time calculations to specified locations
- [ ] Multi-language support
- [ ] Web interface with Flask/FastAPI
- [ ] Docker containerization
## Acknowledgments
- [Ollama](https://ollama.ai/) for local LLM hosting
- [DeepSeek](https://www.deepseek.com/) for the R1 model
- [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/) for web scraping
- [OnTheMarket](https://www.onthemarket.com/) for property data

View File

@@ -0,0 +1,217 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "57112e5c-7b0f-4ba7-9022-ae21e8ac0f42",
"metadata": {},
"outputs": [],
"source": [
"# imports\n",
"\n",
"import requests\n",
"from bs4 import BeautifulSoup\n",
"from IPython.display import Markdown, display"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3b71a051-fc0e-46a9-8b1b-b58f685e800d",
"metadata": {},
"outputs": [],
"source": [
"# Constants\n",
"OLLAMA_API = \"http://localhost:11434/api/chat\"\n",
"HEADERS = {\"Content-Type\": \"application/json\"}\n",
"MODEL = \"deepseek-r1:14b\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ed3be9dc-d459-46ac-a8eb-f9b932c4302f",
"metadata": {},
"outputs": [],
"source": [
"headers = {\n",
" \"User-Agent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36\"\n",
"}\n",
"\n",
"class Website:\n",
" def __init__(self, url):\n",
" self.url = url\n",
" try:\n",
" response = requests.get(url, headers=headers, timeout=10)\n",
" response.raise_for_status()\n",
" soup = BeautifulSoup(response.content, 'html.parser')\n",
" self.title = soup.title.string if soup.title else \"No title found\"\n",
" if soup.body:\n",
" for irrelevant in soup.body([\"script\", \"style\", \"img\", \"input\"]):\n",
" irrelevant.decompose()\n",
" self.text = soup.body.get_text(separator=\"\\n\", strip=True)\n",
" else:\n",
" self.text = \"No body content found\"\n",
" except requests.RequestException as e:\n",
" print(f\"Error fetching website: {e}\")\n",
" self.title = \"Error loading page\"\n",
" self.text = \"Could not load page content\""
]
},
{
"cell_type": "markdown",
"id": "17ea76f8-38d9-40b9-8aba-eb957d690a0d",
"metadata": {},
"source": [
"## Without Ollama package"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3a6fd698-8e59-4cd7-bb53-b9375e50f899",
"metadata": {},
"outputs": [],
"source": [
"def house_renting(system_prompt, user_prompt):\n",
" messages = [\n",
" {\"role\": \"system\", \"content\": system_prompt},\n",
" {\"role\": \"user\", \"content\": user_prompt}\n",
" ]\n",
" payload = {\n",
" \"model\": MODEL,\n",
" \"messages\": messages,\n",
" \"stream\": False\n",
" }\n",
" response = requests.post(OLLAMA_API, json=payload, headers=HEADERS)\n",
" return response.json()['message']['content']"
]
},
{
"cell_type": "markdown",
"id": "c826a52c-d1d3-493a-8b7c-6e75b848b453",
"metadata": {},
"source": [
"## Introducing Ollama package "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "519e27da-eeff-4c1b-a8c6-e680fdf01da2",
"metadata": {},
"outputs": [],
"source": [
"import ollama\n",
"\n",
"def house_renting_ollama(system_prompt, user_prompt):\n",
" try:\n",
" messages = [\n",
" {\"role\": \"system\", \"content\": system_prompt},\n",
" {\"role\": \"user\", \"content\": user_prompt}\n",
" ]\n",
" response = ollama.chat(model=MODEL, messages=messages)\n",
" return response['message']['content']\n",
" except Exception as e:\n",
" return f\"Error communicating with Ollama: {e}\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "60e98b28-06d9-4303-b8ca-f7b798244eb4",
"metadata": {},
"outputs": [],
"source": [
"system_prompt = \"\"\"\n",
"You are a helpful real estate assistant specializing in UK property rentals. Your job is to guide users in finding houses to rent, especially in Durham. Follow these rules:\n",
"1. Always ask clarifying questions if user input is vague. Determine location, budget, number of bedrooms, and tenant type (e.g. student, family, professional).\n",
"2. Use structured data provided from the website (like property listings) to identify relevant options.\n",
"3. If listings are provided, filter and rank them based on the user's preferences.\n",
"4. Recommend up to 5 top properties with rent price, bedroom count, key features, and location.\n",
"5. Always respond in markdown with clean formatting using headers, bold text, and bullet points.\n",
"6. If no listings match well, provide tips (e.g. \"try adjusting your budget or search radius\").\n",
"7. Stay concise, helpful, and adapt to whether the user is a student, family, couple, or solo tenant.\n",
"\"\"\"\n",
"\n",
"def user_prompt_for_renting(website, user_needs):\n",
" return f\"\"\"\n",
"I want to rent a house and here's what I'm looking for:\n",
"{user_needs}\n",
"\n",
"Here are the property listings I found on the website titled: \"{website.title}\".\n",
"\n",
"Please analyze them and recommend the best 35 options that match my needs. If none are suitable, tell me why and offer suggestions.\n",
"\n",
"The page content is below:\n",
"{website.text[:4000]}\n",
"\"\"\" # content is truncated for token limits"
]
},
{
"cell_type": "markdown",
"id": "ef420f4b-e3d2-4fbd-bf6f-811f2c8536e0",
"metadata": {},
"source": [
"## Ollama Package"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1cf128af-4ece-41ab-b353-5c8564c7de1d",
"metadata": {},
"outputs": [],
"source": [
"if __name__ == \"__main__\": \n",
" print(\"Starting AI Property Rental Assistant...\")\n",
" print(\"=\" * 50)\n",
" \n",
" website_url = \"https://www.onthemarket.com/to-rent/property/durham/\"\n",
" print(f\"🔍 Scraping properties from: {website_url}\")\n",
" \n",
" website = Website(website_url)\n",
" print(f\"Website Title: {website.title}\")\n",
" print(f\"Content Length: {len(website.text)} characters\")\n",
" print(f\"Successfully scraped property listings\\n\")\n",
" \n",
" user_needs = \"I'm a student looking for a 2-bedroom house in Durham under £2,000/month\"\n",
" print(f\"User Requirements: {user_needs}\\n\")\n",
" \n",
" user_prompt = user_prompt_for_renting(website, user_needs)\n",
" print(\"Generating AI recommendations...\")\n",
" \n",
" # Choose which method to use (comment out the one you don't want)\n",
" \n",
" # Method 1: Using ollama Python library\n",
" output = house_renting_ollama(system_prompt, user_prompt)\n",
" \n",
" # Method 2: Using direct API call\n",
" # output = house_renting(system_prompt, user_prompt)\n",
" \n",
" display(Markdown(output))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [conda env:llms]",
"language": "python",
"name": "conda-env-llms-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}