Add notebooks for Muhammad Qasim Sheikh in community-contributions

This commit is contained in:
aashahid
2025-10-21 17:51:37 +05:00
parent ef34387aee
commit 0b4e4be9a0
12 changed files with 1284 additions and 0 deletions

View File

@@ -0,0 +1,50 @@
# **Automated Bitcoin Daily Summary Generator**
This project automates the process of generating a daily summary of the Bitcoin network's status. It fetches real-time data from multiple public API endpoints, processes it, and then uses a Large Language Model (LLM) to generate a clear, structured, and human-readable report in Markdown format.
## **Project Overview**
The core goal of this project is to provide a snapshot of key Bitcoin metrics without manual analysis. By leveraging the Braiins Public API for data and OpenAI's GPT models for summarization, it can produce insightful daily reports covering market trends, network health, miner revenue, and future outlooks like the next halving event.
### **Key Features**
- **Automated Data Fetching**: Pulls data from 7 different Braiins API endpoints covering price, hashrate, difficulty, transaction fees, and more.
- **Data Cleaning**: Pre-processes the raw JSON data to make it clean and suitable for the LLM.
- **Intelligent Summarization**: Uses an advanced LLM to analyze the data and generate a structured report with explanations for technical terms.
- **Dynamic Dating**: The report is always dated for the day it is run, providing a timely summary regardless of the timestamps in the source data.
- **Markdown Output**: Generates a clean, well-formatted Markdown file that is easy to read or integrate into other systems.
## **How It Works**
The project is split into two main files:
1. **utils.py**: A utility script responsible for all data fetching and cleaning operations.
- It defines the Braiins API endpoints to be queried.
- It contains functions to handle HTTP requests, parse JSON responses, and clean up keys and values to ensure consistency.
2. **day_1_bitcoin_daily_brief.ipynb**: A Jupyter Notebook that acts as the main orchestrator.
   - It imports the necessary functions from `utils.py`.
   - It calls `fetch_clean_data()` to get the latest Bitcoin network data.
- It constructs a detailed system and user prompt for the LLM, explicitly instructing it on the desired format and, crucially, to use the current date for the summary.
- It sends the data and prompt to the OpenAI API.
- It receives the generated summary and displays it as formatted Markdown.
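
The date-pinning step above is the crux of the prompt design. Here is a minimal, standalone sketch of that prompt-construction logic (the `build_prompts` helper is hypothetical; the notebook inlines this):

```python
import datetime
import json

def build_prompts(data: dict, today: str) -> tuple[str, str]:
    # Pin the report to today's date so the LLM does not infer
    # the reporting period from historical records in the data.
    system_prompt = (
        "You are a professional crypto analyst. "
        f'The summary title must be: "Bitcoin Daily Summary - {today}".'
    )
    user_prompt = (
        f"Today's date is {today}. Use this as the reference point.\n"
        f"Here is the data to summarize:\n{json.dumps(data, indent=2)}"
    )
    return system_prompt, user_prompt

today = datetime.datetime.now().strftime('%B %d, %Y')
system_prompt, user_prompt = build_prompts({"price_stats": {"price": "67000"}}, today)
```

The two strings are then sent as the `system` and `user` messages of a chat completion request.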
## **Setup and Usage**
To run this project, you will need Python and the required libraries installed.
### **1. Prerequisites**
- Python 3.x
- Jupyter Notebook or JupyterLab
### **2. Installation**
- Install the necessary Python libraries: `pip install requests openai python-dotenv jupyter`
### **3. Configuration**
You need an API key from OpenAI to use the summarization feature.
1. Create a file named `.env` in the root directory of the project.
2. Add your OpenAI API key to the `.env` file as follows:
   `OPENAI_API_KEY='your_openai_api_key_here'`

View File

@@ -0,0 +1,156 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "abaef96b",
"metadata": {},
"source": [
"## Importing The Libraries"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f90c541b",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import datetime\n",
"from utils import fetch_clean_data\n",
"from openai import OpenAI\n",
"from IPython.display import Markdown, display\n",
"from dotenv import load_dotenv\n",
"import json"
]
},
{
"cell_type": "markdown",
"id": "6e6c864b",
"metadata": {},
"source": [
"## Configuration"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "be62299d",
"metadata": {},
"outputs": [],
"source": [
"load_dotenv(override=True)\n",
"api_key = os.getenv('OPENAI_API_KEY')\n",
"\n",
"client = OpenAI(api_key=api_key)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "3aa8e3e2",
"metadata": {},
"outputs": [],
"source": [
"def generate_markdown_summary(data: dict, today_date_str: str) -> str:\n",
"    \"\"\"\n",
"    Send cleaned Bitcoin data to an LLM and receive a Markdown summary.\n",
"    \"\"\"\n",
"\n",
"    system_prompt = f\"\"\"\n",
"    You are a professional crypto analyst. Your job is to read the provided Bitcoin network data\n",
"    and write a clear, structured report that can be read directly as a daily summary.\n",
"\n",
"    Following are the rules that you must adhere to:\n",
"    - **IMPORTANT**: The summary title MUST use today's date: {today_date_str}. The title must be: \"Bitcoin Daily Summary - {today_date_str}\".\n",
"    - **CRITICAL**: Do NOT infer the reporting period from the data. The data contains historical records, but your report is for {today_date_str}.\n",
"    - Include **headings** for sections like \"Market Overview\", \"Network Metrics Explained\", \"Miner Revenue Trends\", and \"Halving Outlook\".\n",
"    - Use **bullet points** for key metrics.\n",
"    - Use a **table** for historical or time-series data if available.\n",
"    - Explain important terms (like hashrate, difficulty, transaction fees) in plain language.\n",
"\n",
"    Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.\n",
"    \"\"\"\n",
"\n",
"    # Convert the Python data dictionary into a clean JSON string for the prompt\n",
"    data_str = json.dumps(data, indent=2)\n",
"\n",
"    user_prompt = f\"\"\"\n",
"    Today's date is {today_date_str}. Use this as the reference point for the report.\n",
"\n",
"    The following data may contain historical records (e.g., from 2024),\n",
"    but you must treat it as background context and write the summary as of {today_date_str}.\n",
"\n",
"    Here is the data for you to summarize:\n",
"    {data_str}\n",
"    \"\"\"\n",
"\n",
"    response = client.chat.completions.create(\n",
"        model=\"gpt-4.1-mini\",\n",
"        messages=[\n",
"            {\"role\": \"system\", \"content\": system_prompt},\n",
"            {\"role\": \"user\", \"content\": user_prompt}\n",
"        ]\n",
"    )\n",
"\n",
"    markdown_text = response.choices[0].message.content.strip()\n",
"    return markdown_text"
]
},
{
"cell_type": "markdown",
"id": "1e8c2d7d",
"metadata": {},
"source": [
"## Main Function"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "05059ed9",
"metadata": {},
"outputs": [],
"source": [
"def main():\n",
"    # 0. Get today's date as a string\n",
"    today_str = datetime.datetime.now().strftime('%B %d, %Y')\n",
"\n",
"    # 1. Fetch and clean data\n",
"    print(\"Fetching Bitcoin data...\")\n",
"    data = fetch_clean_data()\n",
"\n",
"    # 2. Generate Markdown summary\n",
"    print(\"Generating LLM summary...\")\n",
"    markdown_report = generate_markdown_summary(data, today_str)\n",
"\n",
"    # 3. Display Output\n",
"    display(Markdown(markdown_report))\n",
"\n",
"if __name__ == \"__main__\":\n",
"    main()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "llm-engineering",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,121 @@
# utils.py
import requests
import re
import datetime
import logging
from typing import Dict, Optional, Union

# -----------------------------------------
# Logging setup
# -----------------------------------------
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# -----------------------------------------
# Braiins API endpoints (7 selected)
# -----------------------------------------
BRAIINS_APIS = {
    'price_stats': 'https://insights.braiins.com/api/v1.0/price-stats',
    'hashrate_stats': 'https://insights.braiins.com/api/v1.0/hashrate-stats',
    'difficulty_stats': 'https://insights.braiins.com/api/v1.0/difficulty-stats',
    'transaction_fees_history': 'https://insights.braiins.com/api/v1.0/transaction-fees-history',
    'daily_revenue_history': 'https://insights.braiins.com/api/v1.0/daily-revenue-history',
    'hashrate_value_history': 'https://insights.braiins.com/api/v1.0/hashrate-value-history',
    'halvings': 'https://insights.braiins.com/api/v2.0/halvings'
}

# -----------------------------------------
# Utility Functions
# -----------------------------------------
def clean_value(value):
    """Clean strings: remove brackets/quotes and standardize whitespace."""
    if value is None:
        return ""
    s = str(value)
    s = s.replace(",", " ")
    s = re.sub(r"[\[\]\{\}\(\)]", "", s)
    s = s.replace('"', "").replace("'", "")
    s = re.sub(r"\s+", " ", s)
    return s.strip()

def parse_date(date_str: str) -> Optional[str]:
    """Parse dates into a standard readable format."""
    if not date_str or not isinstance(date_str, str):
        return None
    try:
        if 'T' in date_str:
            return datetime.datetime.fromisoformat(date_str.replace('Z', '').split('.')[0]).strftime('%Y-%m-%d %H:%M:%S')
        if '-' in date_str and len(date_str) == 10:
            return datetime.datetime.strptime(date_str, '%Y-%m-%d').strftime('%Y-%m-%d %H:%M:%S')
        if '/' in date_str and len(date_str) == 10:
            return datetime.datetime.strptime(date_str, '%m/%d/%Y').strftime('%Y-%m-%d %H:%M:%S')
    except Exception:
        return date_str
    return date_str

def fetch_endpoint_data(url: str) -> Optional[Union[Dict, list]]:
    """Generic GET request to a Braiins API endpoint."""
    try:
        resp = requests.get(url, timeout=15)
        resp.raise_for_status()
        return resp.json()
    except Exception as e:
        logger.error(f"Failed to fetch {url}: {e}")
        return None

def clean_and_process_data(data: Union[Dict, list]) -> Union[Dict, list]:
    """Clean all keys and values in the fetched data."""
    if isinstance(data, dict):
        return {clean_value(k): clean_value(v) for k, v in data.items()}
    elif isinstance(data, list):
        cleaned_list = []
        for item in data:
            if isinstance(item, dict):
                cleaned_list.append({clean_value(k): clean_value(v) for k, v in item.items()})
            else:
                cleaned_list.append(clean_value(item))
        return cleaned_list
    return clean_value(data)

# -----------------------------------------
# Main data fetcher
# -----------------------------------------
def fetch_clean_data(history_limit: int = 30) -> Dict[str, Union[Dict, list]]:
    """
    Fetch and clean data from the 7 selected Braiins endpoints.
    For historical endpoints, limits the number of records returned.
    Returns a dictionary ready to be passed to an LLM.
    """
    logger.info("Fetching Bitcoin network data from Braiins...")
    results = {}
    for key, url in BRAIINS_APIS.items():
        logger.info(f"Fetching {key} ...")
        raw_data = fetch_endpoint_data(url)
        if raw_data is not None:
            # For historical endpoints, keep only the most recent records
            if "history" in key and isinstance(raw_data, list):
                logger.info(f"Limiting {key} data to the last {history_limit} records.")
                raw_data = raw_data[-history_limit:]
            results[key] = clean_and_process_data(raw_data)
        else:
            results[key] = {"error": "Failed to fetch"}
    logger.info("Data fetching and cleaning complete.")
    return results

# -----------------------------------------
# Local test run (optional)
# -----------------------------------------
if __name__ == "__main__":
    data = fetch_clean_data()
    print("Sample keys fetched:", list(data.keys()))

View File

@@ -0,0 +1,50 @@
# **Automated Bitcoin Daily Summary Generator**
This project automates the process of generating a daily summary of the Bitcoin network's status. It fetches real-time data from multiple public API endpoints, processes it, and then uses a Large Language Model (LLM) to generate a clear, structured, and human-readable report in Markdown format.
## **Project Overview**
The core goal of this project is to provide a snapshot of key Bitcoin metrics without manual analysis. By leveraging the Braiins Public API for data and OpenAI's GPT models for summarization, it can produce insightful daily reports covering market trends, network health, miner revenue, and future outlooks like the next halving event.
### **Key Features**
- **Automated Data Fetching**: Pulls data from 7 different Braiins API endpoints covering price, hashrate, difficulty, transaction fees, and more.
- **Data Cleaning**: Pre-processes the raw JSON data to make it clean and suitable for the LLM.
- **Intelligent Summarization**: Uses an advanced LLM to analyze the data and generate a structured report with explanations for technical terms.
- **Dynamic Dating**: The report is always dated for the day it is run, providing a timely summary regardless of the timestamps in the source data.
- **Markdown Output**: Generates a clean, well-formatted Markdown file that is easy to read or integrate into other systems.
## **How It Works**
The project is split into two main files:
1. **utils.py**: A utility script responsible for all data fetching and cleaning operations.
- It defines the Braiins API endpoints to be queried.
- It contains functions to handle HTTP requests, parse JSON responses, and clean up keys and values to ensure consistency.
2. **day_1_bitcoin_daily_brief.ipynb**: A Jupyter Notebook that acts as the main orchestrator.
   - It imports the necessary functions from `utils.py`.
   - It calls `fetch_clean_data()` to get the latest Bitcoin network data.
- It constructs a detailed system and user prompt for the LLM, explicitly instructing it on the desired format and, crucially, to use the current date for the summary.
- It sends the data and prompt to the OpenAI API.
- It receives the generated summary and displays it as formatted Markdown.
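
The data-cleaning step above is easy to see in isolation; this standalone sketch mirrors the `clean_value` logic from `utils.py`:

```python
import re

def clean_value(value):
    # Mirrors utils.clean_value: flatten any value into a plain,
    # bracket- and quote-free string with single spaces.
    if value is None:
        return ""
    s = str(value).replace(",", " ")
    s = re.sub(r"[\[\]\{\}\(\)]", "", s)
    s = s.replace('"', "").replace("'", "")
    return re.sub(r"\s+", " ", s).strip()

print(clean_value('{"price": "67,000"}'))  # prints: price: 67 000
```

Applying this to every key and value keeps the JSON payload compact and predictable before it reaches the LLM.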
## **Setup and Usage**
To run this project, you will need Python and the required libraries installed.
### **1. Prerequisites**
- Python 3.x
- Jupyter Notebook or JupyterLab
### **2. Installation**
- Install the necessary Python libraries: `pip install requests openai python-dotenv jupyter`
### **3. Configuration**
You need an API key from OpenAI to use the summarization feature.
1. Create a file named `.env` in the root directory of the project.
2. Add your OpenAI API key to the `.env` file as follows:
   `OPENAI_API_KEY='your_openai_api_key_here'`

View File

@@ -0,0 +1,152 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "abaef96b",
"metadata": {},
"source": [
"## Importing The Libraries"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "f90c541b",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import datetime\n",
"from utils import fetch_clean_data\n",
"from openai import OpenAI\n",
"from IPython.display import Markdown, display\n",
"import json"
]
},
{
"cell_type": "markdown",
"id": "6e6c864b",
"metadata": {},
"source": [
"## Configuration"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "be62299d",
"metadata": {},
"outputs": [],
"source": [
"client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "3aa8e3e2",
"metadata": {},
"outputs": [],
"source": [
"def generate_markdown_summary(data: dict, today_date_str: str) -> str:\n",
"    \"\"\"\n",
"    Send cleaned Bitcoin data to an LLM and receive a Markdown summary.\n",
"    \"\"\"\n",
"\n",
"    system_prompt = f\"\"\"\n",
"    You are a professional crypto analyst. Your job is to read the provided Bitcoin network data\n",
"    and write a clear, structured report that can be read directly as a daily summary.\n",
"\n",
"    Following are the rules that you must adhere to:\n",
"    - **IMPORTANT**: The summary title MUST use today's date: {today_date_str}. The title must be: \"Bitcoin Daily Summary - {today_date_str}\".\n",
"    - **CRITICAL**: Do NOT infer the reporting period from the data. The data contains historical records, but your report is for {today_date_str}.\n",
"    - Include **headings** for sections like \"Market Overview\", \"Network Metrics Explained\", \"Miner Revenue Trends\", and \"Halving Outlook\".\n",
"    - Use **bullet points** for key metrics.\n",
"    - Use a **table** for historical or time-series data if available.\n",
"    - Explain important terms (like hashrate, difficulty, transaction fees) in plain language.\n",
"\n",
"    Respond in markdown. Do not wrap the markdown in a code block - respond just with the markdown.\n",
"    \"\"\"\n",
"\n",
"    # Convert the Python data dictionary into a clean JSON string for the prompt\n",
"    data_str = json.dumps(data, indent=2)\n",
"\n",
"    user_prompt = f\"\"\"\n",
"    Today's date is {today_date_str}. Use this as the reference point for the report.\n",
"\n",
"    The following data may contain historical records (e.g., from 2024),\n",
"    but you must treat it as background context and write the summary as of {today_date_str}.\n",
"\n",
"    Here is the data for you to summarize:\n",
"    {data_str}\n",
"    \"\"\"\n",
"\n",
"    response = client.chat.completions.create(\n",
"        model=\"llama3.2\",\n",
"        messages=[\n",
"            {\"role\": \"system\", \"content\": system_prompt},\n",
"            {\"role\": \"user\", \"content\": user_prompt}\n",
"        ]\n",
"    )\n",
"\n",
"    markdown_text = response.choices[0].message.content.strip()\n",
"    return markdown_text"
]
},
{
"cell_type": "markdown",
"id": "1e8c2d7d",
"metadata": {},
"source": [
"## Main Function"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "05059ed9",
"metadata": {},
"outputs": [],
"source": [
"def main():\n",
"    # 0. Get today's date as a string\n",
"    today_str = datetime.datetime.now().strftime('%B %d, %Y')\n",
"\n",
"    # 1. Fetch and clean data\n",
"    print(\"Fetching Bitcoin data...\")\n",
"    data = fetch_clean_data()\n",
"\n",
"    # 2. Generate Markdown summary\n",
"    print(\"Generating LLM summary...\")\n",
"    markdown_report = generate_markdown_summary(data, today_str)\n",
"\n",
"    # 3. Display Output\n",
"    display(Markdown(markdown_report))\n",
"\n",
"if __name__ == \"__main__\":\n",
"    main()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "llm-engineering",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,113 @@
# utils.py
import requests
import re
import datetime
import logging
from typing import Dict, Optional, Union

# -----------------------------------------
# Logging setup
# -----------------------------------------
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# -----------------------------------------
# Braiins API endpoints (7 selected)
# -----------------------------------------
BRAIINS_APIS = {
    'price_stats': 'https://insights.braiins.com/api/v1.0/price-stats',
    'hashrate_stats': 'https://insights.braiins.com/api/v1.0/hashrate-stats',
    'difficulty_stats': 'https://insights.braiins.com/api/v1.0/difficulty-stats',
    'transaction_fees_history': 'https://insights.braiins.com/api/v1.0/transaction-fees-history',
    'daily_revenue_history': 'https://insights.braiins.com/api/v1.0/daily-revenue-history',
    'hashrate_value_history': 'https://insights.braiins.com/api/v1.0/hashrate-value-history',
    'halvings': 'https://insights.braiins.com/api/v2.0/halvings'
}

# -----------------------------------------
# Utility Functions
# -----------------------------------------
def clean_value(value):
    """Clean strings: remove brackets/quotes and standardize whitespace."""
    if value is None:
        return ""
    s = str(value)
    s = s.replace(",", " ")
    s = re.sub(r"[\[\]\{\}\(\)]", "", s)
    s = s.replace('"', "").replace("'", "")
    s = re.sub(r"\s+", " ", s)
    return s.strip()

def parse_date(date_str: str) -> Optional[str]:
    """Parse dates into a standard readable format."""
    if not date_str or not isinstance(date_str, str):
        return None
    try:
        if 'T' in date_str:
            return datetime.datetime.fromisoformat(date_str.replace('Z', '').split('.')[0]).strftime('%Y-%m-%d %H:%M:%S')
        if '-' in date_str and len(date_str) == 10:
            return datetime.datetime.strptime(date_str, '%Y-%m-%d').strftime('%Y-%m-%d %H:%M:%S')
        if '/' in date_str and len(date_str) == 10:
            return datetime.datetime.strptime(date_str, '%m/%d/%Y').strftime('%Y-%m-%d %H:%M:%S')
    except Exception:
        return date_str
    return date_str

def fetch_endpoint_data(url: str) -> Optional[Union[Dict, list]]:
    """Generic GET request to a Braiins API endpoint."""
    try:
        resp = requests.get(url, timeout=15)
        resp.raise_for_status()
        return resp.json()
    except Exception as e:
        logger.error(f"Failed to fetch {url}: {e}")
        return None

def clean_and_process_data(data: Union[Dict, list]) -> Union[Dict, list]:
    """Clean all keys and values in the fetched data."""
    if isinstance(data, dict):
        return {clean_value(k): clean_value(v) for k, v in data.items()}
    elif isinstance(data, list):
        cleaned_list = []
        for item in data:
            if isinstance(item, dict):
                cleaned_list.append({clean_value(k): clean_value(v) for k, v in item.items()})
            else:
                cleaned_list.append(clean_value(item))
        return cleaned_list
    return clean_value(data)

# -----------------------------------------
# Main data fetcher
# -----------------------------------------
def fetch_clean_data() -> Dict[str, Union[Dict, list]]:
    """
    Fetch and clean data from the 7 selected Braiins endpoints.
    Returns a dictionary ready to be passed to an LLM.
    """
    logger.info("Fetching Bitcoin network data from Braiins...")
    results = {}
    for key, url in BRAIINS_APIS.items():
        logger.info(f"Fetching {key} ...")
        raw_data = fetch_endpoint_data(url)
        if raw_data is not None:
            results[key] = clean_and_process_data(raw_data)
        else:
            results[key] = {"error": "Failed to fetch"}
    logger.info("Data fetching and cleaning complete.")
    return results

# -----------------------------------------
# Local test run (optional)
# -----------------------------------------
if __name__ == "__main__":
    data = fetch_clean_data()
    print("Sample keys fetched:", list(data.keys()))