🧠 KnowledgeHub - Personal Knowledge Management & Research Assistant
An elegant, fully local, AI-powered knowledge management system that helps you organize, search, and understand your documents using state-of-the-art LLM technology.
✨ Features
🎯 Core Capabilities
- 📤 Document Ingestion: Upload PDF, DOCX, TXT, MD, and HTML files
- ❓ Intelligent Q&A: Ask questions and get answers from your documents using RAG
- 📝 Smart Summarization: Generate concise summaries with key points
- 🔗 Connection Discovery: Find relationships between documents
- 💾 Multi-format Export: Export as Markdown, HTML, or plain text
- 📊 Statistics Dashboard: Track your knowledge base growth
🔒 Privacy-First
- 100% Local Processing: All data stays on your machine
- No Cloud Dependencies: Uses Ollama for local LLM inference
- Open Source: Full transparency and control
⚡ Technology Stack
- LLM: Ollama with Llama 3.2 (3B) or Llama 3.1 (8B)
- Embeddings: sentence-transformers (all-MiniLM-L6-v2)
- Vector Database: ChromaDB
- UI: Gradio
- Document Processing: pypdf, python-docx, beautifulsoup4
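These pieces fit together in just a few lines. Below is a minimal smoke-test sketch (not the app's actual wiring) that embeds a sentence, stores it in ChromaDB, and asks Ollama's local HTTP API for a completion; the collection name `smoke_test` is made up for this example:

```python
import chromadb
import requests
from sentence_transformers import SentenceTransformer

# Embed a sentence with the same model the app uses.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
vec = embedder.encode("hello knowledge base").tolist()

# Store it in a local, persistent ChromaDB collection.
client = chromadb.PersistentClient(path="vectorstore")
collection = client.get_or_create_collection("smoke_test")
collection.add(ids=["probe"], documents=["hello knowledge base"], embeddings=[vec])

# Ask Ollama for a completion (its local HTTP API listens on port 11434 by default).
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.2", "prompt": "Say hello in five words.", "stream": False},
    timeout=60,
)
print(resp.json()["response"])
```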
🚀 Quick Start
Prerequisites
- Python 3.8+ installed
- Ollama installed and running
Installing Ollama
macOS/Linux:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Windows: Download the installer from ollama.com/download
Installation

1. Clone or download this repository.

2. Install the Python dependencies:

```bash
pip install -r requirements.txt
```

3. Pull a Llama model with Ollama:

```bash
# For faster inference (recommended for most users)
ollama pull llama3.2

# OR for better quality (requires more RAM)
ollama pull llama3.1
```

4. Start the Ollama server (if it is not already running):

```bash
ollama serve
```

5. Launch KnowledgeHub:

```bash
python app.py
```
The application will open in your browser at http://127.0.0.1:7860
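If port 7860 is already in use, Gradio's launch() accepts an explicit port. A one-line sketch, assuming the Gradio app object at the bottom of app.py is named demo (a hypothetical name):

```python
# Hypothetical: run on a different local port if 7860 is busy.
demo.launch(server_name="127.0.0.1", server_port=7861)
```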
📖 Usage Guide
1. Upload Documents
- Go to the "Upload Documents" tab
- Select a file (PDF, DOCX, TXT, MD, or HTML)
- Click "Upload & Process"
- The document will be chunked and stored in your local vector database
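Chunking here means splitting each document into overlapping character windows so retrieval can return passage-sized hits. A minimal sketch of the idea (illustrative only; the real logic lives in utils/document_parser.py and is configurable, see Configuration below):

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into overlapping character windows."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]

chunks = chunk_text("some long document text " * 300)
print(len(chunks), "chunks of up to 1000 characters each")
```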
2. Ask Questions
- Go to the "Ask Questions" tab
- Type your question in natural language
- Adjust the number of sources to retrieve (default: 5)
- Click "Ask" to get an AI-generated answer with sources
3. Summarize Documents
- Go to the "Summarize" tab
- Select a document from the dropdown
- Click "Generate Summary"
- Get a concise summary with key points
4. Find Connections
- Go to the "Find Connections" tab
- Select a document to analyze
- Adjust how many related documents to find
- See documents that are semantically similar
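Semantic similarity boils down to comparing embeddings. A minimal sketch of the idea behind "Find Connections", using sentence-transformers' built-in cosine similarity (toy documents; the real logic lives in agents/connection_agent.py):

```python
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")
docs = {
    "notes.md": "Quarterly revenue grew 12% on cloud sales.",
    "todo.txt": "Buy milk and call the dentist.",
    "report.pdf": "The cloud segment drove most of the revenue increase.",
}
names = list(docs)
emb = embedder.encode(list(docs.values()), convert_to_tensor=True)
sims = util.cos_sim(emb, emb)  # pairwise cosine similarity matrix

# Rank every other document by similarity to the selected one.
target = names.index("notes.md")
related = sorted(
    ((names[j], float(sims[target][j])) for j in range(len(names)) if j != target),
    key=lambda pair: pair[1],
    reverse=True,
)
print(related)  # report.pdf should rank above todo.txt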
5. Export Knowledge
- Go to the "Export" tab
- Choose your format (Markdown, HTML, or Text)
- Click "Export" to download your knowledge base
6. View Statistics
- Go to the "Statistics" tab
- See overview of your knowledge base
- Track total documents, chunks, and characters
🏗️ Architecture
```
KnowledgeHub/
├── agents/                  # Specialized AI agents
│   ├── base_agent.py        # Base class for all agents
│   ├── ingestion_agent.py   # Document processing
│   ├── question_agent.py    # RAG-based Q&A
│   ├── summary_agent.py     # Summarization
│   ├── connection_agent.py  # Finding relationships
│   └── export_agent.py      # Exporting data
├── models/                  # Data models
│   ├── document.py          # Document structures
│   └── knowledge_graph.py   # Graph structures
├── utils/                   # Utilities
│   ├── ollama_client.py     # Ollama API wrapper
│   ├── embeddings.py        # Embedding generation
│   └── document_parser.py   # File parsing
├── vectorstore/             # ChromaDB storage (auto-created)
├── temp_uploads/            # Temporary file storage (auto-created)
├── app.py                   # Main Gradio application
└── requirements.txt         # Python dependencies
```
🎯 Multi-Agent Framework
KnowledgeHub uses a sophisticated multi-agent architecture:
- Ingestion Agent: Parses documents, creates chunks, generates embeddings
- Question Agent: Retrieves relevant context and answers questions
- Summary Agent: Creates concise summaries and extracts key points
- Connection Agent: Finds semantic relationships between documents
- Export Agent: Formats and exports knowledge in multiple formats
Each agent is independent, reusable, and focused on a specific task, following best practices in agentic AI development.
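A hedged sketch of that base-agent pattern (the actual interface in agents/base_agent.py may differ; generate() here stands in for the OllamaClient call):

```python
from abc import ABC, abstractmethod

class BaseAgent(ABC):
    """Shared plumbing: every agent wraps one LLM client and one task."""

    def __init__(self, llm_client):
        self.llm_client = llm_client

    @abstractmethod
    def run(self, **kwargs):
        """Perform the agent's single task."""

class SummaryAgent(BaseAgent):
    def run(self, text):
        return self.llm_client.generate("Summarize with key points:\n" + text)
```

Because each agent depends only on the client it is given, agents can be tested in isolation and swapped without touching the UI layer.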
⚙️ Configuration
Changing Models
Edit app.py to use a different model:

```python
# For Llama 3.1 8B (better quality, more RAM)
self.llm_client = OllamaClient(model="llama3.1")

# For Llama 3.2 3B (faster, less RAM)
self.llm_client = OllamaClient(model="llama3.2")
```
Adjusting Chunk Size
Edit agents/ingestion_agent.py:

```python
self.parser = DocumentParser(
    chunk_size=1000,    # Characters per chunk
    chunk_overlap=200,  # Overlap between chunks
)
```
Changing Embedding Model
Edit app.py:

```python
self.embedding_model = EmbeddingModel(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
```

Note: if you switch embedding models, re-upload your documents. Vectors produced by different models live in different spaces (often with different dimensions), so embeddings stored with the old model cannot be compared against new ones.
🔧 Troubleshooting
"Cannot connect to Ollama"
- Ensure Ollama is installed:
ollama --version - Start the Ollama service:
ollama serve - Verify the model is pulled:
ollama list
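A quick probe, assuming Ollama's default local endpoint (GET /api/tags lists the models Ollama has pulled):

```python
import requests

try:
    r = requests.get("http://localhost:11434/api/tags", timeout=5)
    print("Ollama reachable; models:", [m["name"] for m in r.json()["models"]])
except requests.ConnectionError:
    print("Ollama is not running; start it with `ollama serve`.")
```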
"Module not found" errors
- Ensure all dependencies are installed:
pip install -r requirements.txt - Try upgrading pip:
pip install --upgrade pip
"Out of memory" errors
- Use Llama 3.2 (3B) instead of Llama 3.1 (8B)
- Reduce chunk_size in document parser
- Process fewer documents at once
Slow response times
- Ensure you're using a CUDA-enabled GPU (if available)
- Reduce the number of retrieved chunks (top_k parameter)
- Use a smaller model (llama3.2)
🎓 Learning Resources
This project demonstrates key concepts in LLM engineering:
- RAG (Retrieval Augmented Generation): Combining retrieval with generation
- Vector Databases: Using ChromaDB for semantic search
- Multi-Agent Systems: Specialized agents working together
- Embeddings: Semantic representation of text
- Local LLM Deployment: Using Ollama for privacy-focused AI
📊 Performance
Hardware Requirements:
- Minimum: 8GB RAM, CPU
- Recommended: 16GB RAM, GPU (NVIDIA with CUDA)
- Optimal: 32GB RAM, GPU (RTX 3060 or better)
Processing Speed (Llama 3.2 on M1 Mac):
- Document ingestion: ~2-5 seconds per page
- Question answering: ~5-15 seconds
- Summarization: ~10-20 seconds
🤝 Contributing
This is a learning project showcasing LLM engineering principles. Feel free to:
- Experiment with different models
- Add new agents for specialized tasks
- Improve the UI
- Optimize performance
📄 License
This project is open source and available for educational purposes.
🙏 Acknowledgments
Built with:
- Ollama - Local LLM runtime
- Gradio - UI framework
- ChromaDB - Vector database
- Sentence Transformers - Embeddings
- Llama - Meta's open source LLMs
🎯 Next Steps
Potential enhancements:
- Add support for images and diagrams
- Implement multi-document chat history
- Build a visual knowledge graph
- Add collaborative features
- Create mobile app interface
- Implement advanced filters and search
- Add citation tracking
- Create automated study guides
Made with ❤️ for the LLM Engineering Community