sov-ai¶
Local artificial intelligence - Ollama integration and inference
Sov-ai provides local LLM integration via Ollama with MCP tool exposure, enabling on-device AI without cloud dependencies. Control your AI, keep your data private.
What It Does¶
- Local LLM Integration - Run Ollama models locally
- MCP Tool Wrapping - Expose models as MCP tools
- Model Management - Download, update, switch models
- Inference API - REST interface for predictions
- Prompt Enhancement - Optimize prompts for local models
Key Capabilities¶
Local Model Support¶
- qwen2.5:7b - Fast local assistant
- llama3:8b - General purpose
- codellama:7b - Code-focused
MCP Integration¶
- Tool Catalog - Available in herald marketplace
- Pricing - Free (no cloud costs)
- Performance - Sub-second first-token latency on typical hardware
- Privacy - Data never leaves device
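As a rough illustration of how a local model could surface in the herald marketplace, the sketch below builds an MCP-style tool descriptor (`name`, `description`, `inputSchema` keys follow the MCP tool schema; the specific tool name and parameter set are assumptions, not sov-ai's actual catalog entries):

```python
def make_tool_descriptor(model: str) -> dict:
    """Build a hypothetical MCP tool descriptor for a local Ollama model."""
    safe = model.replace(":", "_").replace(".", "_")
    return {
        "name": f"sov_ai_generate_{safe}",  # illustrative naming convention
        "description": f"Run inference on the local {model} model via Ollama.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "prompt": {"type": "string", "description": "Prompt text"},
                "temperature": {"type": "number", "default": 0.7},
            },
            "required": ["prompt"],
        },
    }

print(make_tool_descriptor("qwen2.5:7b")["name"])  # sov_ai_generate_qwen2_5_7b
```

Because the tools run against the local Ollama daemon, no per-call pricing metadata is needed in the descriptor.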
Model Management¶
- Download Models - From Ollama registry
- Version Control - Switch between models
- Performance Tuning - Optimize for hardware
- Resource Monitoring - Track CPU/memory
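Under the hood, model management maps onto Ollama's REST API: `GET /api/tags` lists installed models and `POST /api/pull` downloads one from the registry. A minimal sketch (assuming Ollama's default port 11434; only the request is built here, no call is made):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # Ollama's default listen address

def pull_request(model: str) -> urllib.request.Request:
    """Build the POST /api/pull request that downloads a model."""
    body = json.dumps({"model": model, "stream": False}).encode()
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/pull",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def list_models(tags_response: dict) -> list[str]:
    """Extract model names from a GET /api/tags response body."""
    return [m["name"] for m in tags_response.get("models", [])]

# Abbreviated shape of an /api/tags response:
sample = {"models": [{"name": "qwen2.5:7b"}, {"name": "codellama:7b"}]}
print(list_models(sample))  # ['qwen2.5:7b', 'codellama:7b']
```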
Inference Features¶
- Streaming - Real-time output
- Context Windows - 4K-16K context
- Temperature Control - Creativity settings
- Batch Processing - Parallel inference
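When streaming is enabled, Ollama's `/api/generate` endpoint emits one JSON object per line, each carrying a `response` text fragment and a `done` flag on the final line. A small helper to reassemble such a stream (the sample lines are illustrative, not captured output):

```python
import json

def collect_stream(ndjson_lines) -> str:
    """Reassemble the text fragments of a streamed Ollama response."""
    parts = []
    for line in ndjson_lines:
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):  # final line of the stream
            break
    return "".join(parts)

stream = [
    '{"response": "Hel", "done": false}',
    '{"response": "lo!", "done": true}',
]
print(collect_stream(stream))  # Hello!
```

In a real client the same loop would iterate over the HTTP response body line by line, printing each fragment as it arrives.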
Accessing sov-ai¶
URL: http://127.0.0.1:8017
Commands:
python manage.py list-models
python manage.py download-model qwen2.5:7b
python manage.py inference --prompt "Hello"
python manage.py ollama-health
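The REST interface can also be called directly from code. The sketch below assembles a request body and posts it to the service; the `/inference` path and the field names in the payload are assumptions, so check them against the actual API before relying on this:

```python
import json
import urllib.request

SOV_AI_URL = "http://127.0.0.1:8017"

def inference_payload(prompt: str, model: str = "qwen2.5:7b",
                      temperature: float = 0.7) -> dict:
    """Assemble a request body for the inference API (field names assumed)."""
    return {"model": model, "prompt": prompt, "temperature": temperature}

def run_inference(prompt: str) -> str:
    """POST the prompt to the service; the /inference path is hypothetical."""
    req = urllib.request.Request(
        f"{SOV_AI_URL}/inference",
        data=json.dumps(inference_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

print(inference_payload("Hello")["model"])  # qwen2.5:7b
```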
Common Use Cases¶
Run Local Code Assistant¶
Use CodeLlama for code generation without sending code to the cloud.
Summarize Documents¶
Local summarization with full context retention.
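Since the local models support 4K-16K contexts, long documents need to be chunked before summarization. One common approach (a sketch, using word count as a rough proxy for tokens) is to summarize each chunk, then summarize the concatenated summaries:

```python
def chunk_text(text: str, max_words: int = 3000) -> list[str]:
    """Split a document into word-bounded chunks that fit the context window."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

# Each chunk would then be sent to the model with a "Summarize:" prompt,
# and the per-chunk summaries summarized once more in a final pass.
doc = ("word " * 7000).strip()
chunks = chunk_text(doc, max_words=3000)
print(len(chunks))  # 3
```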
Custom Classifications¶
Train and run classifiers locally.
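A lightweight way to run a classifier on a local model is prompt-based: constrain the model to a fixed label set, then map its free-form reply onto the nearest label. A sketch (the label set and prompt wording are illustrative):

```python
LABELS = ["positive", "negative", "neutral"]  # example label set

def classification_prompt(text: str, labels=LABELS) -> str:
    """Constrain the model to answer with exactly one label."""
    return (f"Classify the following text as one of: {', '.join(labels)}.\n"
            f"Reply with the label only.\n\nText: {text}\nLabel:")

def parse_label(response: str, labels=LABELS) -> str:
    """Map a free-form model reply onto the closest known label."""
    reply = response.strip().lower()
    for label in labels:
        if label in reply:
            return label
    return "unknown"

print(parse_label(" Negative.\n"))  # negative
```

The prompt would be sent through the inference API; `parse_label` keeps the pipeline robust to replies like "Negative." or "I'd say negative".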
Private Inference¶
Sensitive data stays on device.
Troubleshooting¶
Ollama not running¶
Start Ollama service separately.
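A quick programmatic check (equivalent to `python manage.py ollama-health`) is to probe Ollama's `/api/version` endpoint on its default port and treat any connection failure as "not running":

```python
import urllib.error
import urllib.request

def ollama_healthy(base_url: str = "http://127.0.0.1:11434") -> bool:
    """Return True if the Ollama server answers on its default port."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=2) as r:
            return r.status == 200
    except (urllib.error.URLError, OSError):
        return False

# A closed port reports unhealthy instead of raising:
print(ollama_healthy("http://127.0.0.1:9"))  # False
```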
Model download fails¶
Check disk space and internet connection.
Slow inference¶
Try smaller model or reduce context window.
Related Realms¶
- herald - Exposes sov-ai tools in marketplace
- 402-payment - Sov-ai tools are free
- All realms - Can use sov-ai inference