sov-ai¶
Local artificial intelligence - Ollama integration and inference
Sov-ai provides local LLM integration via Ollama with MCP tool exposure, enabling on-device AI without cloud dependencies. Control your AI, keep your data private.
What It Does¶
- Local LLM Integration - Run Ollama models locally
- MCP Tool Wrapping - Expose models as MCP tools
- Model Management - Download, update, switch models
- Inference API - REST interface for predictions
- Prompt Enhancement - Optimize prompts for local models
Key Capabilities¶
Local Model Support¶
- qwen2.5:7b - Fast local assistant
- llama3:8b - General purpose
- codellama:7b - Code-focused
MCP Integration¶
- Tool Catalog - Available in herald marketplace
- Pricing - Free (no cloud costs)
- Performance - Sub-second first-token latency on typical hardware
- Privacy - Data never leaves device
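As a rough illustration of how a local model could surface in the herald marketplace, the sketch below builds an MCP-style tool descriptor (`name`, `description`, `inputSchema` keys follow the MCP tool schema; the specific tool name and parameter set are assumptions, not sov-ai's actual catalog entries):

```python
def make_tool_descriptor(model: str) -> dict:
    """Build a hypothetical MCP tool descriptor for a local Ollama model."""
    safe = model.replace(":", "_").replace(".", "_")
    return {
        "name": f"sov_ai_generate_{safe}",  # illustrative naming convention
        "description": f"Run inference on the local {model} model via Ollama.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "prompt": {"type": "string", "description": "Prompt text"},
                "temperature": {"type": "number", "default": 0.7},
            },
            "required": ["prompt"],
        },
    }

print(make_tool_descriptor("qwen2.5:7b")["name"])  # sov_ai_generate_qwen2_5_7b
```

Because the tools run against the local Ollama daemon, no per-call pricing metadata is needed in the descriptor.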
Model Management¶
- Download Models - From Ollama registry
- Version Control - Switch between models
- Performance Tuning - Optimize for hardware
- Resource Monitoring - Track CPU/memory
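Under the hood, model management maps onto Ollama's REST API: `GET /api/tags` lists installed models and `POST /api/pull` downloads one from the registry. A minimal sketch (assuming Ollama's default port 11434; only the request is built here, no call is made):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # Ollama's default listen address

def pull_request(model: str) -> urllib.request.Request:
    """Build the POST /api/pull request that downloads a model."""
    body = json.dumps({"model": model, "stream": False}).encode()
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/pull",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def list_models(tags_response: dict) -> list[str]:
    """Extract model names from a GET /api/tags response body."""
    return [m["name"] for m in tags_response.get("models", [])]

# Abbreviated shape of an /api/tags response:
sample = {"models": [{"name": "qwen2.5:7b"}, {"name": "codellama:7b"}]}
print(list_models(sample))  # ['qwen2.5:7b', 'codellama:7b']
```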
Inference Features¶
- Streaming - Real-time output
- Context Windows - 4K-16K context
- Temperature Control - Creativity settings
- Batch Processing - Parallel inference
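When streaming is enabled, Ollama's `/api/generate` endpoint emits one JSON object per line, each carrying a `response` text fragment and a `done` flag on the final line. A small helper to reassemble such a stream (the sample lines are illustrative, not captured output):

```python
import json

def collect_stream(ndjson_lines) -> str:
    """Reassemble the text fragments of a streamed Ollama response."""
    parts = []
    for line in ndjson_lines:
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):  # final line of the stream
            break
    return "".join(parts)

stream = [
    '{"response": "Hel", "done": false}',
    '{"response": "lo!", "done": true}',
]
print(collect_stream(stream))  # Hello!
```

In a real client the same loop would iterate over the HTTP response body line by line, printing each fragment as it arrives.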
Accessing sov-ai¶
URL: http://127.0.0.1:8017
Commands:
python manage.py list-models
python manage.py download-model qwen2.5:7b
python manage.py inference --prompt "Hello"
python manage.py ollama-health
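The REST interface can also be called directly from code. The sketch below assembles a request body and posts it to the service; the `/inference` path and the field names in the payload are assumptions, so check them against the actual API before relying on this:

```python
import json
import urllib.request

SOV_AI_URL = "http://127.0.0.1:8017"

def inference_payload(prompt: str, model: str = "qwen2.5:7b",
                      temperature: float = 0.7) -> dict:
    """Assemble a request body for the inference API (field names assumed)."""
    return {"model": model, "prompt": prompt, "temperature": temperature}

def run_inference(prompt: str) -> str:
    """POST the prompt to the service; the /inference path is hypothetical."""
    req = urllib.request.Request(
        f"{SOV_AI_URL}/inference",
        data=json.dumps(inference_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

print(inference_payload("Hello")["model"])  # qwen2.5:7b
```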
Common Use Cases¶
Run Local Code Assistant¶
Use CodeLlama for code generation without sending code to the cloud.
Summarize Documents¶
Local summarization with full context retention.
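Since the local models support 4K-16K contexts, long documents need to be chunked before summarization. One common approach (a sketch, using word count as a rough proxy for tokens) is to summarize each chunk, then summarize the concatenated summaries:

```python
def chunk_text(text: str, max_words: int = 3000) -> list[str]:
    """Split a document into word-bounded chunks that fit the context window."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

# Each chunk would then be sent to the model with a "Summarize:" prompt,
# and the per-chunk summaries summarized once more in a final pass.
doc = ("word " * 7000).strip()
chunks = chunk_text(doc, max_words=3000)
print(len(chunks))  # 3
```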
Custom Classifications¶
Train and run classifiers locally.
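A lightweight way to run a classifier on a local model is prompt-based: constrain the model to a fixed label set, then map its free-form reply onto the nearest label. A sketch (the label set and prompt wording are illustrative):

```python
LABELS = ["positive", "negative", "neutral"]  # example label set

def classification_prompt(text: str, labels=LABELS) -> str:
    """Constrain the model to answer with exactly one label."""
    return (f"Classify the following text as one of: {', '.join(labels)}.\n"
            f"Reply with the label only.\n\nText: {text}\nLabel:")

def parse_label(response: str, labels=LABELS) -> str:
    """Map a free-form model reply onto the closest known label."""
    reply = response.strip().lower()
    for label in labels:
        if label in reply:
            return label
    return "unknown"

print(parse_label(" Negative.\n"))  # negative
```

The prompt would be sent through the inference API; `parse_label` keeps the pipeline robust to replies like "Negative." or "I'd say negative".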
Private Inference¶
Sensitive data stays on device.
Troubleshooting¶
Ollama not running¶
Start Ollama service separately.
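A quick programmatic check (equivalent to `python manage.py ollama-health`) is to probe Ollama's `/api/version` endpoint on its default port and treat any connection failure as "not running":

```python
import urllib.error
import urllib.request

def ollama_healthy(base_url: str = "http://127.0.0.1:11434") -> bool:
    """Return True if the Ollama server answers on its default port."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=2) as r:
            return r.status == 200
    except (urllib.error.URLError, OSError):
        return False

# A closed port reports unhealthy instead of raising:
print(ollama_healthy("http://127.0.0.1:9"))  # False
```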
Model download fails¶
Check disk space and internet connection.
Slow inference¶
Try smaller model or reduce context window.
Related Realms¶
- herald - Exposes sov-ai tools in marketplace
- 402-payment - Sov-ai tools are free
- All realms - Can use sov-ai inference