Neural Memory Pro¶
Your agent's brain, upgraded. From keyword search to semantic understanding.
Free Neural Memory is a complete, production-ready memory system — you never have to pay. But if your agent's brain is growing past 10K memories and you're noticing missed recalls, slow consolidation, or ballooning storage, Pro is the upgrade path.
One command to activate. No manual migration. No breaking changes.
pip install neural-memory # Pro features included
nmem pro activate YOUR_LICENSE_KEY # activate with your key
All 52 free tools keep working. Your existing memories are preserved. Pro adds 3 new tools and upgrades the engine underneath.
New to Pro? Start with the Pro Quickstart Guide →
The Problem¶
Free Neural Memory stores memories as text and retrieves them by keyword matching (FTS5 BM25). It works great for small brains (<10K memories). But as your agent accumulates knowledge:
- Recall degrades — keyword search misses semantically related memories that use different words
- Consolidation slows — O(N²) pairwise comparison crawls past 1K neurons
- Storage grows unbounded — every neuron keeps full-precision vectors forever
Neural Memory Pro fixes all three.
What Changes¶
Before (Free — SQLite)¶
User: "What did we decide about the auth system?"
Agent: [FTS5 search: "decide" AND "auth" AND "system"]
→ Finds 2 of 7 relevant memories (keyword match only)
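To see why an AND-style keyword query can miss relevant memories, here is a minimal sketch in plain Python. It is a stand-in for the FTS5 query above, not the library's search code: the tokenizer, the sample memories, and the helper names are all hypothetical, and BM25 ranking is left out.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a crude stand-in for a full-text tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_and_match(query_terms: list[str], memories: list[str]) -> list[str]:
    """Return only the memories that contain every query term as a whole token."""
    wanted = {term.lower() for term in query_terms}
    return [m for m in memories if wanted <= tokens(m)]

# Hypothetical memories: all three concern the same decision,
# but only the first one shares the query's exact words.
memories = [
    "We decide to keep the auth system on JWT",
    "Login will use short-lived tokens with refresh",
    "Team agreed that session cookies are out and OAuth is in",
]

hits = keyword_and_match(["decide", "auth", "system"], memories)
print(hits)
```

Only the first memory matches. The other two describe the same topic with different vocabulary, which is the gap semantic search is meant to close.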
After (Pro — InfinityDB)¶
User: "What did we decide about the auth system?"
Agent: [HNSW vector search: semantic embedding of query]
→ Finds 7 of 7 relevant memories (meaning match)
→ Ranked by: similarity × 0.7 + activation × 0.3
The biggest difference isn't speed. It's recall quality. Your agent remembers by meaning, not by words.
Feature Comparison¶
| Feature | Free (SQLite) | Pro (InfinityDB) |
|---|---|---|
| Recall method | Keyword match (FTS5 BM25) | Semantic similarity (HNSW) |
| Search speed | ~500ms at 10K neurons | ~5ms at 10K neurons |
| Search quality | Exact/fuzzy word match | Meaning-based match |
| Scale tested | ~50K neurons | 2M+ neurons |
| Vector storage | Not stored | Persistent mmap on disk |
| Compression | Text-level (sentence trimming) | Vector-level (5-tier adaptive) |
| Consolidation | O(N²) brute-force | O(N×k) HNSW neighbor clustering |
| Graph traversal | SQL JOINs per hop | Native adjacency BFS, <1ms |
| Crash recovery | SQLite WAL | Custom WAL + idempotent replay |
| MCP tools | 52 tools | 52 + 3 Pro-exclusive |
| Storage per 1M neurons | ~5 GB | ~1 GB (with tier compression) |
Pro-Exclusive Features¶
1. Cone Queries — Semantic Recall¶
Instead of matching keywords, Cone Queries search a semantic cone around your query embedding.
nmem_cone_query(
query="authentication architecture decisions",
threshold=0.7, # 0.65 = wide cone, 0.95 = narrow
max_results=10
)
How scoring works:
combined_score = similarity × 0.7 + activation × 0.3
A memory can be semantically relevant (high similarity) even if rarely accessed. A frequently accessed memory (high activation) gets a boost even at moderate similarity. Both signals matter.
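The weighting shown earlier (similarity × 0.7 + activation × 0.3) can be sketched as a small ranking function. This is an illustration only: the `Candidate` class and the sample values are hypothetical, and it assumes both signals are already normalized to the 0 to 1 range.

```python
from dataclasses import dataclass

SIMILARITY_WEIGHT = 0.7  # weights taken from the ranking line in this document
ACTIVATION_WEIGHT = 0.3

@dataclass
class Candidate:
    neuron_id: str
    similarity: float  # assumed to be in [0, 1]
    activation: float  # assumed to be in [0, 1]

def combined_score(c: Candidate) -> float:
    """Blend semantic similarity with access-driven activation."""
    return SIMILARITY_WEIGHT * c.similarity + ACTIVATION_WEIGHT * c.activation

def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Highest combined score first."""
    return sorted(candidates, key=combined_score, reverse=True)

candidates = [
    Candidate("rarely-used-but-on-topic", similarity=0.92, activation=0.10),
    Candidate("popular-moderate-match", similarity=0.75, activation=0.90),
    Candidate("popular-off-topic", similarity=0.40, activation=0.95),
]

ranked = rank(candidates)
for c in ranked:
    print(f"{c.neuron_id}: {combined_score(c):.3f}")
```

With these made-up numbers, the frequently accessed moderate match outranks the rarely used close match, while the off-topic memory stays last despite its high activation.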
Threshold controls precision:
| Threshold | Behavior | Use case |
|---|---|---|
| 0.60–0.70 | Wide cone — more results, broader context | Exploration, brainstorming |
| 0.75–0.85 | Balanced — relevant results | Default recall |
| 0.90–0.95 | Narrow cone — only near-exact matches | Precise lookup |
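Geometrically, a threshold can be read as a cone around the query vector. The sketch below assumes the similarity is cosine similarity, which this page does not state explicitly, and uses a linear scan as a stand-in for the HNSW index. The function names and toy vectors are hypothetical.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cone_filter(query_vec, vectors, threshold, max_results=10):
    """Keep vectors whose similarity to the query is at least `threshold`."""
    scored = [(nid, cosine(query_vec, vec)) for nid, vec in vectors.items()]
    inside = [(nid, sim) for nid, sim in scored if sim >= threshold]
    inside.sort(key=lambda pair: pair[1], reverse=True)
    return inside[:max_results]

query = [1.0, 0.0]
vectors = {
    "near": [0.99, 0.10],    # almost the same direction as the query
    "related": [0.8, 0.6],   # cosine similarity of 0.8
    "far": [0.0, 1.0],       # orthogonal to the query
}

wide = cone_filter(query, vectors, threshold=0.65)
narrow = cone_filter(query, vectors, threshold=0.95)
print([nid for nid, _ in wide], [nid for nid, _ in narrow])
```

Under the cosine assumption, a threshold of 0.7 corresponds to a half-angle of roughly 46 degrees and 0.95 to roughly 18 degrees, which is why raising the threshold narrows the result set.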
2. Directional Compression — Smarter Memory Trimming¶
Free compression cuts sentences by entity density — it doesn't know which direction matters.
Pro uses multi-axis directional compression: it scores each sentence against multiple semantic directions (the memory's own embedding + up to 3 related neuron embeddings), keeping sentences that preserve all relevant directions.
Example:
A memory about "React performance optimization" relates to both React AND performance. Free compression might keep only React-heavy sentences (losing the performance angle). Pro keeps sentences scoring high on both axes.
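One way to picture multi-axis scoring is to rate each sentence by its weakest axis, so that only sentences relevant to every direction survive. This is a guess at the idea, not the Pro implementation: the min-over-axes rule, the `directional_keep` helper, and the two-dimensional toy embeddings are all assumptions made for illustration.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def directional_keep(sentences, sentence_vecs, axes, keep_ratio):
    """Keep the top sentences by worst-axis similarity, in original order."""
    scores = [min(cosine(vec, axis) for axis in axes) for vec in sentence_vecs]
    keep_n = max(1, round(len(sentences) * keep_ratio))
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(best[:keep_n])]

# Toy axes: dimension 0 stands for "React", dimension 1 for "performance".
axes = [[1.0, 0.0], [0.0, 1.0]]
sentences = [
    "React re-renders when props change.",
    "Memoizing React components reduced render time.",
    "Profiling showed the slowest path was layout.",
    "Virtualized React lists keep frame time low.",
]
sentence_vecs = [[1.0, 0.05], [0.7, 0.7], [0.05, 1.0], [0.6, 0.8]]

kept = directional_keep(sentences, sentence_vecs, axes, keep_ratio=0.5)
print(kept)
```

The two sentences that touch both React and performance are kept, while the single-topic sentences are dropped.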
Compression levels:
| Level | Content kept | When applied |
|---|---|---|
| FULL | 100% | Active memories |
| SUMMARY | 66% | Warm tier (7-30 days) |
| ESSENCE | 33% | Cool tier (30-90 days) |
| GHOST | First 5 words | Frozen tier (>90 days) |
3. Smart Merge — Consolidation That Scales¶
Free consolidation compares every pair of neurons → O(N²). At 5K neurons, that's 12.5M comparisons. Slow.
Pro uses HNSW neighbor search to find merge candidates in O(N×k):
- For each neuron, find k=10 nearest HNSW neighbors
- Mutual similarity check: A is near B AND B is near A (threshold: 0.82)
- Cluster mutually similar neurons together
- Rank by priority × activation → highest becomes anchor
- Merge low-ranked into anchor, preserving unique information
nmem_pro_merge(
similarity_threshold=0.85,
max_merges=20,
dry_run=True # Preview before committing
)
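The five steps listed above can be sketched end to end in plain Python. Treat this as an illustration under stated assumptions: a brute-force neighbor search stands in for the HNSW index, union-find is one possible way to cluster mutual pairs, and the helper names and toy data are hypothetical. The actual merging of content is left out.

```python
import math
from collections import defaultdict

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def k_nearest(idx, vectors, k):
    """Brute-force neighbors of one vector (stand-in for an HNSW query)."""
    sims = [(cosine(vectors[idx], v), j) for j, v in enumerate(vectors) if j != idx]
    sims.sort(reverse=True)
    return {j: sim for sim, j in sims[:k]}

def merge_clusters(vectors, k=10, threshold=0.82):
    """Cluster neurons that are mutual neighbors above the threshold."""
    neighbours = [k_nearest(i, vectors, k) for i in range(len(vectors))]
    parent = list(range(len(vectors)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, near in enumerate(neighbours):
        for j, sim in near.items():
            # Mutual check: j is near i AND i is near j.
            if sim >= threshold and i in neighbours[j]:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(vectors)):
        groups[find(i)].append(i)
    return [members for members in groups.values() if len(members) > 1]

def pick_anchor(cluster, priority, activation):
    """The neuron with the highest priority × activation becomes the anchor."""
    return max(cluster, key=lambda i: priority[i] * activation[i])

vectors = [[1.0, 0.0], [0.98, 0.2], [0.0, 1.0], [0.1, 0.99], [-1.0, 0.0]]
priority = [5, 7, 3, 3, 1]
activation = [0.9, 0.2, 0.5, 0.6, 0.1]

clusters = merge_clusters(vectors, k=2, threshold=0.82)
anchors = [pick_anchor(c, priority, activation) for c in clusters]
print(clusters, anchors)
```

In this toy data, neurons 0 and 1 form one cluster and neurons 2 and 3 form another, while neuron 4 has no sufficiently similar neighbor and is left alone.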
Scale comparison:
| Neurons | Free (brute-force) | Pro (HNSW) |
|---|---|---|
| 1,000 | ~2s | ~0.1s |
| 10,000 | ~3 min | ~1s |
| 100,000 | ~5 hours | ~10s |
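The gap in the table follows from how many candidate pairs each approach examines. The sketch below only counts pairs. It does not model the cost of each HNSW query or reproduce the timings above, and the function names are hypothetical.

```python
def brute_force_pairs(n: int) -> int:
    """Number of unordered pairs among n neurons: n(n-1)/2."""
    return n * (n - 1) // 2

def neighbor_checks(n: int, k: int = 10) -> int:
    """Candidate pairs when each neuron is compared with only k neighbors."""
    return n * k

for n in (1_000, 5_000, 10_000, 100_000):
    print(f"{n:>7,} neurons: {brute_force_pairs(n):>13,} pairs "
          f"vs {neighbor_checks(n):>9,} neighbor checks")
```

At 5,000 neurons the brute-force count is 12,497,500, which is the roughly 12.5M figure quoted earlier, against 50,000 neighbor checks with k=10.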
4. Five-Tier Vector Compression — Automatic Lifecycle¶
Memories age. Pro automatically manages their storage footprint:
ACTIVE float32 1,536 bytes/neuron Recently accessed, high priority
↓
WARM float16 768 bytes 7-30 days old (-50%)
↓
COOL int8 384 bytes 30-90 days old (-75%)
↓
FROZEN binary 48 bytes >90 days old (-97%)
↓
CRYSTAL metadata 0 bytes Archived, vector gone (-100%)
Smart rules prevent over-compression:
- Priority ≥ 8 → always ACTIVE (critical memories never compress)
- Recent access → auto-promote back to higher tier
- Ephemeral memories → CRYSTAL immediately (scratch notes don't waste space)
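The age bands and the rules above can be combined into a single tier-selection function. This is a sketch under assumptions: age is measured as days since last access, anything under 7 days counts as ACTIVE, the boundary handling at exactly 30 and 90 days is arbitrary, and the ephemeral rule is checked before the priority rule. None of those details are confirmed by this page.

```python
from enum import Enum

class Tier(Enum):
    ACTIVE = "float32"
    WARM = "float16"
    COOL = "int8"
    FROZEN = "binary"
    CRYSTAL = "metadata"

def choose_tier(days_since_access: float, priority: int, ephemeral: bool = False) -> Tier:
    """Pick a storage tier from age, priority, and the ephemeral flag."""
    if ephemeral:
        return Tier.CRYSTAL   # scratch notes drop their vector immediately
    if priority >= 8:
        return Tier.ACTIVE    # critical memories never compress
    if days_since_access < 7:
        return Tier.ACTIVE    # recent access keeps (or promotes to) the top tier
    if days_since_access <= 30:
        return Tier.WARM
    if days_since_access <= 90:
        return Tier.COOL
    return Tier.FROZEN

print(choose_tier(200, priority=9))               # Tier.ACTIVE
print(choose_tier(45, priority=3))                # Tier.COOL
print(choose_tier(1, priority=3, ephemeral=True)) # Tier.CRYSTAL
```

Because the tier depends on time since last access, touching an old memory moves it back up on the next evaluation, which matches the auto-promote rule.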
Storage impact at scale:
| Brain size | Free (all float32) | Pro (mixed tiers) | Savings |
|---|---|---|---|
| 10K neurons | 15 MB | 5 MB | 67% |
| 100K neurons | 150 MB | 25 MB | 83% |
| 1M neurons | 1.5 GB | 120 MB | 92% |
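The per-neuron byte counts follow from the default 384 dimensions: 32, 16, 8, or 1 bit per component. The sketch below derives them and estimates savings for a tier mix. The mix used here is hypothetical and is not the one behind the table above, and only vector bytes are counted (no metadata, index, or graph storage).

```python
DIMS = 384  # default vector dimensions listed in the specifications
BITS_PER_COMPONENT = {"active": 32, "warm": 16, "cool": 8, "frozen": 1, "crystal": 0}

def bytes_per_vector(tier: str, dims: int = DIMS) -> int:
    """Vector bytes for one neuron in the given tier."""
    return dims * BITS_PER_COMPONENT[tier] // 8

def estimate(counts: dict[str, int]) -> tuple[int, int, float]:
    """Return (all-float32 bytes, tiered bytes, fractional savings)."""
    total_neurons = sum(counts.values())
    baseline = total_neurons * bytes_per_vector("active")
    tiered = sum(n * bytes_per_vector(tier) for tier, n in counts.items())
    return baseline, tiered, 1 - tiered / baseline

# Hypothetical 10K-neuron brain, mostly old memories.
mix = {"active": 1_000, "warm": 2_000, "cool": 3_000, "frozen": 4_000}
baseline, tiered, savings = estimate(mix)
print(f"{baseline:,} B -> {tiered:,} B ({savings:.1%} saved)")
```

The savings depend entirely on how many neurons sit in the colder tiers, so larger and older brains tend to compress better.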
5. InfinityDB Engine — Purpose-Built for Neural Graphs¶
Not a wrapper around SQLite. A custom storage engine designed for one job: neural memory graphs.
Architecture:
brain.inf Header (magic + version + dimensions + count)
brain.vec Memory-mapped vectors (numpy mmap, zero-copy read)
brain.idx HNSW index (hnswlib, M=16, ef_construction=200)
brain.graph Directed synapse edges (msgpack adjacency lists)
brain.meta Neuron metadata (msgpack, O(1) ID lookup)
brain.fibers Fiber collections (msgpack, bidirectional index)
brain.wal Write-ahead log (max 50MB, idempotent replay)
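To illustrate the idea behind the memory-mapped vector file, here is a small sketch using the standard-library `mmap` module in place of a numpy memory map. The layout is an assumption: fixed-width little-endian float32 rows with no header, which may differ from the real brain.vec format. A tiny dimension count keeps the demo readable.

```python
import mmap
import os
import struct
import tempfile

DIMS = 4              # tiny for the demo; the documented default is 384
ROW_BYTES = DIMS * 4  # float32 = 4 bytes per component

def write_vectors(path: str, rows: list[list[float]]) -> None:
    """Write rows back to back as little-endian float32."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f"<{DIMS}f", *row))

def read_vector(path: str, index: int) -> list[float]:
    """Map the file and decode a single row by offset, without reading the rest."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        return list(struct.unpack_from(f"<{DIMS}f", mm, index * ROW_BYTES))

rows = [
    [0.0, 1.0, 2.0, 3.0],
    [0.5, 0.25, 0.125, 4.0],
    [8.0, 16.0, 32.0, 64.0],
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "brain.vec")
    write_vectors(path, rows)
    second = read_vector(path, 1)

print(second)
```

Because rows have a fixed width, locating vector `i` is a multiplication, and the operating system pages in only the part of the file that is touched.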
Why this matters:
- Zero SQL overhead — binary access, no query parsing
- Memory-mapped vectors — OS handles caching, zero-copy reads
- HNSW index — O(log N) approximate nearest neighbor, not O(N) scan
- Crash-safe — WAL with idempotent replay, survives mid-write interruptions
- Batch operations — vectorized bulk insert + search
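Idempotent replay means applying the same log twice leaves the store in the same state as applying it once. The sketch below shows one common way to get that property, using sequence numbers and upserts. It is not the InfinityDB format: JSON lines stand in for the real binary records, and the function names are hypothetical.

```python
import json

def append(wal: list[str], seq: int, neuron_id: str, content: str) -> None:
    """Append one record; each record carries a monotonically increasing seq."""
    wal.append(json.dumps({"seq": seq, "id": neuron_id, "content": content}))

def replay(wal_lines: list[str], store: dict[str, str], applied_seq: int) -> int:
    """Apply records newer than applied_seq; return the new high-water mark."""
    for line in wal_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            break  # torn final record from a mid-write crash: stop here
        if record["seq"] <= applied_seq:
            continue  # already applied, skip
        store[record["id"]] = record["content"]  # upsert, safe to repeat
        applied_seq = record["seq"]
    return applied_seq

wal: list[str] = []
append(wal, 1, "n1", "auth uses JWT")
append(wal, 2, "n2", "tokens expire in 15 min")
append(wal, 3, "n1", "auth uses OAuth")  # later update to the same neuron
wal.append('{"seq": 4, "id": "n')        # simulated interrupted write

store: dict[str, str] = {}
seq = replay(wal, store, applied_seq=0)
snapshot = dict(store)
seq_again = replay(wal, store, applied_seq=seq)  # second replay changes nothing
print(seq, store)
```

The truncated fourth record is ignored, and the second replay leaves both the store and the sequence number unchanged.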
Pro MCP Tools¶
Three new tools added to your agent's toolkit:
nmem_cone_query¶
Semantic search with adjustable precision cone.
Returns: neuron_id, content, similarity, activation, combined_score, type.
nmem_tier_info¶
View and manage storage tier distribution.
{ "action": "stats" }
→ { "tiers": { "active": 150, "warm": 200, "cool": 500, "frozen": 1000 },
"estimated_savings_bytes": 1240000 }
{ "action": "sweep" }
→ Demotes stale memories to lower tiers automatically
nmem_pro_merge¶
Smart consolidation with preview mode.
{
"similarity_threshold": 0.9,
"dry_run": true,
"max_merges": 50
}
→ { "clusters_found": 25, "merge_actions": 18, "details": [...] }
Installation¶
One command to activate. The Pro tools need no config changes.
Pro features are bundled in the main package, with all dependencies included. Just activate your license key. Your existing free MCP tools keep working unchanged, and the three Pro tools appear automatically.
To enable InfinityDB (the semantic search engine), set storage_backend = "infinitydb" in your config.toml. On next startup, existing memories are auto-migrated from SQLite, and both databases coexist.
Downgrade is safe: without an active Pro license, everything reverts to free SQLite. No data is lost, because the SQLite database is preserved alongside the InfinityDB files.
Who Should Upgrade¶
| Use case | Free is enough | Pro is better |
|---|---|---|
| Personal agent, <5K memories | ✅ | |
| Keyword recall is sufficient | ✅ | |
| Single project, light usage | ✅ | |
| Growing brain, >10K memories | | ✅ |
| Need semantic recall (meaning, not words) | | ✅ |
| Storage is a concern | | ✅ |
| Frequent consolidation | | ✅ |
| Team/production deployment | | ✅ |
Pricing¶
Free — $0 forever¶
Everything you have today: 52 MCP tools, SQLite storage, spreading activation, 14 consolidation strategies, FTS5 search, and cloud sync (100 neurons). No features removed, ever.
Pro — $9/month (219,000 VND)¶
All free features plus:
- InfinityDB engine (HNSW vector search)
- Cone Queries (semantic recall)
- Smart Merge (scalable consolidation)
- Directional Compression (multi-axis)
- 5-tier vector compression (auto lifecycle)
- 3 Pro MCP tools
- Unlimited cloud sync
- Priority support
Team — $29/month per seat (719,000 VND)¶
Everything in Pro plus:
- Shared brain hub (team knowledge graph)
- Brain-to-brain sync
- Role-based access control
- Audit log
- Self-hosted option
Payment Methods¶
International¶
Powered by Polar.sh — GitHub-native checkout.
- Credit/debit card (Visa, Mastercard, Amex)
- GitHub Sponsors integration
- Annual billing: 2 months free
Vietnam¶
Powered by Sepay — local payment gateway.
- Bank transfer (QR code — Vietcombank, Techcombank, MB Bank, etc.)
- MoMo, ZaloPay, VNPay
- Annual billing: 2 months free
FAQ¶
Does Pro require an internet connection? No. InfinityDB runs 100% locally. Cloud sync is optional.
Will my existing memories transfer? Yes. After you enable InfinityDB (storage_backend = "infinitydb" in config.toml), SQLite data is automatically migrated on next startup. Both databases coexist, and nothing is deleted.
What happens if I cancel Pro? Your agent falls back to free SQLite storage. InfinityDB files remain on disk (in case you resubscribe). No data loss.
Is Pro open source? Yes. The Pro plugin is open source on GitHub. InfinityDB engine code is included — no black boxes. A license key unlocks Pro features.
Can I self-host the sync hub? Team plan includes a Docker image for self-hosted deployment. Pro plan uses the managed Cloudflare hub.
How do I verify Pro is active?
Technical Specifications¶
| Spec | Value |
|---|---|
| Python | ≥ 3.11 |
| Vector dimensions | 384 (default, configurable) |
| HNSW params | M=16, ef_construction=200 |
| Max WAL size | 50 MB |
| Max BFS traversal | 1,000 nodes |
| Max cone results | 500 |
| Batch insert | Vectorized, atomic rollback |
| File format | Custom binary (.inf, .vec, .idx, .graph, .meta, .fibers, .wal) |
| Crash recovery | Idempotent WAL replay |
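The traversal cap in the table can be pictured as a breadth-first search over adjacency lists that stops once it has visited a fixed number of nodes. This is a generic sketch, not the engine's code: the `bfs` signature, the hop limit, and the sample graph are hypothetical, and only the 1,000-node default comes from the table.

```python
from collections import deque

MAX_NODES = 1_000  # traversal cap from the specifications table

def bfs(adjacency: dict[str, list[str]], start: str, max_hops: int,
        max_nodes: int = MAX_NODES) -> dict[str, int]:
    """Return {node: hop distance} for nodes reached within the limits."""
    seen = {start: 0}
    queue = deque([start])
    while queue and len(seen) < max_nodes:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
                if len(seen) >= max_nodes:
                    break  # node cap reached
    return seen

adjacency = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}

two_hops = bfs(adjacency, "a", max_hops=2)
capped = bfs(adjacency, "a", max_hops=5, max_nodes=3)
print(two_hops, capped)
```

Looking up neighbors in an in-memory adjacency list avoids a query per hop, and the node cap bounds the work on densely connected graphs.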
Neural Memory is MIT licensed. Pro is a paid add-on — the free version is complete and fully functional on its own.