Neural Memory Pro¶

Your agent's brain, upgraded. From keyword search to semantic understanding.

Free Neural Memory is a complete, production-ready memory system — you never have to pay. But if your agent's brain is growing past 10K memories and you're noticing missed recalls, slow consolidation, or ballooning storage, Pro is the upgrade path.

One command. No migration. No breaking changes.

pip install neural-memory              # Pro features included
nmem pro activate YOUR_LICENSE_KEY    # activate with your key

All 52 free tools keep working. Your existing memories are preserved. Pro adds 3 new tools and upgrades the engine underneath.

New to Pro? Start with the Pro Quickstart Guide →

The Problem¶

Free Neural Memory stores memories as text and retrieves them by keyword matching (FTS5 BM25). It works great for small brains (<10K memories). But as your agent accumulates knowledge:

Recall degrades — keyword search misses semantically related memories that use different words
Consolidation slows — O(N²) pairwise comparison crawls past 1K neurons
Storage grows unbounded — every neuron keeps full-precision vectors forever

Neural Memory Pro fixes all three.

What Changes¶

Before (Free — SQLite)¶

User: "What did we decide about the auth system?"
Agent: [FTS5 search: "decide" AND "auth" AND "system"]
       → Finds 2 of 7 relevant memories (keyword match only)

After (Pro — InfinityDB)¶

User: "What did we decide about the auth system?"
Agent: [HNSW vector search: semantic embedding of query]
       → Finds 7 of 7 relevant memories (meaning match)
       → Ranked by: similarity × 0.7 + activation × 0.3

The difference isn't speed. It's recall quality. Your agent remembers by meaning, not by words.

Feature Comparison¶

	Free (SQLite)	Pro (InfinityDB)
Recall method	Keyword match (FTS5 BM25)	Semantic similarity (HNSW)
Search speed	~500ms at 10K neurons	~5ms at 10K neurons
Search quality	Exact/fuzzy word match	Meaning-based match
Scale tested	~50K neurons	2M+ neurons
Vector storage	Not stored	Persistent mmap on disk
Compression	Text-level (sentence trimming)	Vector-level (5-tier adaptive)
Consolidation	O(N²) brute-force	O(N×k) HNSW neighbor clustering
Graph traversal	SQL JOINs per hop	Native adjacency BFS, <1ms
Crash recovery	SQLite WAL	Custom WAL + idempotent replay
MCP tools	55 tools	52 + 3 Pro-exclusive
Storage per 1M neurons	~5 GB	~1 GB (with tier compression)

Pro-Exclusive Features¶

1. Cone Queries — Semantic Recall¶

Instead of matching keywords, Cone Queries search a semantic cone around your query embedding.

nmem_cone_query(
  query="authentication architecture decisions",
  threshold=0.7,    # 0.65 = wide cone, 0.95 = narrow
  max_results=10
)

How scoring works:

combined_score = similarity × 0.7 + activation_level × 0.3

A memory can be semantically relevant (high similarity) even if rarely accessed. A frequently accessed memory (high activation) gets a boost even at moderate similarity. Both signals matter.

Threshold controls precision:

Threshold	Behavior	Use case
0.60–0.70	Wide cone — more results, broader context	Exploration, brainstorming
0.75–0.85	Balanced — relevant results	Default recall
0.90–0.95	Narrow cone — only near-exact matches	Precise lookup

2. Directional Compression — Smarter Memory Trimming¶

Free compression cuts sentences by entity density — it doesn't know which direction matters.

Pro uses multi-axis directional compression: it scores each sentence against multiple semantic directions (the memory's own embedding + up to 3 related neuron embeddings), keeping sentences that preserve all relevant directions.

Example:

A memory about "React performance optimization" relates to both React AND performance. Free compression might keep only React-heavy sentences (losing the performance angle). Pro keeps sentences scoring high on both axes.

Score per sentence = primary_similarity × 0.6 + max(reference_similarities) × 0.4

Compression levels:

Level	Content kept	When applied
FULL	100%	Active memories
SUMMARY	66%	Warm tier (7-30 days)
ESSENCE	33%	Cool tier (30-90 days)
GHOST	First 5 words	Frozen tier (>90 days)

3. Smart Merge — Consolidation That Scales¶

Free consolidation compares every pair of neurons → O(N²). At 5K neurons, that's 12.5M comparisons. Slow.

Pro uses HNSW neighbor search to find merge candidates in O(N×k):

For each neuron, find k=10 nearest HNSW neighbors
Mutual similarity check: A is near B AND B is near A (threshold: 0.82)
Cluster mutually similar neurons together
Rank by priority × activation → highest becomes anchor
Merge low-ranked into anchor, preserving unique information

nmem_pro_merge(
  similarity_threshold=0.85,
  max_merges=20,
  dry_run=true        # Preview before committing
)

Scale comparison:

Neurons	Free (brute-force)	Pro (HNSW)
1,000	~2s	~0.1s
10,000	~3 min	~1s
100,000	~5 hours	~10s

4. Five-Tier Vector Compression — Automatic Lifecycle¶

Memories age. Pro automatically manages their storage footprint:

ACTIVE   float32   1,536 bytes/neuron   Recently accessed, high priority
   ↓
WARM     float16     768 bytes          7-30 days old           (-50%)
   ↓
COOL     int8        384 bytes          30-90 days old          (-75%)
   ↓
FROZEN   binary       48 bytes          >90 days old            (-97%)
   ↓
CRYSTAL  metadata      0 bytes          Archived, vector gone   (-100%)

Smart rules prevent over-compression:

Priority ≥ 8 → always ACTIVE (critical memories never compress)
Recent access → auto-promote back to higher tier
Ephemeral memories → CRYSTAL immediately (scratch notes don't waste space)

Storage impact at scale:

Brain size	Free (all float32)	Pro (mixed tiers)	Savings
10K neurons	15 MB	5 MB	67%
100K neurons	150 MB	25 MB	83%
1M neurons	1.5 GB	120 MB	92%

5. InfinityDB Engine — Purpose-Built for Neural Graphs¶

Not a wrapper around SQLite. A custom storage engine designed for one job: neural memory graphs.

Architecture:

brain.inf      Header (magic + version + dimensions + count)
brain.vec      Memory-mapped vectors (numpy mmap, zero-copy read)
brain.idx      HNSW index (hnswlib, M=16, ef=200)
brain.graph    Directed synapse edges (msgpack adjacency lists)
brain.meta     Neuron metadata (msgpack, O(1) ID lookup)
brain.fibers   Fiber collections (msgpack, bidirectional index)
brain.wal      Write-ahead log (max 50MB, idempotent replay)

Why this matters:

Zero SQL overhead — binary access, no query parsing
Memory-mapped vectors — OS handles caching, zero-copy reads
HNSW index — O(log N) approximate nearest neighbor, not O(N) scan
Crash-safe — WAL with idempotent replay, survives mid-write interruptions
Batch operations — vectorized bulk insert + search

Pro MCP Tools¶

Three new tools added to your agent's toolkit:

`nmem_cone_query`¶

Semantic search with adjustable precision cone.

{
  "query": "what helps with memory retention",
  "threshold": 0.7,
  "max_results": 10
}

Returns: neuron_id, content, similarity, activation, combined_score, type.

`nmem_tier_info`¶

View and manage storage tier distribution.

{ "action": "stats" }
→ { "tiers": { "active": 150, "warm": 200, "cool": 500, "frozen": 1000 },
    "estimated_savings_bytes": 1240000 }

{ "action": "sweep" }
→ Demotes stale memories to lower tiers automatically

`nmem_pro_merge`¶

Smart consolidation with preview mode.

{
  "similarity_threshold": 0.9,
  "dry_run": true,
  "max_merges": 50
}
→ { "clusters_found": 25, "merge_actions": 18, "details": [...] }

Installation¶

One command. No config changes needed.

pip install neural-memory
nmem pro activate YOUR_LICENSE_KEY

Pro features are bundled in the main package — all dependencies included. Just activate your license key. Your existing 56 MCP tools keep working unchanged. Three new tools appear automatically.

To enable InfinityDB (semantic search engine), set storage_backend = "infinitydb" in your config.toml. On next startup, existing memories are auto-migrated from SQLite. Both databases coexist — downgrade is safe.

Downgrade is safe: remove Pro dependencies and everything reverts to free SQLite. No data loss — SQLite database is preserved alongside InfinityDB files.

Who Should Upgrade¶

Use case	Free is enough	Pro is better
Personal agent, <5K memories	✅
Keyword recall is sufficient	✅
Single project, light usage	✅
Growing brain, >10K memories		✅
Need semantic recall (meaning, not words)		✅
Storage is a concern		✅
Frequent consolidation		✅
Team/production deployment		✅

Pricing¶

Free — $0 forever¶

Everything you have today. 56 MCP tools, SQLite storage, spreading activation, 14 consolidation strategies, FTS5 search, cloud sync (100 neurons). No features removed, ever.

Pro — $9/month (219,000 VND)¶

All free features plus:

InfinityDB engine (HNSW vector search)
Cone Queries (semantic recall)
Smart Merge (scalable consolidation)
Directional Compression (multi-axis)
5-tier vector compression (auto lifecycle)
3 Pro MCP tools
Unlimited cloud sync
Priority support

Team — $29/month per seat (719,000 VND)¶

Everything in Pro plus:

Shared brain hub (team knowledge graph)
Brain-to-brain sync
Role-based access control
Audit log
Self-hosted option

Payment Methods¶

International¶

Powered by Polar.sh — GitHub-native checkout.

Credit/debit card (Visa, Mastercard, Amex)
GitHub Sponsors integration
Annual billing: 2 months free

Vietnam¶

Powered by Sepay — local payment gateway.

Bank transfer (QR code — Vietcombank, Techcombank, MB Bank, etc.)
MoMo, ZaloPay, VNPay
Annual billing: 2 months free

FAQ¶

Does Pro require an internet connection? No. InfinityDB runs 100% locally. Cloud sync is optional.

Will my existing memories transfer? Yes. After you enable InfinityDB (storage_backend = "infinitydb" in config.toml), SQLite data is automatically migrated on next startup. Both databases coexist — nothing is deleted.

What happens if I cancel Pro? Your agent falls back to free SQLite storage. InfinityDB files remain on disk (in case you resubscribe). No data loss.

Is Pro open source? Yes. The Pro plugin is open source on GitHub. InfinityDB engine code is included — no black boxes. A license key unlocks Pro features.

Can I self-host the sync hub? Team plan includes a Docker image for self-hosted deployment. Pro plan uses the managed Cloudflare hub.

How do I verify Pro is active?

nmem pro status
# or via MCP:
nmem_tier_info(action="stats")

Technical Specifications¶

Spec	Value
Python	≥ 3.11
Vector dimensions	384 (default, configurable)
HNSW params	M=16, ef_construction=200
Max WAL size	50 MB
Max BFS traversal	1,000 nodes
Max cone results	500
Batch insert	Vectorized, atomic rollback
File format	Custom binary (.inf, .vec, .idx, .graph, .meta, .fibers, .wal)
Crash recovery	Idempotent WAL replay

Neural Memory is MIT licensed. Pro is a paid add-on — the free version is complete and fully functional on its own.