How is CrossTalk different from a tenant_id filter?

Incumbent memory layers scope tenants by a metadata filter on one shared store; a misconfigured or forgotten scope leaves the other tenant's data physically present in the same index. CrossTalk isolates each tenant in its own Postgres schema owned by a per-tenant NOLOGIN role, backstopped by FORCE ROW LEVEL SECURITY and per-tenant AES-256-GCM envelope encryption, so a cross-tenant read raises InsufficientPrivilege at the database — there is no shared index to filter.

What does verify_isolation() prove?

It runs a live adversarial probe between two tenants and returns a compact Ed25519-signed detached-JWS manifest with a conclusive flag. It proves that we ran these probes, that we got these results, and that this key policy denies cross-tenant decrypt. It is not a formal proof that no path can ever leak. You can re-run it against your own two tenants and re-check the signature against our public JWKS.

What is CTRR?

CTRR is Cross-Tenant Recall Rate; 0.0% is the only passing grade. On the shared store reproducing mem0 #3998, CTRR = 100.0% (leak). On hunta.ai's isolated backend, CTRR = 0.0% (pass, signed).

Is any real PHI used?

No. No real PHI is used anywhere on this site or in the benchmark — all markers are synthetic.

ATTESTABLE AGENT MEMORY · OVER MCP

Zero crosstalk
between tenants.

Attestable, per-tenant-isolated agent memory over MCP. Don't trust the filter — run the verifier and check the signed receipt.

Run verify_isolation · Read the CrossTalk scorecard

CH_A · TENANT_A · CH_B · TENANT_B — toggle: shared store (metadata filter) ⇄ hunta isolated

CTRR 0.0% probes 3/3 signed ✓ Ed25519 jwks.json: VERIFIED

CH_B · THE BLEED · dBµV

Every incumbent isolates tenants with a WHERE filter. Filters leak.

mem0, Zep and Supermemory scope tenants by metadata — a tenant_id / user_id / group_id filter on one shared store. When that scope is misconfigured or forgotten, the other tenant's data is still physically present in the same index. It doesn't fail loudly. It bleeds.

CH_A · TENANT — CLINIC SOP AGENT

REPRODUCED: mem0ai/mem0 #3998

One gateway, one shared userId, a healthcare-clinic-SOP agent and a consumer personal-assistant agent behind it. Ask the personal assistant "who do I contact after hours?" and the clinic's PHI marker comes back on the shared store.

⚠ Dr. Alvarez, 555-0142 — PHI marker crossing the seam

On the shared store, CTRR = 100.0%. LEAK.

CH_B · TENANT — PERSONAL ASSISTANT

EXPECTED SCOPE

The PA tenant should only ever see JANE_PREF ("Jane prefers vegan restaurants…"). Instead the dissolved seam lets the clinic's PHI ghost through — the filter looks like it works, and still leaks.

✗ scope violated — foreign PHI recalled

mem0 closed #3998 in PR #4245. The class is architectural, not a one-off — #5439 (cross-scope entity linking) is open and unpatched. A patch can fix one filter; it can never make a filter attestable.
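The failure mode can be reproduced in miniature. The sketch below is a toy in-memory model, not mem0 code: the `SharedStore` class, the `GATEWAY_USER` value, and the keyword matching are all illustrative assumptions. It only shows why a filter on one shared scope key cannot tell two agents apart once they share that key.

```python
from dataclasses import dataclass, field

GATEWAY_USER = "gateway-user"  # hypothetical shared scope key


@dataclass
class SharedStore:
    """Toy stand-in for a shared store scoped only by a metadata filter."""

    rows: list = field(default_factory=list)

    def add(self, text: str, user_id: str, agent: str) -> None:
        self.rows.append({"text": text, "user_id": user_id, "agent": agent})

    def search(self, query: str, user_id: str) -> list:
        # The only "isolation" is this filter. Both agents write under the
        # same user_id, so it cannot separate clinic rows from assistant rows.
        words = set(query.lower().replace("?", "").split())
        return [
            row["text"]
            for row in self.rows
            if row["user_id"] == user_id
            and words & set(row["text"].lower().split())
        ]


store = SharedStore()
store.add("after hours contact Dr. Alvarez, 555-0142", GATEWAY_USER, "clinic_sop")
store.add("Jane prefers vegan restaurants", GATEWAY_USER, "personal_assistant")

# The personal assistant asks its own question and recalls the clinic's marker.
leaked = store.search("who do I contact after hours?", user_id=GATEWAY_USER)
print(leaked)
```

The filter does exactly what it was told to do, which is why the leak is silent: nothing raises, and the foreign row is simply part of the result set.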

CH_A · THE INSTRUMENT · ISOLATION

Isolation is structural, then attestable.

Tenant identity is never a request parameter. It is a signed tid claim inside the OAuth 2.1 access token — no tool accepts a tenant argument, so a caller can only ever touch its own tenant. Under that, three independent enforced layers:

CH_A · TENANT_A · ISOLATED

01 · PHYSICAL

One schema per tenant

Each tenant gets its own Postgres schema (t_<sha256[:16]>) owned by a per-tenant NOLOGIN role. Every op runs under SET ROLE with GRANTs scoped to that schema only. A cross-schema read raises InsufficientPrivilege at the database — there is no shared index to filter.
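As a rough illustration of the naming and provisioning described above, the sketch below derives a `t_<sha256[:16]>` schema name and emits the matching Postgres DDL as strings. What exactly is hashed (here, the UTF-8 tenant id), the hex encoding, and the `role_` prefix are assumptions made for this example, not details confirmed by the page.

```python
import hashlib
import re


def tenant_schema(tid: str) -> str:
    """Derive the schema name as t_<first 16 hex chars of SHA-256(tid)>."""
    return "t_" + hashlib.sha256(tid.encode("utf-8")).hexdigest()[:16]


def provisioning_sql(tid: str) -> list[str]:
    """Return DDL that gives one tenant its own schema and NOLOGIN role."""
    schema = tenant_schema(tid)
    role = f"role_{schema}"  # hypothetical role-naming convention
    # The name is built from hex digits only, so it is safe as an identifier.
    assert re.fullmatch(r"t_[0-9a-f]{16}", schema)
    return [
        f"CREATE ROLE {role} NOLOGIN",
        f"CREATE SCHEMA {schema} AUTHORIZATION {role}",
        f"REVOKE ALL ON SCHEMA {schema} FROM PUBLIC",
    ]


def session_prelude(tid: str) -> str:
    """Statement to run per request, after the token's tid has been verified."""
    # SET ROLE only succeeds if the connecting login role is a member of the
    # tenant role, so membership grants are part of provisioning in practice.
    return f"SET ROLE role_{tenant_schema(tid)}"


statements = provisioning_sql("tenant-a")
for stmt in statements + [session_prelude("tenant-a")]:
    print(stmt + ";")
```

Because the tenant role owns only its own schema, a query that names another tenant's schema fails on privileges before any row is read.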

03 · CRYPTO

Per-tenant envelope encryption

Memory content is AES-256-GCM ciphertext under a per-tenant DEK, with the tenant id as AAD; the DEK is wrapped by a per-tenant KEK in a self-hosted KMS. Even a forced logical bypass returns only ciphertext, which fails GCM authentication and cannot be decrypted without the peer tenant's key. Embeddings stay plaintext inside the tenant's own schema so ANN + BM25 still work.
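A minimal sketch of the tenant-bound encryption step, assuming the third-party `cryptography` package. The `seal`, `open_sealed`, and `can_open` helpers are hypothetical names, and the KEK wrapping in a KMS is left out: the DEKs here are plain local keys so the example stays self-contained.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def seal(dek: bytes, tenant_id: str, plaintext: bytes) -> bytes:
    """Encrypt with AES-256-GCM, binding the ciphertext to tenant_id via AAD."""
    nonce = os.urandom(12)  # 96-bit nonce, must be unique per message
    return nonce + AESGCM(dek).encrypt(nonce, plaintext, tenant_id.encode())


def open_sealed(dek: bytes, tenant_id: str, blob: bytes) -> bytes:
    """Decrypt; raises InvalidTag if the key or the tenant AAD does not match."""
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(dek).decrypt(nonce, ciphertext, tenant_id.encode())


def can_open(dek: bytes, tenant_id: str, blob: bytes) -> bool:
    try:
        open_sealed(dek, tenant_id, blob)
        return True
    except InvalidTag:
        return False


# Stand-ins for two tenants' data-encryption keys.
dek_a = AESGCM.generate_key(bit_length=256)
dek_b = AESGCM.generate_key(bit_length=256)

blob = seal(dek_a, "tenant-a", b"after-hours contact: synthetic marker")

print(can_open(dek_a, "tenant-a", blob))  # own key, own tenant id
print(can_open(dek_b, "tenant-a", blob))  # peer tenant's key
print(can_open(dek_a, "tenant-b", blob))  # blob swapped under another tenant id
```

Only the first call succeeds. Using the tenant id as AAD is what makes a blob swap detectable: the ciphertext is authenticated together with the identity it was written for.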

CH_B · TENANT_B · ISOLATED

02 · LOGICAL

FORCE ROW LEVEL SECURITY

A FORCE RLS policy binds even the table owner to tenant = current_setting('app.tenant'). Defence-in-depth backstop — never the boundary, always the second wall.
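For readers who want to see the shape of such a policy, here is a hedged sketch that emits the Postgres statements as strings. The table name `memories` and the policy name are hypothetical, and the `tenant` column is assumed to be text so it compares directly with `current_setting('app.tenant')`.

```python
def rls_sql(table: str, policy: str = "tenant_isolation") -> list[str]:
    """Return DDL that enables and forces row level security on one table."""
    predicate = "tenant = current_setting('app.tenant')"
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY",
        # FORCE applies the policy to the table owner as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY",
        # USING filters reads; WITH CHECK rejects writes for another tenant.
        f"CREATE POLICY {policy} ON {table} "
        f"USING ({predicate}) WITH CHECK ({predicate})",
    ]


rls_statements = rls_sql("memories")  # hypothetical table name
for stmt in rls_statements:
    print(stmt + ";")
```

The `app.tenant` setting would be set per transaction from the verified token claim, for example with Postgres's `set_config('app.tenant', <tid>, true)`, never from request input.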

00 · IDENTITY

Token-bound tenant, no argument

The tenant is the EdDSA-verified tid claim in an audience-bound OAuth 2.1 access token — sig + aud + iss + exp checked, alg:none and HS/RS confusion rejected. Any body or header tenant field is ignored and logged. There is no WHERE tenant_id to spoof or forget.

AMBER · THE RECEIPT · Ed25519

A proof you can re-run, not a chip you have to trust.

verify_isolation() runs a live adversarial probe between two tenants — writes a secret into tenant A, then as tenant B attempts recall plus a forged body {user_id:A}, and asserts cross_read_results == 0. It returns a compact Ed25519-signed detached-JWS manifest binding {tid → store_id → kms_key_id → schema → policy_hash → probe_result_hash → git_sha → ts}, with a conclusive flag so no probe can false-green on 0/0.

{
  // detached JWS — verify_isolation()
  "tid": "9f2c…a11e",
  "schema": "t_9f2c1b77a3e0d411",
  "kms_key_id": "kek/9f2c…/v3",
  "policy_hash": 0x7a41c9e2,
  "probe_result_hash": 0x0000000000,
  "git_sha": "b41e0c2",
  "probes": ["grant_denied", "rls_hidden", "decrypt_blocked"],
  "cross_read_results": 0,
  "conclusive": true,
  "sig": "eyJhbGciOiJFZERTQSJ9..3xQ_Zk8"
}

Press Verify. The stamp live-checks the signature against /.well-known/jwks.json and flips from grey 'unverified' to amber 'VERIFIED ✓'. Edit one byte of the manifest and it goes grey again.
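What that check amounts to can be sketched offline. The example below assumes the third-party `cryptography` package and generates its own Ed25519 key as a stand-in for the service's signing key and JWKS entry. The canonical JSON form used for the detached payload (sorted keys, no whitespace) is an assumption; the real manifest encoding is not specified on this page.

```python
import base64
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat


def b64url(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def canonical(manifest: dict) -> bytes:
    # Assumed canonical form: sorted keys, no extra whitespace.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()


def sign_detached(key: Ed25519PrivateKey, manifest: dict) -> str:
    header = b64url(b'{"alg":"EdDSA"}')
    signing_input = f"{header}.{b64url(canonical(manifest))}".encode()
    # Detached form: the payload segment between the two dots is left empty.
    return f"{header}..{b64url(key.sign(signing_input))}"


def verify_detached(jwk: dict, manifest: dict, jws: str) -> bool:
    header, payload, sig = jws.split(".")
    if payload or json.loads(b64url_decode(header)).get("alg") != "EdDSA":
        return False
    public = Ed25519PublicKey.from_public_bytes(b64url_decode(jwk["x"]))
    signing_input = f"{header}.{b64url(canonical(manifest))}".encode()
    try:
        public.verify(b64url_decode(sig), signing_input)
        return True
    except InvalidSignature:
        return False


# Stand-in for the service's signing key and one entry of its JWKS.
key = Ed25519PrivateKey.generate()
jwk = {
    "kty": "OKP",
    "crv": "Ed25519",
    "x": b64url(key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)),
}

manifest = {"cross_read_results": 0, "conclusive": True, "git_sha": "b41e0c2"}
jws = sign_detached(key, manifest)

print(verify_detached(jwk, manifest, jws))  # untouched manifest
tampered = dict(manifest, cross_read_results=1)
print(verify_detached(jwk, tampered, jws))  # one field edited
```

Because the payload is detached, the verifier rebuilds the signing input from the manifest it was handed, so any edit to the manifest changes the bytes being checked and the signature no longer matches.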

The strength is reproducibility, not our word: run verify_isolation against your OWN two tenants and hand the signed manifest to your auditor.

DEMO — this widget animates the JWKS check deterministically for illustration; the live endpoint is mcp.hunta.ai/.well-known/jwks.json.

OPEN SOURCE · APACHE-2.0 · hunta-ai/crosstalk

We red-teamed 3 agent-memory systems for cross-tenant leaks. Two failed the same way.

CrossTalk is a mechanism-agnostic, MCP-native benchmark harness where ONE parametrized test file turns RED on the shared store and GREEN on hunta-isolated from the same driver. Hermetic CI — pinned mem0 with infer=False, in-memory Qdrant, one deterministic offline embedder for every backend — so the only variable is isolation architecture. No network, no keys, no LLM.

mem0 · shared user_id (#3998)

CTRR 100.0%

LEAK — attest() == None

hunta.ai · schema-per-tenant + envelope crypto

CTRR 0.0%

PASS — signed ✓ Ed25519

✗✓ test_no_cross_tenant_phi_bleed — write CLINIC_SOP under one tenant, JANE_PREF under another; the PA query must return no 'Dr. Alvarez' bleed.

✗✓ test_scope_dump_is_single_tenant — get_all(user_id=GATEWAY_USER) returns BOTH tenants on the shared store; deterministic, ranking-free.

✗✓ test_agent_id_scope_does_not_rescue_mem0 — guards the 'just use agent_id' rebuttal (metadata-only, unindexed; ref #3773).

✗✓ test_isolation_is_attestable — mem0 attest() == None ⇒ FAIL; hunta returns a verifiable signed manifest ⇒ PASS. This is the whole wedge.

Method. CTRR = Cross-Tenant Recall Rate; 0.0% is the only passing grade, reported in two columns (correct-usage AND the #3998 misconfig — no strawmanning). Membership-inference AUC (0.5 = no leak, 1.0 = full leak) tracks the #5439 entity-merge ranking side-channel. This paragraph describes the benchmark's METHOD; it does not report measured results.
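One way to read the metric, sketched under assumptions: CTRR is taken here as the percentage of attempted cross-tenant probes that recalled the other tenant's marker, and a run with zero attempts is treated as inconclusive rather than as a pass. The `ProbeRun` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProbeRun:
    attempted: int  # cross-tenant reads actually attempted
    recalled: int   # attempts that returned the other tenant's marker

    @property
    def conclusive(self) -> bool:
        # A run that attempted nothing proves nothing, so 0/0 is never green.
        return self.attempted > 0

    @property
    def ctrr(self) -> Optional[float]:
        """Cross-Tenant Recall Rate in percent, or None if inconclusive."""
        if not self.conclusive:
            return None
        return 100.0 * self.recalled / self.attempted

    @property
    def passed(self) -> bool:
        return self.conclusive and self.recalled == 0


shared = ProbeRun(attempted=3, recalled=3)
isolated = ProbeRun(attempted=3, recalled=0)
empty = ProbeRun(attempted=0, recalled=0)

for name, run in [("shared", shared), ("isolated", isolated), ("empty", empty)]:
    print(name, run.ctrr, run.passed)
```

Keeping `conclusive` separate from the rate is the design point: a harness that silently skipped its probes would otherwise report the same 0 as a harness that ran them and found nothing.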

Fairness is the moat. Pre-registered methodology, dual judges (Claude + GPT), published raw transcripts, correct-usage vs misconfig columns, steelmanned incumbents, only synthetic PHI. We invite their PRs.

SIDE BY SIDE · MECHANISM VS PROOF

hunta.ai vs the shared-store field.

	hunta.ai	mem0	Cognee	Supermemory
Tenant isolation mechanism	Schema-per-tenant + role GRANTs + FORCE RLS + per-tenant AES-256-GCM	Metadata scope (user_id filter, shared store)	Metadata / tenant scope on shared graph	Metadata scope (shared store)
Tenant identity source	Signed tid claim in OAuth 2.1 token — no tenant argument	Request parameter (spoofable)	Request parameter	Request parameter
Structural vs filter	Structural — no shared index to filter	Filter	Filter	Filter
Machine-checkable isolation proof	✓ Ed25519 signed attestation, re-runnable vs public JWKS	None (attest() == None)	None	None
Documented cross-tenant leak	0.0% CTRR in CrossTalk	#3998 (100% CTRR), #5439 open	Not benchmarked here	Not benchmarked here
Recall engine	Forked Graphiti (parity target, no superiority claim)	Own retrieval	Graph + vector	Own vector graph

We claim recall PARITY with the un-isolated fork (|ΔJ| ≤ 1.0 LLM-judge), not recall superiority. Incumbent mechanisms are quoted from their own public docs; 'Not benchmarked here' means we have not run it, not that it passes.

CH_A · WHY IT HOLDS · SIGNAL INTEGRITY

Six properties, each machine-checkable.

Tenant is a token claim, not a parameter

Identity is a signed tid inside an audience-bound OAuth 2.1 access token — EdDSA verified for sig + aud + iss + exp. Any body or header tenant field is ignored and logged. There is no WHERE tenant_id to spoof or forget.

Physically separate, not filtered

Per-tenant Postgres schema, own tables, own HNSW index, own tsvector, own NOLOGIN role. Cross-tenant retrieval cannot even be addressed: it raises at the database instead of returning the wrong rows.

Encrypted at rest, bound to the tenant

Content is AES-256-GCM ciphertext with the tenant id as AAD, keyed per-tenant. Blob-swap is blocked; a cross-tenant Decrypt returns AccessDenied and lands in the KMS audit log.

A verifier your auditor can run

verify_isolation() returns a live, signed, conclusive fault-injection proof. Re-run it against your own two tenants; re-check the signature against our published JWKS. Reproducibility over trust.

Honest about recall

Recall is a thin fork of Graphiti (Apache-2.0, BM25 + vector + graph + RRF, no query-time LLM). We target parity with the un-isolated fork — we fork recall, we do not claim to have solved it.

Fails closed

The server refuses to bind without env-provided auth / KEK / attest keys unless CROSSTALK_DEV=1 is set explicitly. Every read and write is logged — know what, who, and when.

SETUP · MCP + curl

Connect over MCP. Verify in one call.

Four MCP tools: remember, recall, verify_isolation, whoami — none takes a tenant argument. Apache-2.0. Live at mcp.hunta.ai.

1 · Discover

curl https://mcp.hunta.ai/.well-known/oauth-protected-resource

2 · Recall (tenant = your token's tid)

curl -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"tool":"recall","query":"after-hours contact"}' \
  https://mcp.hunta.ai/mcp

3 · Prove it

curl -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"tool":"verify_isolation"}' https://mcp.hunta.ai/mcp
# → { attempted_cross_read:true, results:0, jws:"eyJ…", conclusive:true }

4 · Re-run the benchmark yourself

pip install crosstalk-bench && crosstalk run --your-config
# RED on shared store, GREEN on isolated — same driver

TOKEN-METERED · tiktoken cl100k_base

Priced by the tokens your memory processes. Reproducible by you.

Free

$0/mo

1M store + 250k recall tokens/mo
1 tenant
verify_isolation attestation included
MCP + CrossTalk harness

Get a token

Starter

$19/mo

10M store + 2.5M recall tokens
Overage $2.00/1M store · $0.50/1M recall
Email support
OTEL export

Start

Attestation included

Pro

$79/mo

50M store + 15M recall tokens
Signed monthly isolation attestation
The deliverable a filter can't match
Priority support

Go Pro

Scale

$249/mo

200M store + 60M recall tokens
Multi-tenant fleet attestation
Audit-log retention
SLA

Scale up

Enterprise

Custom

BYOC / dedicated
SOC 2 / HIPAA path
1-yr attestation retention
Forward-deployed onboarding

Talk to us

Metering is best-effort and never on the critical path — Lago downtime never delays or fails remember/recall. Counts use tiktoken cl100k_base so you can reproduce your bill; verify_isolation and whoami are unmetered.
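To make the Starter arithmetic concrete, here is a small sketch using the prices listed above. It assumes overage is billed pro rata per token beyond the allowance (not per started million) and rounded to the cent; the page does not state either rule, and the `monthly_bill` helper is hypothetical.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# Starter tier figures as listed in the pricing section.
STARTER = {
    "base": Decimal("19.00"),
    "store_included": 10_000_000,
    "recall_included": 2_500_000,
    "store_overage_per_m": Decimal("2.00"),
    "recall_overage_per_m": Decimal("0.50"),
}


def monthly_bill(plan: dict, store_tokens: int, recall_tokens: int) -> Decimal:
    """Base price plus pro-rata overage for tokens beyond the allowance."""
    extra_store = max(0, store_tokens - plan["store_included"])
    extra_recall = max(0, recall_tokens - plan["recall_included"])
    overage = (
        extra_store * plan["store_overage_per_m"]
        + extra_recall * plan["recall_overage_per_m"]
    ) / MILLION
    return (plan["base"] + overage).quantize(Decimal("0.01"))


# 12M store and 3M recall tokens: 2M and 0.5M over the allowance.
print(monthly_bill(STARTER, 12_000_000, 3_000_000))
# Usage inside the allowance is just the base price.
print(monthly_bill(STARTER, 4_000_000, 1_000_000))
```

With those inputs the overage is 2 x $2.00 for store plus 0.5 x $0.50 for recall, so the two calls print 23.25 and 19.00.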

STATED UP FRONT · CALIBRATION

What the signed receipt does — and does not — prove.

A signed attestation proves 'we ran THESE probes and got THESE results, and this key policy denies cross-tenant decrypt.' It is not a formal proof that no path can ever leak.
We never market 'mathematically proven leak-free' beyond the specific crypto claim.
Plaintext lives in application RAM at query time — only a TEE removes that (Phase-2).
Embeddings are plaintext for ANN, mitigated by physical per-tenant schema separation.
Self-attestation is not a third-party audit. Lead with the runnable verifier — check it yourself.

Run the verifier against your own two tenants.

If the red doesn't collapse to a flat mint line and the signature doesn't check against our JWKS, don't believe us.

Run verify_isolation · Clone hunta-ai/crosstalk
