Protocol Deep-Dive: Scopes, Purpose Binding, and the Audit Log
Published 21 April 2026 · 12 min read
Why four layers, not one
The temptation in personal-context systems is to do one thing well: either encryption-at-rest, or scoped OAuth, or differential privacy. Each is necessary and none is sufficient. A vault that encrypts perfectly but returns too much data in queries has a soft side. A vault with rigorous scopes but no audit log cannot prove non-abuse after the fact. GeraMind layers all four deliberately, and every user query touches all of them.
Layer 1: the scope vocabulary
Scopes are the nouns of the vault. They describe what kind of fact is being requested. We use a two-level tree:
health/
    conditions, medications, allergies, vitals, family-history
finance/
    income-band, spending-categories, subscriptions, credit, debts
family/
    members, relationships, birthdays, legal-status
work/
    employer, role, projects, colleagues, calendar
travel/
    itineraries, preferences, documents, loyalty
preferences/
    dietary, communication, language, accessibility
documents/
    passport, nid, utility-bills, contracts, tax-records
journal/
    notes, reflections, mood
Scopes are not free-form strings. Every ingestion tags data against this vocabulary; every query requests a specific scope (or scopes). The vocabulary is versioned and evolves slowly: adding a scope is a minor-version change; removing one is a major-version change with migration notes.
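As a minimal sketch of what "not free-form" could mean in practice, the check below accepts only exact category/leaf pairs from the tree above. The names `SCOPE_VOCABULARY` and `is_valid_scope` are hypothetical and are not taken from the GeraMind spec.

```python
# Hypothetical in-memory copy of the two-level scope vocabulary listed above.
SCOPE_VOCABULARY = {
    "health": {"conditions", "medications", "allergies", "vitals", "family-history"},
    "finance": {"income-band", "spending-categories", "subscriptions", "credit", "debts"},
    "family": {"members", "relationships", "birthdays", "legal-status"},
    "work": {"employer", "role", "projects", "colleagues", "calendar"},
    "travel": {"itineraries", "preferences", "documents", "loyalty"},
    "preferences": {"dietary", "communication", "language", "accessibility"},
    "documents": {"passport", "nid", "utility-bills", "contracts", "tax-records"},
    "journal": {"notes", "reflections", "mood"},
}


def is_valid_scope(scope: str) -> bool:
    """Return True only for an exact 'category/leaf' pair in the vocabulary."""
    category, sep, leaf = scope.partition("/")
    # A missing "/" or an unknown category or leaf is rejected outright.
    return bool(sep) and leaf in SCOPE_VOCABULARY.get(category, set())
```

Under this sketch, `is_valid_scope("health/allergies")` is accepted, while a bare category such as `"health"` or a free-form string is rejected.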
Why not free-form scopes?
Free-form scopes (“agent:read anything about my health week”) are un-auditable in the general case. A fixed vocabulary trades expressiveness for verifiability. If an app wants to ask a question that does not map cleanly, the answer is to add a scope, not to sidestep the vocabulary.
Layer 2: purpose-bound tokens
Every query carries a token that binds the allowed scope set to a stated purpose. A purpose is a short, structured reason the query is being made:
{
    "sub": "user:[email protected]",
    "aud": "app:com.geraeats",
    "scopes": ["preferences/dietary", "health/allergies"],
    "purpose": "restaurant_booking",
    "purpose_detail": "Selecting dishes compatible with shellfish allergy.",
    "exp": 1745263200
}
The purpose is visible to the user when they approve the token and travels with the query. The vault checks that the requested scope set is appropriate for the purpose; for example, a “restaurant_booking” purpose cannot request finance/credit. This check uses a policy table maintained in the open and versioned independently.
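The purpose/scope compatibility check might look roughly like the sketch below. The table contents are illustrative only (the real policy table is not reproduced in this post), and `PURPOSE_POLICY` and `disallowed_scopes` are hypothetical names.

```python
# Hypothetical policy table: purpose -> scopes that purpose may request.
# Only the "restaurant_booking" example from this post is modelled here.
PURPOSE_POLICY = {
    "restaurant_booking": frozenset({"preferences/dietary", "health/allergies"}),
}


def disallowed_scopes(token: dict) -> set:
    """Return the requested scopes that the stated purpose does not permit."""
    # An unknown purpose permits nothing, so every requested scope is flagged.
    allowed = PURPOSE_POLICY.get(token.get("purpose"), frozenset())
    return set(token.get("scopes", [])) - allowed


token = {
    "aud": "app:com.geraeats",
    "scopes": ["preferences/dietary", "finance/credit"],
    "purpose": "restaurant_booking",
}
rejected = disallowed_scopes(token)  # {"finance/credit"}
```

In this sketch an empty result means the token passes the check. Defaulting unknown purposes to "nothing allowed" is a design assumption, chosen so that a missing policy entry fails closed.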
Why bind to purpose?
GDPR purpose limitation is not a nice-to-have; it is a legal principle. An app that obtains data for one purpose cannot lawfully repurpose it for another without a new consent. Making purpose a first-class field in every query forces the architecture to honour this: the audit log records the purpose, and a misaligned purpose becomes a correctable bug, not a quiet data leak.
Layer 3: minimisation
The vault rarely returns rows. It returns the smallest useful answer. An agent asking “is Ruth allergic to shellfish?” gets true, false, or unknown, not her full allergy list. An agent asking “can Ruth afford £200 this month?” gets a boolean, not her bank balance. An agent asking “what are Ruth’s dietary preferences?” gets a short canonicalised list like ["vegetarian", "no-alcohol"], not her 4-year journal.
The minimisation engine is a set of query-shaped functions rather than a generic SQL surface. We deliberately refuse to expose a generic query API because the minimisation guarantees rely on the narrow shape of the allowed questions.
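A query-shaped function of this kind could be sketched as follows. The storage layout, the sample rows, and the name `allergy_contains` are assumptions for illustration; `None` stands in for the "unknown" answer.

```python
from typing import Optional

# Hypothetical stored rows (illustrative data): user -> recorded allergens.
# A user with no entry means the vault holds no allergy data for them.
_ALLERGY_ROWS = {"ruth": {"shellfish", "peanuts"}}


def allergy_contains(user: str, allergen: str) -> Optional[bool]:
    """Answer one narrow question without ever returning the allergy list."""
    rows = _ALLERGY_ROWS.get(user)
    if rows is None:
        return None  # "unknown": nothing recorded for this user
    return allergen.lower() in rows
```

The caller learns a single bit about shellfish and nothing about the other recorded allergen, which is the property a generic query surface could not guarantee.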
What about cases where minimisation is not possible?
Some queries require raw content — “read me my recent journal entries.” These require a higher-friction consent step (face/biometric/hardware key), are narrowly time-bounded, and are always logged in the audit layer with a visible “raw content released” marker that the user can inspect.
Layer 4: the audit log
Every query, every token issuance, every minimisation output, every raw-content release goes into an append-only log. The log is hashed chain-wise (each entry contains the hash of the previous) so tampering is detectable. Users can:
- Browse the log from the dashboard.
- Export the log (JSON) for legal or personal use — this is part of the GDPR Art. 15 export.
- Receive a weekly digest of activity with unusual-pattern flags (e.g. a new app queried 47 times in one day).
- Revoke tokens or whole apps from the log view with a single action.
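The chain-wise hashing described above can be sketched in a few lines. This is a minimal illustration, assuming SHA-256 and a JSON-serialised entry body; the post does not name the hash function or entry format, and `append_entry` and `verify_chain` are hypothetical names.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder prev_hash for the first entry


def _digest(body: dict) -> str:
    """Hash a canonical (sorted-key) JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, message: str) -> dict:
    """Append an entry whose hash covers its message and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {"message": message, "prev_hash": prev_hash}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash and link; any edited entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {"message": entry["message"], "prev_hash": entry["prev_hash"]}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing any earlier entry changes its recomputed hash, so `verify_chain` returns `False` from that point on. Note that a chain alone does not stop someone who can rewrite the whole log; detecting that would need the latest hash to be anchored somewhere outside the log, which this sketch does not cover.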
Why not a generic access log?
Access logs are usually ops tools, not user tools. The audit log in GeraMind is designed for the user as the primary consumer. Every entry is a sentence in English (plus structured JSON for export), and the visual presentation is built around action (revoke, inspect, complain).
How the four layers interact on one query
An agent asks “can Ruth eat shellfish?” on behalf of GeraEats:
- Agent presents a token with scopes=["health/allergies"], purpose="restaurant_booking".
- Vault verifies the token signature and checks the purpose/scope compatibility policy.
- Vault routes to the allergy.contains minimisation function, which returns true.
- Audit log appends an entry: “com.geraeats asked about shellfish allergy for restaurant_booking; returned boolean true; token exp 2026-04-21T13:00Z.”
No raw row was exposed. The agent got the smallest useful answer. The user can see exactly what happened on their dashboard and revoke if they want to.
What we are still working on
Three live design questions for v0.2:
- Cross-scope inference attacks: can an adversary combine many small answers to reconstruct a row? Mitigation research is ongoing; the early answer is a query budget per app, per scope.
- Derived data: if an agent computes something from a minimised answer and stores it, what obligations attach?
- Multi-vault federation: for users whose data lives in competing vaults (e.g. Apple, Google, Gera), how does purpose binding compose?
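To make the first question concrete, a per-app, per-scope query budget could be sketched like this. It is purely illustrative: the class name, the fixed limit, and the absence of any time window or reset are assumptions, since the design is still open.

```python
from collections import defaultdict


class QueryBudget:
    """Hypothetical counter that caps queries for each (app, scope) pair."""

    def __init__(self, limit: int):
        self.limit = limit
        self._used = defaultdict(int)  # (app, scope) -> queries spent so far

    def try_spend(self, app: str, scope: str) -> bool:
        """Consume one unit of budget, or refuse once the cap is reached."""
        key = (app, scope)
        if self._used[key] >= self.limit:
            return False
        self._used[key] += 1
        return True
```

Each pair is counted separately, so an app that exhausts its budget on one scope can still query another. A real mitigation would also need to decide when budgets reset and how to treat correlated scopes.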
Where this lives
The draft spec is at geramind.com/spec. We publish changes early and iterate in public. Cross-product integrations are being designed alongside GeraNexus (the transactional-commerce layer) and GeraClinic (the first clinical health use case).