# Thinkn — Belief State Infrastructure for AI Agents > Full doc bundle for AI ingestion. For per-page markdown, fetch https://thinkn.ai/dev/.md. > Live HTML docs: https://thinkn.ai/dev > Index with section links: https://thinkn.ai/llms.txt --- # Quick Reference (curated) # beliefs SDK — LLM Reference > Belief state infrastructure for AI agents. npm package: `beliefs` > Docs: https://thinkn.ai/dev > Hack Guide: https://thinkn.ai/dev/tutorial/hack-guide > Manifesto: https://thinkn.ai/dev/why/index This file is a single-read reference for coding agents. It explains why belief state exists, how it differs from memory/RAG, the mental model you need to use the SDK well, the full API surface, and how to wire it into common agent frameworks. ## Why beliefs matter ### Every breakthrough begins as a belief Progress does not begin with certainty. It begins with a view of reality that is incomplete, testable, and worth pursuing. But the current AI stack is not built for that. It stores documents, retrieves memories, and generates language. It does not maintain a living model of what is currently believed to be true, why it is believed, how strongly, what evidence supports it, where it conflicts, and what should change next. The `beliefs` SDK is that missing layer. It gives your agent a structured model of its current understanding: claims with confidence scores, conflict detection, gap awareness, a clarity score (0-1 readiness to act), and ranked next actions by expected information gain. Every transition is recorded with provenance. ### Scaling the past vs. discovering the unknown ``` ┌──────────────────────────────────────────────────────────────┐ │ SCALING THE PAST │ │ │ │ Memory ──────▶ "What happened before?" │ │ Retrieval ───▶ "What text is similar?" │ │ Generation ──▶ "What can I produce from this?" │ │ │ │ These scale what is already known. │ │ They do not model what is currently believed. │ │ They cannot surface what has never been seen. │ │ │ ├──────────────────────────────────────────────────────────────┤ │ DISCOVERING THE UNKNOWN │ │ │ │ Beliefs ─────▶ "What is true? How strongly? Why?" │ │ Evidence ────▶ "What supports or contradicts it?" │ │ Gaps ────────▶ "What do we not know?" │ │ Clarity ─────▶ "Are we ready to act or must we look │ │ deeper?" │ │ │ │ These model the present understanding of reality. │ │ They evolve as evidence changes. │ │ They surface what was previously invisible. │ └──────────────────────────────────────────────────────────────┘ ``` Without an explicit way to model and update beliefs, AI does not scale truth. It scales whatever assumptions it happened to start with. It scales inherited frames. It scales contradiction. It scales drift. As intelligence becomes abundant, coherence becomes scarce — and some beliefs are load-bearing. If the pricing model, the fundraising plan, and the diagnostic recommendation all rest on one unexamined assumption, the cost of that assumption being wrong is everything built on top of it. ### The five symptoms of drift These are what drift looks like in practice — the visible failures of systems that accumulate information without modeling what they believe: 1. **Agents contradict themselves.** Turn 3: "The market is $4.2B." Turn 12: "SEC filings suggest $3.8B." Turn 18: the agent cites $4.2B because it appeared first. No detection, no resolution, no awareness. 2. **Confidence is invisible.** The agent stated a number. Is it from one source or ten? Is it corroborated or contested? The context window does not encode this. 
Every piece of text looks equally valid. 3. **Guesses and facts are indistinguishable.** A user's intuition and a peer-reviewed study carry identical weight. There is no distinction between assumption and evidence. 4. **Agents do not know what they do not know.** No concept of "gap." No awareness that critical data is missing. No mechanism to prioritize what would reduce the most uncertainty. 5. **Bigger context makes it worse.** A 200K context window does not fix these problems. It carries stale assumptions further, with more fluency. More context is more surface area for drift. Belief state infrastructure is how we fix this. A shared layer where assumptions, evidence, confidence, contradictions, and decisions stay in sync, so humans and AI can think more clearly, adapt more honestly, and push together toward what has not yet been seen. ## Mental model ### The core loop Every agent turn follows three steps: read state, act, observe. ``` ┌──────────────────┐ user input ───▶│ beliefs.before() │─── returns current state, └────────┬─────────┘ clarity, gaps, moves, │ and a prompt to inject ▼ ┌──────────────────┐ │ your agent │─── runs with belief context └────────┬─────────┘ in its system prompt │ ▼ ┌──────────────────┐ │ beliefs.after() │─── extracts, fuses, └──────────────────┘ returns delta + new state ``` The SDK wraps your loop. It does not own it. It does not replace your agent framework, does not decide what your agent does, does not require a specific LLM provider, and does not sit in the critical path of your LLM calls. ### Belief types A belief is a structured assertion your agent holds about the world. Every belief has a type: | Type | Use case | |------|----------| | `claim` | An assertion supported or refuted by evidence | | `assumption` | Something taken as true without direct evidence | | `risk` | A potential negative outcome | | `evidence` | A data point or source that supports/refutes other beliefs | | `gap` | Something the agent has not investigated yet | | `goal` | What the agent is pursuing | Types are assigned automatically during extraction. You can also specify a type when adding beliefs manually via `beliefs.add(text, { type: 'assumption' })`. ### Evidence hierarchy Different evidence types carry different weight. A single verified measurement shifts confidence more than several inferences. | Type | Weight | Description | |------|--------|-------------| | `measurement` | highest | Audited metric, verified data point | | `citation` | high | Research report, external source with provenance | | `user-assertion` | medium-high | User explicitly stated this | | `expert-judgment` | medium | Expert opinion with rationale | | `inference` | low-medium | Agent-derived inference from available data | | `assumption` | lowest | Explicit assumption, no supporting evidence | Every piece of evidence has a direction: **supports** (increases confidence), **refutes** (decreases confidence), or **neutral** (adds information weight without shifting direction). Refuting evidence is captured, not discarded — nothing is silently dropped. ### Two-channel clarity Clarity is a 0-1 score that answers one question: does the agent understand enough to move forward? It decomposes into four channels, exposed on `BeliefContext.channels` and `BeliefDelta.channels`: ``` ┌──────────────────────────────────────────────────────────────┐ │ THE TWO QUESTIONS │ │ │ │ 1. DECISION RESOLUTION: "Can we make a call?" 
│ │ ───────────────────────────────────────── │ │ 80% → Yes, lean toward it │ │ 50% → No, it is ambiguous │ │ 99% → Strong signal │ │ │ │ 2. KNOWLEDGE CERTAINTY: "Have we done the work?" │ │ ───────────────────────────────────────── │ │ Just stated → No evidence yet │ │ 10 data points → Some certainty │ │ 100 data points → High certainty in our assessment │ │ │ └──────────────────────────────────────────────────────────────┘ ``` Two claims at 50% confidence are not the same. One has zero evidence (research it). The other has 40 data points that genuinely split both ways (decide, don't research). The two-channel model separates them. **Knowing you do not know is categorically different from not knowing.** The four quadrants: ``` Knowledge Certainty Low High ┌────────────┬────────────────┐ High │ │ │ Decision │ Belief │ Validated │ Resolution │ without │ belief. │ │ evidence. │ Ready to act. │ │ ▶ Invest- │ │ │ igate. │ │ ├────────────┼────────────────┤ Low │ │ │ Decision │ No idea. │ Genuinely │ Resolution │ Start │ uncertain. │ │ from │ Surface │ │ scratch. │ trade-offs. │ │ │ ▶ Decide, │ │ │ don't │ │ │ research. │ └────────────┴────────────────┘ ``` The other two channels: **coherence** (do the beliefs hang together, or are there unresolved contradictions?) and **coverage** (are important areas addressed, or are there large gaps?). Open gaps reduce clarity. Gaps with more downstream dependencies reduce it more. ### Fusion, not averaging When multiple agents share a namespace (`new Beliefs({ agent, namespace })`), their deltas merge into one world state. Conflicts are detected, resolved by trust weight, and kept visible in the trace. A measurement from an SEC filing outweighs an inference from an agent — but the contradiction is never silently dropped. Last-write-wins is not fusion. Averaging confidences is not fusion. Fusion is trust-weighted Bayesian merging with a visible conflict log. ### How knowledge certainty accumulates When you seed beliefs with `add()`, knowledge certainty starts at zero. `add('Market is $4.2B', { confidence: 0.8 })` sets decision resolution to 0.8, but the system has not seen evidence yet. Knowledge certainty tracks *earned evidence* — data accumulated since the belief was created. It grows when `after()` processes real agent output that references the claim, when multiple observations reinforce it, or when tool results provide independent confirmation. To build KC quickly, prefer `after()` on real output over seeding with `add()`. ## When to use beliefs Use the SDK when: - Your agent runs more than a few turns on the same topic. - Conflicting information from different sources matters to the outcome. - You need to trace why the agent believes something (compliance, debugging, audit). - Multiple agents share state and you need trust-weighted merging. - Your agent's readiness to act is decision-relevant (research more, or proceed?). Do not use it when: - The task is a single-turn chatbot reply with no persistent state. - You just need retrieval over documents — use a vector store. - You want to micromanage the fusion math — the SDK deliberately hides it. ### Anti-patterns - **Do not call `after()` per stream chunk.** Call it once per turn, after the model finishes. `after()` runs extraction and fusion; per-chunk calls are wasteful and produce inconsistent deltas. - **Do not bypass fusion.** Do not write to belief state through any path other than `after()` / `add()` / `resolve()` / `retract()` / `remove()`. 
The fusion engine owns conflict resolution and trust weighting. - **Do not read or depend on internal distributions, scoring models, or fusion weights.** The SDK exposes developer-facing contracts: `text`, `confidence`, `clarity`, `channels`, `readiness`, `moves`. The underlying math (Beta/Gaussian/Dirichlet distributions, entropy tracking, Bayesian updates) is intentionally hidden and may change. - **Do not treat confidence as truth.** A belief at 0.9 confidence with zero knowledge certainty is a stated guess, not a validated fact. Check `channels.knowledgeCertainty` before acting on high-confidence claims. - **Do not share an API key across untrusted tenants.** Use `namespace` for multi-tenant isolation. ## Documentation map Full docs live at https://thinkn.ai/dev. The sections below mirror the in-app navigation. ### Start — get running fast - [start/index](https://thinkn.ai/dev/start/index) — World models, the core loop, the integration strip, the "I want to..." matrix. - [start/install](https://thinkn.ai/dev/start/install) — `npm i beliefs`, get an API key, scopes at a glance, run a verification snippet. - [start/quickstart](https://thinkn.ai/dev/start/quickstart) — The 3-step loop and a 30-line runnable example. - [start/faq](https://thinkn.ai/dev/start/faq) — RAG vs. beliefs, vector store vs. beliefs, when not to use beliefs. ### Why — the positioning and the rationale - [why/index](https://thinkn.ai/dev/why/index) — The bug, the five symptoms, memory vs. beliefs, a worked example. ### Core — the vocabulary and the model - [core/beliefs](https://thinkn.ai/dev/core/beliefs) — Belief structure, the six types, evidence hierarchy, extraction vs. manual assertion. - [core/intent](https://thinkn.ai/dev/core/intent) — Goals, gaps, and the is/ought firewall. - [core/clarity](https://thinkn.ai/dev/core/clarity) — Two-channel clarity, the four quadrants, load-bearing beliefs. - [core/moves](https://thinkn.ai/dev/core/moves) — How thinking moves are ranked by expected information gain. - [core/world](https://thinkn.ai/dev/core/world) — The fused world state: beliefs, edges, goals, gaps, contradictions. ### Tutorial - [tutorial/research-agent](https://thinkn.ai/dev/tutorial/research-agent) — A 30-minute guided build, one concept per section. - [tutorial/hack-guide](https://thinkn.ai/dev/tutorial/hack-guide) — Framework recipes (Vercel AI, Anthropic, OpenAI, fetch) and project ideas. ### Reference — the SDK surface - [sdk/core-api](https://thinkn.ai/dev/sdk/core-api) — Full reference for every method, every option, every return field. - [sdk/patterns](https://thinkn.ai/dev/sdk/patterns) — Loop patterns (single/multi-turn, streaming, tool-aware, multi-agent) and smaller integration patterns. - [sdk/reads](https://thinkn.ai/dev/sdk/reads) — Eight scope-read methods (gaps, decisions, goals, risks, insights, evidence, intents, contradictions). - [sdk/moves](https://thinkn.ai/dev/sdk/moves) — Move recommender (list, generate, act, rank), `moves.forecast`, `moves.cascade`, free-form `forecast.predict`. - [sdk/trust](https://thinkn.ai/dev/sdk/trust) — Trust overrides for agents and sources, plus tool reliability priors. - [sdk/streaming](https://thinkn.ai/dev/sdk/streaming) — Subscribe / events / streamExtraction / drift. - [sdk/scoping](https://thinkn.ai/dev/sdk/scoping) — `namespace`, `writeScope`, `thread`, `agent`, `contextLayers` — how to isolate or share belief state. - [sdk/auth](https://thinkn.ai/dev/sdk/auth) — `apiKey` (server) vs. `scopeToken` (browser, edge, untrusted runtimes). 
### Adapters — framework integrations

- [adapters/claude-agent-sdk](https://thinkn.ai/dev/adapters/claude-agent-sdk) — `beliefs/claude-agent-sdk` hooks for `@anthropic-ai/claude-agent-sdk`.
- [adapters/vercel-ai](https://thinkn.ai/dev/adapters/vercel-ai) — `beliefs/vercel-ai` middleware for `generateText` / `streamText`.
- [adapters/react](https://thinkn.ai/dev/adapters/react) — React hooks for belief state (coming soon).
- [adapters/devtools](https://thinkn.ai/dev/adapters/devtools) — Debug UI for inspecting belief state in development (coming soon).

### Use cases

- [cases/finance](https://thinkn.ai/dev/cases/finance) — Investment theses, contradictions across sources, temporal decay on risk.
- [cases/health](https://thinkn.ai/dev/cases/health) — Differential diagnosis, drug interactions, the is/ought firewall in clinical context.
- [cases/engineering](https://thinkn.ai/dev/cases/engineering) — Security posture, cross-boundary assumption detection, decay on dependency claims.
- [cases/science](https://thinkn.ai/dev/cases/science) — Hypothesis tracking, contradiction detection across experiments, swarm coherence.

### Internals — how it works under the hood

- [internals/how-it-works](https://thinkn.ai/dev/internals/how-it-works) — The lifecycle: fusion, decay, evidence (with the is/ought firewall), and the ledger.
- [internals/contracts](https://thinkn.ai/dev/internals/contracts) — Eight behavioral guarantees the engine commits to.

## SDK reference

### Install

```bash
npm i beliefs
```

### Authentication

Get an API key at https://thinkn.ai/profile/api-keys. Set it as `BELIEFS_KEY` in your environment.

### Constructor

```ts
import Beliefs from 'beliefs' // or: import { beliefs } from 'beliefs'

const beliefs = new Beliefs({
  apiKey: process.env.BELIEFS_KEY, // required — get at thinkn.ai/profile/api-keys
  agent: 'research-agent',         // optional, default 'agent' — identifies this contributor
  namespace: 'project-alpha',      // optional, default 'default' — isolates or shares state
  thread: 'conversation-42',       // optional — scope to a conversation
  debug: true,                     // optional — log requests to console
  timeout: 120000,                 // optional, default 120000ms
  maxRetries: 2,                   // optional, default 2
})
```

Scope rule: beliefs with the same `namespace` are fused across `agent` values (trust-weighted). Different `namespace` values are fully isolated. Use `thread` for per-conversation scoping inside a namespace.

### Methods

#### before(input?: string): Promise<BeliefContext>

Read current belief state before the agent acts. Inject `context.prompt` into your agent's system prompt.

```ts
const context = await beliefs.before(userMessage)
// context.prompt — string to inject as system prompt
// context.beliefs — Belief[] with confidence scores
// context.goals — string[]
// context.gaps — string[] (what the agent doesn't know)
// context.clarity — number 0-1 (readiness to act)
// context.channels — { decisionResolution, knowledgeCertainty, coherence, coverage }
// context.moves — Move[] (ranked next actions by info gain)
```

#### after(text: string, options?: AfterOptions): Promise<BeliefDelta>

Feed agent output after it acts. Extracts beliefs, detects conflicts, fuses into world state.
**Call once per turn, not per stream chunk.**

```ts
const delta = await beliefs.after(result.text)
// delta.changes — DeltaChange[] (created / updated / removed / resolved)
// delta.clarity — number 0-1
// delta.channels — ClarityChannels
// delta.readiness — 'low' | 'medium' | 'high'
// delta.moves — Move[]
// delta.state — WorldState (full state after this turn)

// For tool results, tag the source:
const toolDelta = await beliefs.after(toolResult, { tool: 'web_search' })
```

#### add(text: string, options?: AddOptions): Promise

Assert a single belief, goal, or gap.

```ts
await beliefs.add('Market is $4.2B', { confidence: 0.85 })
await beliefs.add('Missing APAC data', { type: 'gap' })
await beliefs.add('Determine TAM', { type: 'goal' })

await beliefs.add('Market is $6.8B', {
  confidence: 0.95,
  evidence: 'IDC Q4 2025 report',
  supersedes: 'Market is $4.2B',
})
```

AddOptions: `confidence?: number`, `type?: 'claim'|'assumption'|'evidence'|'risk'|'gap'|'goal'`, `evidence?: string`, `supersedes?: string`

#### add(items: AddManyItem[]): Promise

Assert multiple beliefs, goals, or gaps in one request.

```ts
await beliefs.add([
  { text: 'Market is $4.2B', confidence: 0.8 },
  { text: 'Missing APAC data', type: 'gap' },
  { text: 'Determine TAM', type: 'goal' },
])
```

#### resolve(text: string): Promise

Mark a gap as resolved. The gap is removed from `context.gaps` and its resolution is recorded in the trace.

```ts
await beliefs.resolve('Missing APAC data')
```

#### read(): Promise<WorldState>

Full world state: beliefs, goals, gaps, edges, contradictions, clarity, channels, moves, prompt. Use when you need everything in one call.

```ts
const world = await beliefs.read()
```

#### snapshot(): Promise<BeliefSnapshot>

Lightweight read — beliefs, goals, gaps, edges, contradictions. No clarity, moves, or prompt computation. Faster than `read()` when you only need raw state.

```ts
const snap = await beliefs.snapshot()
```

#### search(query: string): Promise<Belief[]>

Find beliefs by text, sorted by confidence.

```ts
const results = await beliefs.search('market size')
```

#### trace(beliefId?: string): Promise<TraceEntry[]>

Audit trail of belief transitions. Pass a `beliefId` to trace one belief; omit for the full ledger.

```ts
const history = await beliefs.trace()
const single = await beliefs.trace('belief-abc123')
```

#### retract(beliefId: string, reason?: string): Promise

Mark a belief as retracted. The belief stays in the ledger but is excluded from the active world state. Use when you learn a claim was wrong and want to preserve the trace.

```ts
await beliefs.retract('belief-abc123', 'Source was misquoted')
```

#### remove(beliefId: string): Promise

Delete a belief entirely. Stronger than `retract()` — the belief is removed from the world state. Prefer `retract()` when you need the audit trail.

```ts
await beliefs.remove('belief-abc123')
```

#### reset(): Promise<{ removed: number }>

Clear all beliefs, goals, and gaps in the current scope. Returns the number of items removed. Destructive — primarily for tests and development.
```ts const { removed } = await beliefs.reset() ``` ### Types ```ts interface Belief { id: string; text: string; confidence: number; type: string label?: string; createdAt: string; updatedAt?: string } interface BeliefContext { prompt: string; beliefs: Belief[]; goals: string[]; gaps: string[] clarity: number; channels?: ClarityChannels; moves: Move[] } interface BeliefDelta { changes: DeltaChange[]; clarity: number; channels?: ClarityChannels readiness: 'low' | 'medium' | 'high'; moves: Move[]; state: WorldState } interface WorldState { beliefs: Belief[]; goals: string[]; gaps: string[]; edges: Edge[] contradictions: string[]; clarity: number; channels?: ClarityChannels moves: Move[]; prompt: string } interface BeliefSnapshot { beliefs: Belief[]; goals: string[]; gaps: string[]; edges: Edge[] contradictions: string[] } interface ClarityChannels { decisionResolution: number; knowledgeCertainty: number coherence: number; coverage: number } interface Move { action: string; target: string; reason: string value: number; executor?: 'agent' | 'user' | 'both' } interface Edge { type: string; source: string; target: string; confidence: number } interface DeltaChange { action: 'created' | 'updated' | 'removed' | 'resolved' beliefId: string; text: string confidence?: { before?: number; after?: number }; reason?: string } interface TraceEntry { action: 'created' | 'updated' | 'removed' | 'resolved' beliefId?: string; confidence?: { before?: number; after?: number } agent?: string; timestamp: string; reason?: string } ``` ### Error handling ```ts import Beliefs, { BetaAccessError, BeliefsError } from 'beliefs' try { const delta = await beliefs.after(text) } catch (err) { if (err instanceof BetaAccessError) { // Missing or invalid API key (401/403) } if (err instanceof BeliefsError) { // err.code — e.g. 'rate_limit/exceeded', 'validation/invalid_params' // err.retryable — boolean // err.retryAfterMs — suggested wait (ms) } } ``` Rate limit: 60 requests/minute per key. ## SDK capabilities The hosted API does real work behind each call. Understanding which capability is triggered by which method helps you pick the right entry point. | Capability | What it does | Triggered by | |------------|--------------|--------------| | **Extraction** | LLM-powered belief extraction from agent output and tool results. Finds claims, assumptions, risks, evidence, and gaps. You do not parse outputs yourself. | `after(text)` | | **Linking** | Automatic detection of contradictions, support, derivation, and supersession relationships between beliefs. Populates `edges` and `contradictions`. | `after()` → surfaced via `read().edges` | | **Deduplication** | Embedding-based similarity matching prevents duplicate beliefs when agents restate the same claim in different words. Runs transparently inside `after()`. | `after()` (invisible) | | **Fusion** | Trust-weighted merging across multiple agents sharing a namespace. Conflicts stay visible in the ledger, never silently dropped. | `Beliefs({ agent, namespace })` + `after()` | | **Clarity scoring** | 0-1 readiness assessment combining decision resolution, knowledge certainty, coherence, and coverage into one number plus four sub-channels. | `before().clarity` / `delta.clarity` / `world.channels` | | **Thinking moves** | Ranked next actions by expected information gain. Tells the agent where to look next to reduce uncertainty where it matters most. 
| `before().moves` / `delta.moves` | | **Provenance and trace** | Full audit trail of every transition: who stated it, what evidence, how confidence evolved, entropy before/after. | `trace()` | | **Gap tracking** | Explicit modeling of what the agent has not investigated. Gaps are first-class, penalize clarity, and drive move ranking. | `before().gaps` / `add(text, { type: 'gap' })` / `resolve(text)` | | **Retraction without loss** | Mark beliefs as wrong while preserving the ledger. Supports compliance and debugging. | `retract(id, reason)` | ## Framework integrations ### The core loop (any framework) ```ts const context = await beliefs.before(userMessage) const result = await yourAgent(context.prompt, userMessage) const delta = await beliefs.after(result) if (delta.readiness === 'high') { /* act */ } else { /* keep investigating — follow delta.moves[0] */ } ``` ### Vercel AI SDK ```ts import { generateText } from 'ai' import { anthropic } from '@ai-sdk/anthropic' const context = await beliefs.before(question) const { text } = await generateText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, prompt: question, }) const delta = await beliefs.after(text) ``` ### Anthropic SDK ```ts import Anthropic from '@anthropic-ai/sdk' const context = await beliefs.before(question) const message = await new Anthropic().messages.create({ model: 'claude-sonnet-4-20250514', max_tokens: 4096, system: context.prompt, messages: [{ role: 'user', content: question }], }) const text = message.content.filter(b => b.type === 'text').map(b => b.text).join('') const delta = await beliefs.after(text) ``` ### OpenAI SDK ```ts import OpenAI from 'openai' const context = await beliefs.before(question) const completion = await new OpenAI().chat.completions.create({ model: 'gpt-4o', messages: [ { role: 'system', content: context.prompt }, { role: 'user', content: question }, ], }) const text = completion.choices[0]?.message?.content ?? '' const delta = await beliefs.after(text) ``` Note: for o-series models (o3, o4-mini), use `role: 'developer'` instead of `role: 'system'`. ### Clarity-driven routing ```ts const context = await beliefs.before(input) if (context.clarity < 0.3) await runResearch(context.gaps) else if (context.clarity > 0.7) await draftRecommendations(context.beliefs) else await investigateGaps(context.gaps) ``` ### Multi-agent shared state ```ts const researcher = new Beliefs({ apiKey, agent: 'researcher', namespace: 'project' }) const analyst = new Beliefs({ apiKey, agent: 'analyst', namespace: 'project' }) await researcher.after(researchOutput) const context = await analyst.before('Interpret the findings') // analyst sees researcher's beliefs, fused by trust weight ``` ### Gap-driven research ```ts const context = await beliefs.before(input) for (const gap of context.gaps) { const result = await searchTool.run(gap) await beliefs.after(result, { tool: 'search' }) } ``` ## Built-in adapters Adapters are subpath imports from the same `beliefs` package. No extra install. 
### Vercel AI SDK middleware

```ts
import { generateText, wrapLanguageModel } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import Beliefs from 'beliefs'
import { beliefsMiddleware } from 'beliefs/vercel-ai'

const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY })

const { text } = await generateText({
  model: wrapLanguageModel({
    model: anthropic('claude-sonnet-4-20250514'),
    middleware: beliefsMiddleware(beliefs),
  }),
  prompt: 'Research the AI tools market',
})
```

Options: `capture?: 'response' | 'tools' | 'all'` (default: `'response'`), `includeContext?: boolean` (default: `true`)

### Claude Agent SDK hooks

```ts
import { query } from '@anthropic-ai/claude-agent-sdk'
import Beliefs from 'beliefs'
import { beliefsHooks } from 'beliefs/claude-agent-sdk'

const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY })

const result = await query({
  prompt: 'Research AI tools market',
  options: { hooks: beliefsHooks(beliefs) },
})
```

Options: `capture?: 'tools' | 'all'` (default: `'tools'`), `includeContext?: boolean` (default: `true`), `toolFilter?: string` (regex)

## Public HTTP API

If you cannot use the npm package, hit the HTTP API directly. All endpoints require `Authorization: Bearer <API key>`. Base URL: `https://thinkn.ai/api/sdk/v1/beliefs`.

| Endpoint | Purpose | SDK equivalent |
|----------|---------|----------------|
| `POST /context` | Get belief context before the agent acts | `before()` |
| `POST /ingest` | Submit observation for extraction and fusion | `after(text)` |
| `POST /ingest-tool-result` | Submit tool result with source tagging | `after(text, { tool })` |
| `POST /apply` | Apply a structured delta to world state | `add()` / `resolve()` |
| `POST /reset` | Clear all beliefs in scope | `reset()` |
| `GET /search` | Search beliefs by text | `search(query)` |
| `GET /snapshot` | Get lightweight state snapshot | `snapshot()` |
| `GET /ledger` | Query the audit trail | `trace()` |

Preferred HTTP scope fields are `namespace` and `thread`. When an endpoint accepts contributor identity, the field name is `agentId` (not `agent`).

- `POST /context`: JSON body uses `namespace`, optional `thread`, and optional `input`.
- `POST /ingest`: JSON body uses `namespace`, optional `thread`, optional `agentId`, plus the ingest payload.
- `POST /ingest-tool-result`: JSON body uses `namespace`, optional `thread`, optional `agentId`, `toolName`, `toolResult`, and optional `source`.
- `POST /apply`: JSON body uses `namespace`, optional `thread`, optional `agentId`, `delta`, and optional `source`.
- `POST /reset`: JSON body uses `namespace` and optional `thread`.
- `GET /search`: query params `namespace`, optional `thread`, `query`, and optional `limit`.
- `GET /snapshot`: query params `namespace` and optional `thread`.
- `GET /ledger`: query params `namespace`, optional `beliefId`, optional `agentId`, optional `since`, optional `until`, and optional `limit`.

Legacy aliases remain accepted for compatibility: `workspaceId`, `threadId`, and nested `scope.thread` / `scope.threadId` on POST bodies. Prefer `namespace` and `thread` for new integrations.

---

More patterns: https://thinkn.ai/dev/sdk/patterns
Full API reference: https://thinkn.ai/dev/sdk/core-api
Integrations: https://thinkn.ai/dev/adapters/claude-agent-sdk, https://thinkn.ai/dev/adapters/vercel-ai

---

# Start

## Start Here

Source: https://thinkn.ai/dev/start/index

Summary: World models, beliefs, and the fastest path into the SDK.
Companies and agents alike are organizing around **world models** instead of hierarchies and pipelines. A world model isn't a pile of context, and it isn't a smarter archive. **A world model is a living set of beliefs about reality** — what the system thinks is true, why, how confident, what contradicts it, and what would change its mind. Beliefs are the operational core of a world model: claims with confidence, evidence, and lifecycle. They are how the model stays accurate as reality changes. ```bash npm i beliefs ``` ```ts import Beliefs from 'beliefs' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'my-project', writeScope: 'space', }) // Before the agent acts — read current understanding const context = await beliefs.before(userMessage) // Run your agent with belief context injected const result = await myAgent.run({ system: context.prompt }) // Feed the output — beliefs extracted, conflicts detected, state updated const delta = await beliefs.after(result.text) ``` That's the loop. Three calls per turn, regardless of which framework you ship on. ## Works with your stack ```ts import { beliefsHooks } from 'beliefs/claude-agent-sdk' // Anthropic Claude Agent SDK import { beliefsMiddleware } from 'beliefs/vercel-ai' // Vercel AI SDK // React hooks + browser DevTools — coming soon ``` Or call `beliefs.before()` / `beliefs.after()` manually around any LLM (OpenAI, plain fetch, your own agent loop). See the [Hack Guide](/dev/tutorial/hack-guide) for working recipes across frameworks. ## I want to... | I want to... | Start here | |---|---| | **Ship in 10 minutes.** Hackathon, prototype, exploration. | [Hack Guide](/dev/tutorial/hack-guide) — install + framework recipes + project ideas | | **See it run end-to-end** before committing. | [Quickstart](/dev/start/quickstart) — 30 lines that print clarity rising | | **Learn the model first**, then build. | [Why beliefs](/dev/why/index) → [Concepts](/dev/core/beliefs) → [Tutorial](/dev/tutorial/research-agent) | | **Build chat memory** that's separate per conversation. | [Install](/dev/start/install) → use `writeScope: 'thread'` and bind `thread: 'id'` | | **Run multi-agent shared state** (debate, supervisor/worker, swarm). | [Patterns → Multi-Agent](/dev/sdk/patterns) — same namespace, `writeScope: 'space'` | | **Audit why an agent believes something.** | [How it works → Ledger](/dev/internals/how-it-works) and [`beliefs.trace()`](/dev/sdk/core-api) | | **Evaluate fit** before integrating. | [FAQ](/dev/start/faq) — when beliefs help, when they don't | | **Add beliefs to a Claude Agent SDK app.** | [Adapter: Claude Agent SDK](/dev/adapters/claude-agent-sdk) | | **Add beliefs to a Vercel AI SDK app.** | [Adapter: Vercel AI](/dev/adapters/vercel-ai) | | **See it across domains** (finance, health, science, engineering). | [Use cases](/dev/cases/finance) | ## Why coding agents first A codebase is already a compact world. It has laws (types, invariants), assumptions (architecture decisions, dependencies), history (commits, PRs), ownership, and contradictions (stale docs, drifted assumptions). Coding agents are already operating inside this world — but with short-lived context and weak memory. The first world model thinkⁿ targets is the one your coding agent already lives in. 
Concrete beliefs the engine can hold for a repo: ``` belief: Authentication is enforced at the API middleware layer confidence: 0.82 evidence: middleware.ts, auth.test.ts, architecture.md contradicts: /api/internal/export bypasses middleware next move: inspect route-level auth coverage before modifying export flow ``` The same machinery applies to research agents (claims about a market), analyst agents (beliefs about a customer or portfolio), or any system that needs to maintain a coherent picture of reality across many turns and many sources. Give your agent the SDK reference: [llms.txt](https://thinkn.ai/llms.txt). It writes correct code on the first try. ## Install Source: https://thinkn.ai/dev/start/install Summary: Configure your stack and grab the install command + boilerplate. Pick package manager, framework, and memory scope. ## Configure your install Pick your package manager, framework, and memory scope. The install command and starter snippet update as you choose. The `^0.7.0` range pulls in patches and new features without breaking changes — pin tighter if you want. ## Get your API key 1. Log in at [thinkn.ai](https://thinkn.ai) 2. Go to **Profile > API Keys** (`/profile/api-keys`) 3. Click **Create Key**, give it a name, and copy the `bel_live_...` value 4. Add it to your environment: ```bash # .env (do not commit this file) BELIEFS_KEY=bel_live_... ``` Copy the key immediately — it is only shown once. If lost, revoke it and create a new one. ## Memory scope, in detail `writeScope` decides where new beliefs are stored — and therefore who else can read them. | Scope | What it does | Use when | |---|---|---| | `'space'` | One shared belief state per `namespace`. Every agent and conversation reads and writes the same world. | Prototypes, multi-agent collaboration, anything where shared context is the point. | | `'thread'` | Separate belief state per conversation. Bind with `thread: 'id'` or `beliefs.withThread(id)`. | Chat apps, per-user memory, sessions that shouldn't leak into each other. | | `'agent'` | Durable scratchpad per agent, isolated from other agents in the same namespace. | Background workers and tool-running agents that keep private notes. | The SDK defaults to `writeScope: 'thread'` (and requires a bound thread ID). For copy-paste verification, the configurator above starts on `'space'` — that's the simplest runnable setup. See [Scoping](/dev/sdk/scoping) for the full breakdown including `contextLayers`. ## Verify it's working After setting `BELIEFS_KEY` in your environment, run the snippet from the configurator (or any of the [framework recipes](/dev/tutorial/hack-guide)). A successful first call returns a `BeliefDelta` with `clarity` set, no auth errors. If you see `BetaAccessError`, your key isn't on the allowlist — request access via the [waitlist](https://thinkn.ai/waitlist). Give your agent the SDK reference so it can write correct code on the first try: [`thinkn.ai/llms.txt`](https://thinkn.ai/llms.txt) ## Requirements - Node.js 18+ - TypeScript 5+ (recommended) The SDK is in private beta. Request access at [thinkn.ai/waitlist](https://thinkn.ai/waitlist). ## Quickstart Source: https://thinkn.ai/dev/start/quickstart Summary: The 3-step loop, plus a runnable example that prints clarity rising in under a minute. The hosted API is in private beta. [Request access](https://thinkn.ai/waitlist) to get an API key. 
## The Loop Every agent using beliefs follows the same cycle: ```ts import Beliefs from 'beliefs' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'quickstart', writeScope: 'space', }) ``` ```ts const context = await beliefs.before(userMessage) ``` `context.prompt` is a serialized belief brief — drop it straight into your agent's system prompt. `context.clarity` is a 0–1 score: how much does the agent know vs. how much is still uncertain? `context.moves` are ranked next actions. ```ts const result = await myAgent.run({ context: context.prompt }) const delta = await beliefs.after(result.text) ``` The infrastructure extracts beliefs from the output, detects conflicts, updates confidence, and records provenance — automatically. `delta.clarity` tells you whether to keep investigating or act. Three steps, repeated every turn. ## Run It A 30-line script that exercises the whole loop without a real LLM. Save as `quickstart.ts` and run with `BELIEFS_KEY=bel_live_xxx npx tsx quickstart.ts`. ```ts import Beliefs from 'beliefs' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY!, namespace: `quickstart-${Date.now()}`, writeScope: 'space', }) // 1. Seed a goal and a few priors. await beliefs.add('Determine the size of the AI dev tools market', { type: 'goal' }) await beliefs.add('Market is around $4B', { confidence: 0.6, type: 'assumption' }) // 2. Feed agent output — beliefs are extracted automatically. const research = `Per Gartner 2024, the AI developer tools market is $4.2B, growing 25% YoY. GitHub Copilot, Cursor, and Tabnine account for about 65% of the market.` const first = await beliefs.after(research) console.log(`turn 1: clarity ${first.clarity.toFixed(2)}, ${first.changes.length} changes`) // 3. Feed contradicting evidence — conflict is detected. const idc = `IDC Q4 2025 reports the market at $6.8B, not $4.2B. The earlier figure excluded embedded AI in mainstream IDEs.` const second = await beliefs.after(idc, { tool: 'market_research' }) console.log(`turn 2: clarity ${second.clarity.toFixed(2)}, contradictions ${second.state.contradictions.length}`) // 4. Read the fused world state. const world = await beliefs.read() console.log(`final: ${world.beliefs.length} beliefs, clarity ${world.clarity.toFixed(2)}`) ``` You'll see clarity move turn-over-turn, a contradiction surface when sources disagree, and the fused world state at the end. Exact numbers vary across runs — extraction is LLM-driven — but the pattern (clarity rising, conflicts detected) is stable. ## Where to next ## FAQ Source: https://thinkn.ai/dev/start/faq Summary: Common questions about beliefs, answered. ## How is this different from RAG? RAG retrieves text by similarity and pastes it into a prompt. Beliefs maintains a probabilistic model of what your agent thinks is true. RAG can tell you "here is a relevant paragraph." Beliefs can tell you "this claim is 85% confident, based on 3 sources, and contradicted by one." They solve different problems. You can use both. ## When should I NOT use beliefs? Skip beliefs if you don't have a coherence problem yet. - **Single-turn agents** — Q&A bots, content generation, one-shot tasks. Memory or nothing is enough. - **Pure ephemeral session memory** with no cross-source claims, no evolving understanding, no contradictions to track — overkill. - **Tasks where the agent never reasons about its own confidence or gaps** — beliefs add overhead without payoff. 
Beliefs earns its keep when an agent runs more than a few turns on the same topic, accumulates information from multiple sources, or needs to know what it doesn't know. If those don't apply, use memory or skip both. ## How is this different from a vector store? A vector store recalls snippets. Beliefs models uncertainty, tracks provenance, detects contradictions, and decays stale evidence. A vector store answers "what text is related to this query." Beliefs answers "what does the agent currently believe, how strongly, and why." ## How is this different from a "world model"? A world model is the *whole structure* the system uses to represent reality — beliefs, goals, gaps, contradictions, clarity, recommended next moves. Beliefs are the operational core inside it: the claims with confidence, evidence, and lifecycle. Without beliefs, a "world model" is an intelligent archive — it can retrieve context, but it can't maintain coherence as reality changes. Beliefs are what make the model live. ## How is this different from an agentic framework (LangChain, CrewAI, AutoGen)? Agentic frameworks orchestrate *what your agent does* — graphs of LLM calls, tool routing, multi-agent coordination, control flow. Beliefs handles *what your agent thinks* — the structured model of claims, confidence, evidence, contradictions, and gaps that persists across turns. They sit at different layers. You use a framework to run your agent; you use beliefs to give it a coherent view of the world. Beliefs works with LangGraph, CrewAI, AutoGen, the Claude Agent SDK, the Vercel AI SDK, or any custom loop. ## How is this different from LLM observability (LangSmith, Langfuse, Helicone)? Observability tools record what your agent did. They give you traces, latency breakdowns, token counts, and replayable logs of past runs. Beliefs records what your agent *currently believes* — it is decision-facing state, not a backward-looking log. Observability answers "what happened?" Beliefs answers "what is true right now, and what should the agent do next?" They are complementary; large agent deployments tend to use both. ## Does this replace my agent framework? No. Beliefs wraps your loop. It does not replace it. Use it with Claude Agent SDK, Vercel AI SDK, LangGraph, or any custom agent loop. If you have a loop, beliefs works with it. ## What if I only have a single-turn agent? Beliefs is designed for multi-turn, multi-source scenarios. For single-turn Q&A, memory or retrieval is typically sufficient. Beliefs matters when agents run multiple turns on the same topic, accumulate information that might conflict, or need to coordinate across agents. ## Do I need to understand probability to use it? No. Confidence is a number between 0 and 1; that's all you need to read it. The SDK handles the math. If you want a deeper read on the probabilistic foundations, see [Internals > Math](/dev/internals/how-it-works). ## What happens when two claims conflict? The fusion engine detects conflicts, resolves them by trust weight, and keeps both in the trace. Nothing is silently discarded. You can query contradictions from `beliefs.read()` and decide how to handle them in your agent logic. ## Can I use beliefs without an adapter? Yes. The core `before`/`after` loop works with any agent framework: ```ts const context = await beliefs.before(input) const result = await yourAgent.run({ prompt: context.prompt }) await beliefs.after(result) ``` Adapters handle framework-specific plumbing for you, but the core SDK is framework-agnostic. 
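The delta returned by `after()` is what ties this loop to the conflict handling described a few answers up. A sketch that extends the snippet above — `input` and `yourAgent` are the same placeholders, and the logging and branching are illustrative; `readiness`, `moves`, and `state.contradictions` are the documented fields:

```ts
const context = await beliefs.before(input)
const result = await yourAgent.run({ prompt: context.prompt })
const delta = await beliefs.after(result)

// Sources disagree? Surface the conflict instead of silently picking a side.
if (delta.state.contradictions.length > 0) {
  console.warn('unresolved contradictions:', delta.state.contradictions)
}

// Not enough clarity to act — follow the top-ranked move instead.
if (delta.readiness !== 'high') {
  console.log('next:', delta.moves[0]?.action, '→', delta.moves[0]?.target)
}
```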
## Does performance degrade with many claims? The SDK handles pruning and decay. Stale claims lose weight over time through [temporal decay](/dev/internals/how-it-works). The system self-manages so old, low-confidence claims do not accumulate indefinitely. ## Do beliefs persist across sessions? Yes. The hosted backend persists beliefs automatically. An API key is required — there is no local-only mode in the current release. Local persistence adapters are planned for a future version. ## Where do I get help? - **Beta support channel.** Direct access to the engineering team. - **GitHub.** [Open an issue](https://github.com/thinkn-ai/beliefs/issues) on the beliefs repo. - **Beta access.** [Request access](/dev/beta) if you are not yet in the program. --- # Why ## Why beliefs Source: https://thinkn.ai/dev/why/index Summary: Why agents drift, why memory and RAG don't fix it, and what beliefs change. ## A bug you've already debugged Your agent looks up a market size on turn 3 and says "$4.2B." On turn 12 a tool returns SEC filings showing "$3.8B." On turn 18 the agent cites "$4.2B" again — because that's what it said first, and the context window doesn't distinguish "stated earlier" from "verified." You've seen this. It looks like flakiness, or hallucination, or model error. It's none of those. **It's the absence of a primitive your agent doesn't have: a structured model of what it currently believes, and how that belief changed when new evidence arrived.** ``` Turn 3 ─▶ "Market is $4.2B" ⟵ stated. no source. Turn 12 ─▶ "SEC filings suggest $3.8B" ⟵ different number. no comparison. Turn 18 ─▶ "Market is $4.2B" ⟵ first one wins. drift wins. ``` There's no tracked confidence. No evidence weight. No detection that the numbers disagree. No awareness that one came from a tool and the other from a guess. A peer-reviewed study and a guess three turns ago look identical to the model. With beliefs, the same trace becomes: ```ts // Turn 3 await beliefs.add('Market is $4.2B', { confidence: 0.5 }) // stated, no evidence // Turn 12 await beliefs.after(secFilings) // engine extracts "$3.8B", detects contradicts edge to "$4.2B" // world.contradictions surfaces both with sources // Turn 18 const context = await beliefs.before(userMessage) // context.prompt surfaces both numbers, their sources, the conflict // the agent now knows the question is unsettled ``` The agent stops self-contradicting because the infrastructure remembers what it has and hasn't investigated. ## The five symptoms These are what drift looks like in practice — what you've seen if you've shipped agents that run more than a few turns: 1. **Agents contradict themselves.** Turn 3 cites $4.2B, turn 18 cites $4.2B again because it appeared first. No detection, no resolution. 2. **Confidence is invisible.** A claim from one source and a claim from ten look identical in the context window. A three-month-old estimate sits next to yesterday's verified data with no distinction. 3. **Guesses and facts are indistinguishable.** A user's intuition and a peer-reviewed study carry equal weight in the prompt. There's no separation between assumption and evidence. 4. **Agents don't know what they don't know.** No concept of "gap." No mechanism to prioritize what would reduce the most uncertainty. 5. **Bigger context makes it worse.** A 200K window doesn't fix any of this — it carries more conflicting claims with more fluency, and the model interpolates fluently across all of it. Wider window, murkier understanding. 
## Memory and RAG don't fix it Memory and retrieval each solve a piece of the problem (recall what was said, find similar text), but neither models what's *currently* believed. Here's what the gap looks like in practice: | Dimension | Memory / RAG | Beliefs | |-----------|-------------|---------| | **What it stores** | Text chunks and vectors (similarity-based retrieval) | Structured claims with confidence and evidence | | **Uncertainty** | None — every retrieved chunk looks equally valid | Two channels: decision resolution + knowledge certainty | | **Conflicts** | Returns both conflicting chunks, or last-write-wins | Detects, tracks, and resolves by source reliability | | **Decay** | Falls out of context window randomly | Principled decay toward an uninformative prior over time | | **Provenance** | "This chunk was retrieved" | Full trail: who stated it, what evidence, how confidence evolved | | **What is missing** | No concept | Gaps are first-class — they drive the next action | A three-month-old market estimate and a verified data point from yesterday look identical in memory. Beliefs distinguishes them by confidence, evidence, and recency. Memory says "this was mentioned before." Belief state says "this is probably true, but confidence dropped after the latest filing — here's the contradiction, here's the next move." That's the difference between *recall* and *judgment*. ## What changes when beliefs are explicit A worked example. Three agents audit a legacy auth module — a code analyst reads the files, an architecture agent maps service dependencies, a runtime profiler watches actual traffic. They share a `namespace` with `writeScope: 'space'`, so every observation lands in one fused belief state. ```ts // Code analyst await analyst.after( 'Auth module has 3 token validation paths. Path A uses JWT. ' + 'Path B uses custom HMAC. Path C checks a session cookie ' + 'but never validates expiry. 14 services import this module.' ) // Architecture agent await architect.after( 'Only 6 of 14 services use JWT. 5 use HMAC. ' + '3 services use Path C — all customer-facing payment APIs.' ) // Runtime profiler await profiler.after( 'Path C handles 73% of all auth requests. It was a "temporary bypass" ' + 'added during a migration 2 years ago. The migration completed ' + 'but the bypass was never removed. 4.2M active sessions use this path.' ) ``` The fused world state reveals what no single agent could have seen alone: ```ts const world = await analyst.read() world.beliefs // [ // { text: 'Path C handles 73% of auth traffic', confidence: 0.92 }, // { text: 'JWT is the primary auth mechanism', confidence: 0.15 }, // { text: 'Path C has no session expiry validation', confidence: 0.85 }, // ] world.edges // [{ type: 'contradicts', source: 'JWT is primary', target: 'Path C handles 73%' }] world.moves // [{ action: 'research', // target: 'Audit Path C session security', // reason: 'Payment-facing path with no expiry on 4.2M sessions', // value: 0.96 }] ``` The team's stated belief was "JWT is our auth." The agents' fused observation was "73% of traffic runs on a forgotten bypass." That contradiction is invisible in any single agent's report. It's the belief state that makes it visible. ## What this unlocks When beliefs are explicit, the world model stops being read-only: - **Hidden assumptions become examinable.** Beliefs that were silently driving decisions get named, scored, and traceable. 
- **Uncertainty becomes directional.** The system knows which gap, if filled, would reduce the most uncertainty — and surfaces it as a recommended next move. - **Contradictions become signal, not noise.** A swarm of agents producing partially-overlapping views isn't a coordination failure; it's the input to a fusion engine that resolves through evidence. - **The frontier becomes visible.** What was never investigated is just as trackable as what was. This is what beliefs change. The rest of these docs show you how. ## Where to next --- # Concepts ## Beliefs Source: https://thinkn.ai/dev/core/beliefs Summary: The unit of account inside a world model — text, type, confidence, evidence, lifecycle. A belief captures what the agent currently holds true — the claim itself, how confident the agent is, what evidence backs it, and how it has evolved over time. Without that structure, a peer-reviewed citation, an agent's inference, and a guess three turns ago all look identical in the prompt. ## What a Belief Is A belief is a structured assertion your agent holds about the world. It has a type, a confidence score, an optional evidence weight, and an optional label for richer categorization. ```ts { id: 'belief_auth_middleware', text: 'Authentication is enforced at the API middleware layer', type: 'claim', confidence: 0.82, evidenceWeight: 4, label: 'load-bearing', createdAt: '2026-04-15T10:30:00Z', // Engine-tracked alongside this belief: // evidence: middleware.ts, auth.test.ts, architecture.md // contradicts: /api/internal/export bypasses middleware // next move: inspect route-level auth coverage before modifying export flow } ``` - **text.** The natural language assertion. - **type.** What kind of belief: `claim`, `assumption`, `risk`, `evidence`, `gap`, `goal`. - **confidence.** A 0–1 score reflecting the current evidence balance. - **evidenceWeight.** How much evidence backs this belief. `0` means stated but uninvestigated; higher means corroborated. - **label.** A semantic label for richer categorization: `risky-assumption`, `load-bearing`, `limiting-belief`, `pain-point`, `opportunity`, etc. The example above is from a coding agent's world model — a repo. The same shape applies to any domain: a research agent's belief about market size, an analyst's belief about a customer's churn risk, a finance agent's belief about a portfolio position. The structure is the same; only the content changes. ## Belief Types | Type | One-line gloss | Use case | |------|----------------|----------| | `claim` | An evidenced assertion | Supported or refuted by collected evidence | | `assumption` | An untested supposition | Stated as true without direct evidence yet | | `risk` | A potential negative outcome | Something the agent should hedge against | | `evidence` | A data point or source | Used to support/refute other beliefs | | `gap` | A known unknown | Something the agent has flagged as unresolved | | `goal` | A pursued outcome | What the agent is trying to accomplish | The system assigns types automatically during extraction. You can also specify a type when adding beliefs manually via `beliefs.add(text, { type: 'assumption' })`. ## How Confidence Works Confidence reflects the balance of evidence behind a belief. When new evidence arrives, confidence shifts. How much depends on the evidence quality. A Gartner report citing $4.2B market size carries more weight than an agent's inference from incomplete data. Both update beliefs, but by different amounts. 
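How those different qualities of input reach the engine in practice — a minimal sketch using the documented entry points. `searchResult` and `agentReasoning` are placeholder variables, and the cited report is illustrative:

```ts
// A cited external source, attached when asserting the claim manually.
await beliefs.add('Market size is $4.2B', {
  confidence: 0.7,
  evidence: 'Gartner 2024 market report',
})

// A tool result — tagging the source records provenance so the result
// can be weighted by source reliability rather than treated as agent output.
await beliefs.after(searchResult, { tool: 'web_search' })

// Plain agent output — claims extracted here carry agent-level provenance,
// not an external source, until corroborating evidence arrives.
await beliefs.after(agentReasoning)
```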
### Evidence Types Different evidence types carry different weight: | Type | Description | |------|-------------| | `measurement` | Audited metric, verified data point | | `citation` | Research report, external source with provenance | | `user-assertion` | User explicitly stated this | | `expert-judgment` | Expert opinion with rationale | | `inference` | Agent-derived inference from available data | | `assumption` | Explicit assumption, no supporting evidence | A single verified measurement shifts confidence more than several inferences. The SDK calibrates the weight of each type so that evidence quality matters, not just volume. ### Direction Every piece of evidence has a direction: - **supports.** Increases confidence in the claim. - **refutes.** Decreases confidence in the claim. - **neutral.** Adds information weight without shifting direction. When the research agent finds a Gartner report supporting "Market size is $4.2B," confidence increases. When it finds an SEC filing showing a smaller number, that refuting evidence decreases confidence. Both are captured. Nothing is discarded. ## Extraction The SDK extracts beliefs automatically when you pass output to `after`. You do not need to parse agent outputs yourself. ```ts // Beliefs are extracted from the output automatically const delta = await beliefs.after(result.text) ``` With an adapter, the lifecycle is wired up for you: ```ts const agent = createAgent({ hooks: beliefsHooks(beliefs, { capture: 'all' }), }) ``` ## Manual Assertion When you have domain-specific knowledge, you can add beliefs explicitly: ```ts await beliefs.add('Market size is $4.2B', { confidence: 0.85, type: 'assumption', }) ``` Manual assertions and automatic extraction feed the same update pipeline. ## Intent Source: https://thinkn.ai/dev/core/intent Summary: Goals and gaps. What your agent is trying to do and what it does not know. ## What Intent Is Intent is what your agent is *trying to accomplish* — as opposed to what it *understands to be true*. It covers goals, gaps, decisions, and constraints: the preferences and rules that shape what the agent pursues. "Map the competitive landscape" is intent — it expresses a direction, not a fact. "The market is $4.2B" is a belief — it can be supported or refuted by evidence. The SDK keeps these separate because they serve different purposes. ## Goals Goals drive action selection. They tell the system what questions to answer and what gaps to fill. ```ts await beliefs.add('Map the competitive landscape', { type: 'goal' }) await beliefs.add('Identify top 3 market opportunities', { type: 'goal' }) ``` An unmet goal reduces clarity, signaling to the agent that more investigation is needed before acting. Goals accumulate context but do not participate in the confidence system — an agent pursuing a hypothesis shouldn't grow more confident in it just because it's pursuing it. ## Gaps Gaps represent missing information: what the agent has not investigated or cannot answer yet. ```ts await beliefs.add('No data on enterprise pricing models', { type: 'gap' }) await beliefs.add('Missing APAC market analysis', { type: 'gap' }) ``` Gaps are first-class in the belief system because they drive the next research action. An agent that knows what it does not know can prioritize its work. Gaps penalize the [clarity](/dev/core/clarity) score. High-impact gaps, those with many downstream dependencies, penalize it more. The system naturally prioritizes filling the most important gaps first. 
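A small sketch of that pressure in practice, assuming a `Beliefs` instance constructed as earlier — the gap text is illustrative and exact clarity values vary from run to run:

```ts
await beliefs.add('Missing APAC market analysis', { type: 'gap' })

const context = await beliefs.before('Where should research focus next?')
console.log(context.gaps)     // includes 'Missing APAC market analysis'
console.log(context.clarity)  // lower while the gap stays open
console.log(context.moves[0]) // typically a move aimed at the highest-impact open gap
```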
### Resolving Gaps When the agent has addressed a gap, mark it resolved: ```ts await beliefs.resolve('Missing APAC market analysis') ``` Resolved gaps stop penalizing clarity and update the world state. ## The Is/Ought Firewall Factual evidence updates beliefs. Normative information (preferences, goals, desires) does not. | Input | Type | Effect | |-------|------|--------| | "The TAM is $5B" | Factual | Updates the market size belief | | "I want to target enterprise" | Normative | Recorded as a goal | | "We must support SOC2" | Normative | Recorded as a constraint | | "Gartner reports 34% growth" | Factual | Updates the growth rate belief | This prevents a common failure mode: a user's strong preference inflating factual confidence. Without the firewall, the more a user says "I want X," the more confident the system becomes that X is the right answer, regardless of what the evidence shows. Preferences do not update factual beliefs. Without this separation, a user repeating "I want X" would gradually inflate the agent's confidence that X is *true* — preferences masquerading as evidence. The firewall keeps factual claims and normative intent on separate tracks, so user conviction can't distort the belief state. ## Reading Intent Goals and gaps are returned from `beliefs.read()`: ```ts const world = await beliefs.read() console.log(world.goals) // ['Map the competitive landscape'] console.log(world.gaps) // ['Missing APAC market analysis'] ``` They are also included in `beliefs.before()` context, so the agent sees its current goals and open gaps at the start of each turn. ## Clarity Source: https://thinkn.ai/dev/core/clarity Summary: A single score that tells you how ready your agent is to act. ## What Clarity Answers Clarity is a 0-1 score that answers one question: does this agent understand enough to move forward? It combines multiple signals into one number. Low clarity means keep investigating. High clarity means the agent has enough to act. ## The Two-Channel Insight This is the most important concept in the SDK. Consider two claims, both at 50% confidence. One has no evidence. The agent has never investigated. It is a guess. The other has 40 data points that genuinely split both ways. The first needs research. The second needs a decision framework. Beliefs tracks two separate channels to distinguish them: ``` ┌──────────────────────────────────────────────────────────────┐ │ THE TWO QUESTIONS │ │ │ │ 1. DECISION RESOLUTION: "Can we make a call?" │ │ ───────────────────────────────────────── │ │ 80% → Yes, lean toward it │ │ 50% → No, it is ambiguous │ │ 99% → Strong signal │ │ │ │ 2. KNOWLEDGE CERTAINTY: "Have we done the work?" │ │ ───────────────────────────────────────── │ │ Just stated → No evidence yet │ │ 10 data points → Some certainty │ │ 100 data points → High certainty in our assessment │ │ │ └──────────────────────────────────────────────────────────────┘ ``` ### The Four Quadrants ``` Knowledge Certainty Low High ┌────────────┬────────────────┐ High │ │ │ Decision │ Belief │ Validated │ Resolution │ without │ belief. │ │ evidence. │ Ready to act. │ │ ▶ Invest- │ │ │ igate. │ │ ├────────────┼────────────────┤ Low │ │ │ Decision │ No idea. │ Genuinely │ Resolution │ Start │ uncertain. │ │ from │ Surface │ │ scratch. │ trade-offs. │ │ │ ▶ Decide, │ │ │ don't │ │ │ research. │ └────────────┴────────────────┘ ``` The bottom-right quadrant is the critical one. "We have done extensive research and this is genuinely a close call" is a valuable conclusion. 
The system should help the user decide. Knowing you do not know is categorically different from not knowing. The two-channel model captures this distinction. ## What Clarity Measures Clarity combines four signals, weighted by their relative importance: **Decision resolution.** Are key claims far enough from ambiguous to act on? **Knowledge certainty.** Has enough evidence accumulated to trust the current picture? **Coherence.** Do the beliefs hang together, or are there unresolved contradictions? **Coverage.** Are important areas addressed, or are there large gaps? Open gaps reduce clarity. Gaps with more downstream dependencies reduce it more. This creates natural pressure to fill the gaps that matter most. ## Load-Bearing Beliefs Some beliefs carry more weight than others. A load-bearing belief is one that, if proven wrong, would collapse the strategy built on top of it. "The TAM is $4.2B" is load-bearing if the pricing model, fundraising projections, and go-to-market plan all depend on it. The engine identifies load-bearing beliefs by tracing the dependency graph. A belief is load-bearing when many other beliefs derive from, support, or are conditioned on it — so removing it would invalidate everything downstream. These are flagged automatically when their evidence is weak, decayed, or contradicted, so the agent doesn't keep building on a foundation that's eroding. ## Directing Attention With explicit beliefs, gaps, and uncertainty, the system can identify which actions would most reduce uncertainty in the beliefs that matter most. It considers what gaps exist, which beliefs are weakly supported, and where contradictions remain unresolved. A research action that fills a high-impact gap is prioritized over one that confirms something already well-supported. An action that tests a fragile, load-bearing assumption is prioritized over one that validates a peripheral detail. See [Moves](/dev/core/moves) for how the system surfaces these recommendations. ## Using Clarity in Your Agent Read clarity from `beliefs.read()`: ```ts const world = await beliefs.read() ``` Route on it: ```ts if (world.clarity < 0.3) { // Not enough to work with. Research the biggest gaps. await runResearch(world.gaps) } else if (world.clarity > 0.7) { // Ready to act. Draft recommendations. await draftRecommendations(world.beliefs) } else { // Middle ground. Investigate remaining gaps. await investigateGaps(world.gaps) } ``` ## Clarity and Uncertainty A genuinely uncertain topic can have high clarity. If the research agent investigates extensively and finds that the market could go either way, clarity can be high. The agent has done the work to understand that the question is uncertain. Clarity measures readiness to act. Low clarity means "keep investigating." High clarity means "you have enough to make a decision, even if the decision is hard." ## How Knowledge Certainty Accumulates When you seed beliefs with `add()`, Knowledge Certainty starts at zero. This is by design. `add('Market is $4.2B', { confidence: 0.8 })` sets the belief's starting position (Decision Resolution reflects 0.8), but the system has not yet seen evidence for it. Knowledge Certainty tracks *earned evidence* — the difference between where the belief started and how much data has accumulated since. 
KC grows when: - The same claim receives additional evidence via `after()` (LLM extraction finds supporting or refuting data) - Multiple observations reinforce the same claim over time - Tool results provide independent confirmation This distinction matters. A belief stated with high confidence but no evidence is in the "belief without evidence" quadrant. The system correctly flags it as needing validation rather than treating stated confidence as proof. To build KC quickly, use `after()` to process real agent output rather than seeding everything with `add()`. The extraction pipeline finds evidence in the agent's work and accumulates it against existing beliefs. ## Moves Source: https://thinkn.ai/dev/core/moves Summary: Ranked next actions by expected information gain. Without recommended next actions, the world model is informative but not action-guiding. Moves turn understanding into direction: the engine's ranked answer to "given what the agent currently believes, what should it investigate next?" Each move targets a specific belief, gap, or contradiction and reports the expected information gain from acting on it. ## What a Move Is A move is a recommended action the system surfaces based on the current belief state. Each move has an action type, a target, a reason, and an expected value representing how much it would improve the agent's understanding. ```ts { action: 'gather_evidence', target: 'Missing APAC market analysis', reason: 'High-impact gap with 3 downstream dependencies', value: 0.72, executor: 'agent', } ``` - **action.** The type of move: `clarify`, `resolve_uncertainty`, `gather_evidence`, `compare_paths`. - **target.** The belief, gap, or contradiction the move addresses. - **reason.** Why this move matters, in natural language. - **value.** Expected information gain, 0–1. Higher means more uncertainty reduction. - **executor.** Who should act: `agent`, `user`, or `both`. ## Move Types | Action | When it surfaces | |--------|-----------------| | `gather_evidence` | A gap or weakly supported belief needs investigation | | `clarify` | A contradiction exists between beliefs | | `resolve_uncertainty` | A load-bearing belief has insufficient evidence | | `compare_paths` | Multiple valid interpretations need a decision framework | Each action can have a subtype for specificity: | Subtype | Description | |---------|-------------| | `research` | Find external data or sources | | `validate_assumption` | Test whether an assumption holds | | `resolve_contradiction` | Address conflicting beliefs | | `quantify_risk` | Measure exposure on a risk belief | | `design_test` | Propose an experiment to confirm or refute | | `synthesize` | Combine multiple findings into a conclusion | | `reframe` | Restructure the problem based on new information | ## How Moves Are Ranked Moves are ranked by expected information gain: which action would most reduce uncertainty in the beliefs that matter most. A gap with many downstream dependencies generates a higher-value move than a gap with none. A contradiction between two **load-bearing** beliefs — beliefs that other beliefs derive from, so if they're wrong the rest collapse — generates a higher-value clarify move than a contradiction between peripheral claims that nothing depends on. 
The system considers: - How much uncertainty the move would reduce - How many other beliefs depend on the target - Whether the target belief is load-bearing (see [Clarity](/dev/core/clarity) for how the engine identifies them) - The current clarity score and what would improve it most ## Reading Moves Moves are returned from every major SDK method: ```ts // Before the agent acts const context = await beliefs.before(userMessage) console.log(context.moves) // ranked actions for this turn // After the agent acts const delta = await beliefs.after(result.text) console.log(delta.moves) // updated recommendations // Full world state const world = await beliefs.read() console.log(world.moves) // all current recommendations ``` ## Routing on Moves Use moves to direct agent behavior: ```ts const delta = await beliefs.after(result.text) const next = delta.moves[0] if (!next) { // No recommended actions. Clarity is likely high. await finalize(delta.state) } else if (next.action === 'gather_evidence') { await runResearch(next.target) } else if (next.action === 'clarify') { await resolveContradiction(next.target) } else if (next.action === 'resolve_uncertainty') { await deepDive(next.target) } else if (next.action === 'compare_paths') { await presentTradeoffs(next.target) } ``` ## Executor The `executor` field indicates who should act on the move. | Executor | Meaning | |----------|---------| | `agent` | The agent can handle this autonomously | | `user` | This requires human input or judgment | | `both` | The agent can start, but the user needs to weigh in | A `user` executor move might surface when the system detects a value judgment or strategic decision that the agent should not make alone. ## World Model Source: https://thinkn.ai/dev/core/world Summary: The agent's current understanding of reality — beliefs, gaps, conflicts, and recommended next steps. ## What a World Model Is Your agent's world model is its current understanding of the reality it operates in. For a research agent, that's claims about a market. For a coding agent, that's beliefs about a codebase. For an analyst, that's beliefs about a customer or a portfolio. `beliefs.read()` returns this world model in full: the agent's beliefs, the relationships between them (what supports, contradicts, derives), all open gaps, active contradictions, the clarity score, and recommended next moves. ```ts const world = await beliefs.read() world.beliefs // all beliefs the agent holds world.goals // what the agent is pursuing world.gaps // unknowns and open questions world.edges // relationships: supports, contradicts, derives world.contradictions // active conflicts that need resolution world.clarity // 0-1 readiness score world.moves // suggested next actions world.prompt // serialized context for LLM injection ``` ## Edges Beliefs form a coherence graph. Each edge represents a relationship between two beliefs. | Edge type | Meaning | |-----------|---------| | `supports` | Evidence or reasoning that backs a claim | | `contradicts` | Direct conflict between two beliefs | | `supersedes` | A newer belief replaces an older one | | `derived_from` | One belief was inferred from another | | `depends_on` | A conclusion that rests on an assumption | ```ts { type: 'contradicts', source: 'belief_gartner_tam', target: 'belief_sec_tam', confidence: 0.9, } ``` Edges are detected automatically during extraction. When the research agent finds a Gartner report citing $4.2B and an SEC filing showing $3.8B, the system creates a `contradicts` edge between them. 
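A small illustrative sketch of working with the edge list (belief ids and shapes follow the examples above, and a configured client is assumed): join `contradicts` edges back to belief text to see which claims are actually in conflict.

```ts
const world = await beliefs.read()

// Index beliefs by id so edges can be resolved to readable text.
const byId = new Map(world.beliefs.map((b) => [b.id, b] as const))

for (const edge of world.edges) {
  if (edge.type !== 'contradicts') continue
  const a = byId.get(edge.source)
  const b = byId.get(edge.target)
  console.log(`conflict: "${a?.text}" vs "${b?.text}"`)
}
```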
## Contradictions Contradictions are surfaced explicitly in `world.contradictions`. They penalize the [clarity](/dev/core/clarity) score and generate `clarify` [moves](/dev/core/moves). A contradiction between two load-bearing beliefs has more impact on clarity than one between peripheral claims. The system weights contradictions by how many downstream beliefs depend on the conflicting pair. ## The Prompt Field `world.prompt` is a serialized summary of the current belief state, formatted for LLM injection. It includes the most relevant beliefs, open gaps, contradictions, and recommended moves. ```ts const context = await beliefs.before(userMessage) const result = await myAgent.run({ systemPrompt: context.prompt, message: userMessage, }) ``` The prompt is constructed to give the agent awareness of its current understanding without overwhelming the context window. ## Multi-Agent Fusion When multiple agents contribute beliefs, the world state is the fused view: a single picture of reality assembled from every agent's contributions. Each agent contributes via its own `agent` identifier. The backend merges contributions across agents into one shared state. To share one fused world state, every contributing agent must use the same `namespace` (the logical problem space) and `writeScope: 'space'` (so writes land in the shared store rather than per-agent or per-thread). With different namespaces or scopes, each agent maintains a separate world model and cross-agent contradictions won't be detected. See [Scopes](/dev/sdk/scoping) for the full breakdown. ```ts const researcher = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'researcher', namespace: 'market-map', writeScope: 'space', }) const analyst = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'analyst', namespace: 'market-map', writeScope: 'space', }) // Both contribute to the same namespace await researcher.after(researchResult) await analyst.after(analysisResult) // The world state includes beliefs from both, fused by the backend const world = await analyst.read() ``` If you have a single agent, world state is your agent's belief state. Multi-agent fusion activates when different `agent` values write to the same namespace with a shared space scope. Trust weight configuration and conflict resolution strategy selection are handled by the hosted backend. Per-agent views and advanced fusion controls are planned for a future SDK release. --- # Tutorial ## Build a Research Agent Source: https://thinkn.ai/dev/tutorial/research-agent Summary: A 30-minute guided build. Learn the model by writing one. End with a working agent that knows what it does not know. This tutorial teaches the model by building one thing end-to-end: a research agent that investigates a question, accumulates evidence, detects when sources disagree, and stops when it has enough to act. You will not call an LLM during this tutorial. Every "agent output" is a literal string so the tutorial runs deterministically with no API key beyond the Beliefs key. At the end you'll see how to swap in Claude, GPT, or any other model. **You'll learn one concept per section.** Each builds on the previous. Copy the code as you go — the final section assembles everything into one runnable file. - Node 18+ - A Beliefs API key ([request access](/dev/beta) if you don't have one yet) - ~30 minutes ## What you're building By the end, you'll have a research agent that: 1. Takes a research question 2. Reads what it currently believes about the question 3. 
Investigates the highest-priority gap 4. Detects when new evidence contradicts what it already knew 5. Stops when its **clarity score** crosses a threshold 6. Reports what it found — and what it deliberately doesn't know ``` ┌────────────────────────────────────────────────────────────┐ │ Turn 1 clarity 0.18 ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │ │ Turn 2 clarity 0.34 ██████████░░░░░░░░░░░░░░░░░░░░░░ │ │ Turn 3 clarity 0.51 ███████████████░░░░░░░░░░░░░░░░░ │ │ Turn 4 clarity 0.68 ████████████████████░░░░░░░░░░░░ │ │ Turn 5 clarity 0.74 ██████████████████████░░░░░░░░░░ │ → STOP └────────────────────────────────────────────────────────────┘ ``` The clarity score is what makes this different from a turn-counter or a token budget. The agent stops when it has *learned* enough — not when it has *talked* enough. --- ## 1. Setup ### Install ```bash mkdir research-agent && cd research-agent npm init -y npm pkg set type=module npm install beliefs tsx typescript npm install -D @types/node ``` The `npm pkg set type=module` line marks the package as ESM so top-level `await` works in your script. Without it, you'd need to wrap each section in `async function main() { ... }` — a small annoyance, but worth setting up once. ### Configure TypeScript Create `tsconfig.json`: ```json { "compilerOptions": { "target": "ES2022", "module": "ES2022", "moduleResolution": "Bundler", "esModuleInterop": true, "strict": true } } ``` ### Set your key ```bash export BELIEFS_KEY=bel_live_xxx ``` ### First call Create `agent.ts`: ```ts import Beliefs from 'beliefs' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY!, namespace: `research-${Date.now()}`, writeScope: 'space', }) const world = await beliefs.read() console.log('beliefs:', world.beliefs.length) console.log('clarity:', world.clarity) ``` Run it: ```bash npx tsx agent.ts ``` You should see something like: ``` beliefs: 0 clarity: 0.25 ``` **What just happened.** You created a fresh belief state in a unique namespace. It has zero beliefs — clarity defaults to a low baseline because there's nothing to be clear about yet. Every call from here will operate against this same namespace. `namespace: \`research-${Date.now()}\`` makes every tutorial run a fresh slate. In production, pick stable namespaces (one per project, customer, or session). The engine uses LLM extraction, so exact claims, counts, and clarity values vary across runs. Your output will show the same *patterns* (clarity rising turn-over-turn, contradictions detected when sources disagree, gaps closing as evidence comes in) but different *exact* values. That's expected and correct — wherever a tutorial output shows a range or `~`, treat it as illustrative. → Concept: **[World](/dev/core/world)** — the read-only view of everything your agent currently believes. --- ## 2. Tell the agent what to research A research agent needs a goal. Goals are first-class in beliefs — not just a string in your prompt. Add this to `agent.ts`: ```ts await beliefs.add('Determine the size of the AI developer tools market', { type: 'goal', }) const world = await beliefs.read() console.log('goals:', world.goals) ``` Output: ``` goals: [ 'Determine the size of the AI developer tools market' ] ``` **What just happened.** `add()` with `type: 'goal'` registers what the agent is *trying to do*, distinct from claims (what the agent thinks is *true*). The clarity score now considers whether the goal has been resolved. → Concept: **[Intent](/dev/core/intent)** — goals, decisions, and constraints. 
What the agent wants, distinct from what it knows.

---

## 3. Seed what you already know

Often you start an investigation with priors — things you've heard, suspect, or have weak evidence for. Beliefs lets you assert these explicitly with a confidence score.

```ts
await beliefs.add('AI dev tools market is around $4B', {
  confidence: 0.6,
  type: 'assumption',
})

await beliefs.add('GitHub Copilot has the largest market share', {
  confidence: 0.7,
  type: 'assumption',
})

await beliefs.add('Missing breakdown by enterprise vs individual developers', {
  type: 'gap',
})

const state = await beliefs.read()
console.log('beliefs:', state.beliefs.length)
console.log('gaps: ', state.gaps.length)
console.log('clarity:', state.clarity.toFixed(2))
```

Output (illustrative — exact values vary):

```
beliefs: 2
gaps: 1
clarity: 0.30–0.40
```

**What just happened.** Three calls, two different shapes:

- `type: 'assumption'` — a stated belief without supporting evidence yet. Confidence = 0.6 means "I lean this way, but haven't done the work."
- `type: 'gap'` — something the agent has flagged as unknown. Gaps are first-class — they reduce clarity until filled.

Notice clarity went up — but only slightly. That's because **stated confidence is not the same as evidence**. The system tracks both:

- **Decision resolution** — how confident you are in the answer (0.6 here)
- **Knowledge certainty** — how much *evidence* backs the answer (zero here, you just stated it)

A claim stated at 0.95 with zero evidence is in a different epistemic category than a claim at 0.65 with 40 supporting observations. The clarity score reflects this.

→ Concept: **[Beliefs](/dev/core/beliefs)** — what types exist, how confidence works, why the two-channel model matters.

---

## 4. The core loop

Now the loop that defines a beliefs-aware agent: **read context, act, feed observation**. For this tutorial, "act" returns a literal string — what your agent might output if you'd run a real LLM. In Section 9 you'll swap it for a real model.

```ts
async function fakeAgent(_systemPrompt: string, _userMessage: string): Promise<string> {
  // Pretend an LLM ran. Return a realistic agent output.
  return `Based on a Gartner 2024 report, the AI developer tools market is valued at $4.2B. The top three players (GitHub Copilot, Cursor, and Tabnine) account for approximately 65% of the market. Enterprise adoption is currently around 40% of total spend, with individual developers making up the remainder. The market is growing at roughly 25% year over year.`
}

const userMessage = 'Research the AI developer tools market'

// 1. Read what the agent currently believes
const context = await beliefs.before(userMessage)

// 2. Run the agent
const output = await fakeAgent(context.prompt, userMessage)

// 3. Feed the observation
const delta = await beliefs.after(output)

console.log('changes: ', delta.changes.length)
console.log('clarity: ', delta.clarity.toFixed(2))
console.log('readiness:', delta.readiness)
```

Output (illustrative):

```
changes: 4–7
clarity: 0.45–0.55
readiness: medium
```

**What just happened.** Three calls did real work:

- `before(message)` — returned a `BeliefContext` with `prompt` (a serialized summary of the current belief state, ready to inject into a system prompt), plus the agent's beliefs, goals, gaps, clarity, and recommended next moves.
- `fakeAgent(...)` — produced output. In production this is your LLM call; the system prompt is `context.prompt`.
- `after(output)` — extracted beliefs from the agent's text, detected if any conflicted with what was already there, and updated the world state. `delta.changes` is the list of what changed; `delta.readiness` is a coarse `'low' | 'medium' | 'high'` label derived from `clarity`.

Clarity jumped because the agent now has evidenced claims (Gartner cited as a source) instead of bare assumptions.

→ Concept: **[The loop](/dev/sdk/patterns)** — patterns for single-turn, multi-turn, streaming, and tool-aware agent loops.

---

## 5. Watching clarity rise

A research agent should run more than one turn. Let's loop until clarity is high enough.

```ts
async function fakeAgentTurn2(_prompt: string, _focus: string): Promise<string> {
  return `Looking deeper into enterprise adoption: among Fortune 500 companies, 72% have at least piloted an AI coding assistant, but only 31% have rolled it out company-wide. The biggest blockers cited are security review (mentioned by 58% of CIOs surveyed), licensing complexity (44%), and uncertainty about ROI (37%). Adoption is highest in technology and financial services, lowest in healthcare and government.`
}

async function fakeAgentTurn3(_prompt: string, _focus: string): Promise<string> {
  return `On individual developer adoption: of approximately 28 million professional developers worldwide, 9.2 million have used an AI coding assistant at least monthly in 2024 — about 33% penetration. Among those, 4.1 million pay personally for a tool (the rest use free tiers or employer-provided licenses). Average individual spend is ~$15/month across paid users.`
}

const turns = [fakeAgentTurn2, fakeAgentTurn3]

for (let i = 0; i < turns.length; i++) {
  const ctx = await beliefs.before(userMessage)

  // If clarity is already high, stop early
  if (ctx.clarity > 0.7) {
    console.log(`\nclarity ${ctx.clarity.toFixed(2)} — stopping`)
    break
  }

  // Use the highest-value move as the focus for this turn
  const focus = ctx.moves[0]?.target ?? userMessage
  const output = await turns[i](ctx.prompt, focus)
  const d = await beliefs.after(output)

  console.log(
    `turn ${i + 2}: clarity ${d.clarity.toFixed(2)}, ` +
      `+${d.changes.length} changes, readiness ${d.readiness}`,
  )
}
```

Output (illustrative — typically reaches `'high'` readiness within a few turns):

```
turn 2: clarity 0.55–0.65, +N changes, readiness medium
turn 3: clarity 0.70–0.80, +M changes, readiness high
```

**What just happened.** The loop reads `clarity` and stops when it's high enough. Each turn:

- Uses `ctx.moves[0].target` as the focus — the engine suggests the highest-value gap to investigate next.
- Calls the "agent" with `ctx.prompt` — a serialized summary of current state, so the agent acts with awareness of what's already known.
- Feeds the output back via `after()` — extraction, conflict detection, and clarity recompute happen automatically.

After a few turns, clarity typically crosses your `'high'` threshold. The agent has enough to act.

→ Concept: **[Clarity](/dev/core/clarity)** — what the score actually measures, the two-channel model, and how to use it for routing decisions.
→ Concept: **[Moves](/dev/core/moves)** — how the engine ranks next-best actions by expected information gain.

---

## 6. Contradictions

What happens if a tool returns evidence that disagrees with what the agent already believes?

```ts
const conflictingTool = `Tool result from market_research_db: { "source": "IDC Q4 2024 AI DevTools Tracker", "finding": "Global AI developer tools market is $6.8B, not $4.2B as earlier estimates suggested.
The discrepancy is because earlier figures excluded embedded AI features in mainstream IDEs (VS Code Copilot, JetBrains AI Assistant). When those are included, the market is 60% larger than Gartner's narrower scope.", "methodology": "Bottom-up survey of 2,400 enterprises across 18 countries" }` const delta = await beliefs.after(conflictingTool, { tool: 'market_research_db', source: 'IDC Q4 2024 Tracker', }) const world = await beliefs.read() console.log('contradictions:', world.contradictions.length) for (const c of world.contradictions) { console.log(' -', c) } ``` Output (the contradiction summary string is engine-formatted and will vary): ``` contradictions: 1 - ``` **What just happened.** The engine extracted a new belief from the tool result ("market is $6.8B") and recognized it directly conflicts with an existing belief ("market is around $4B"). Both are kept — nothing is silently overwritten. The contradiction surfaces in `world.contradictions` (a `string[]`) and reduces the clarity score until you resolve it. The two beliefs aren't equally weighted, though. The new claim has: - A tool source (`market_research_db`) tagged via `{ tool, source }` - A concrete methodology cited in the text The original was `type: 'assumption'` with no evidence. When the system fuses them, the evidenced claim dominates — but the original is preserved in the trace so you can see how the agent's view shifted. → Concept: **[World](/dev/core/world)** — how `world.contradictions` and `world.edges` surface conflicts and supersedence. --- ## 7. Resolving — and following moves Now use the engine's recommended next move to direct what to investigate. ```ts const ctx = await beliefs.before(userMessage) console.log('top 3 moves the engine suggests:') for (const m of ctx.moves.slice(0, 3)) { console.log(` - [${m.action}] ${m.target}`) console.log(` reason: ${m.reason}`) } // Investigate the top move const topMove = ctx.moves[0] if (topMove) { const investigation = `Resolving the market-size question: I cross-checked the IDC figure against Forrester and McKinsey reports. Forrester pegs the "AI-augmented dev tools" market at $7.1B for 2024 — closer to IDC than Gartner. The discrepancy is methodology: Gartner's $4.2B excludes embedded AI features in IDEs, while IDC and Forrester include them. The $6.8-7.1B range is the broader market; $4.2B is the narrow "AI-native" tools market.` await beliefs.after(investigation, { source: 'Forrester + McKinsey cross-check' }) } const final = await beliefs.read() console.log('\nfinal clarity:', final.clarity.toFixed(2)) console.log('contradictions:', final.contradictions.length) ``` Output (illustrative — move actions, targets, and reasons are engine-generated): ``` top 3 moves the engine suggests: - [] reason: - ... final clarity: 0.75–0.85 contradictions: 0 ``` Common `action` values: `clarify`, `gather_evidence`, `resolve_uncertainty`, `compare_paths`, `validate`. `target` is the specific claim or gap to act on (for example, `"market size by region"` or `"enterprise vs individual split"`); `reason` is the engine's plain-English explanation of why the move is high-value right now. **What just happened.** `ctx.moves` is a ranked list of recommended next actions. The engine derived these from the current state — it knows which gaps are open, which beliefs are weakly evidenced, and which contradictions need clarifying. You don't have to plan the next step yourself; you can just route on `moves[0]`. 
After feeding the cross-check, the engine sees the methodology distinction, supersedes the old "around $4B" assumption, and the contradiction typically resolves. Clarity climbs. → Concept: **[Moves](/dev/core/moves)** — Q-value ranking, executor types, and how to use moves for autonomous routing. --- ## 8. Trace — what changed and why Every transition is recorded. Look at the audit trail. ```ts const entries = await beliefs.trace() console.log(`total transitions: ${entries.length}\n`) console.log('most recent 5:') for (const e of entries.slice(0, 5)) { const conf = e.confidence ? ` (${e.confidence.before?.toFixed(2) ?? '?'} → ${e.confidence.after?.toFixed(2) ?? '?'})` : '' console.log(` - ${e.action}${conf} | ${e.reason ?? '—'}`) } ``` Output (abbreviated; specific reasons and confidence shifts will vary): ``` total transitions: most recent 5: - updated (0.X → 0.Y) | - resolved | - created | - ... ``` Each `TraceEntry` carries `action` (`'created' | 'updated' | 'removed' | 'resolved'`), optional `beliefId`, optional `confidence` shift `{ before, after }`, optional `agent`, optional `source`, `timestamp`, and optional `reason`. **What just happened.** Every belief mutation — created, updated, removed, resolved — landed in the ledger with the reason and the confidence shift. You can replay the agent's reasoning at any point. In production this is what you show on a "why did the agent decide X?" debug page. → Concept: **[Ledger](/dev/internals/how-it-works)** — what's recorded, replay semantics, and how to query the trail. --- ## 9. The complete agent Here's everything assembled into one file. Save as `agent.ts` and run. ```ts import Beliefs from 'beliefs' // ─── Setup ───────────────────────────────────────────────────────── const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY!, namespace: `research-${Date.now()}`, writeScope: 'space', }) const userMessage = 'Research the AI developer tools market' // ─── Stub agent ──────────────────────────────────────────────────── // Each turn returns a realistic agent output. In production, replace // the body of `runAgent` with a call to your LLM (see Section 10). const turnOutputs = [ `Based on a Gartner 2024 report, the AI developer tools market is valued at $4.2B. The top three players (GitHub Copilot, Cursor, and Tabnine) account for approximately 65% of the market. Enterprise adoption is currently around 40% of total spend, with individual developers making up the remainder. Growing at roughly 25% YoY.`, `Looking deeper into enterprise adoption: among Fortune 500 companies, 72% have at least piloted an AI coding assistant, but only 31% have rolled it out company-wide. Biggest blockers: security review (58%), licensing complexity (44%), and uncertainty about ROI (37%). Highest in technology and financial services, lowest in healthcare and government.`, `On individual developer adoption: of approximately 28 million professional developers worldwide, 9.2 million have used an AI coding assistant at least monthly in 2024 — about 33% penetration. Of those, 4.1 million pay personally for a tool. Average individual spend is ~$15/month across paid users.`, ] const conflictingTool = `Tool result from market_research_db: { "source": "IDC Q4 2024 AI DevTools Tracker", "finding": "Global AI developer tools market is $6.8B, not $4.2B. Earlier estimates excluded embedded AI features in mainstream IDEs.", "methodology": "Bottom-up survey of 2,400 enterprises across 18 countries" }` const reconciliation = `Cross-checked IDC against Forrester and McKinsey. 
Forrester: $7.1B for 2024, closer to IDC. The discrepancy is methodology — Gartner's $4.2B excludes embedded IDE features; IDC and Forrester include them. The $6.8-7.1B range is the broader market; $4.2B is the narrow "AI-native" tools market.`

async function runAgent(_systemPrompt: string, turn: number): Promise<string> {
  return turnOutputs[turn] ?? ''
}

// ─── Goal + priors ─────────────────────────────────────────────────

await beliefs.add(userMessage, { type: 'goal' })

await beliefs.add('AI dev tools market is around $4B', {
  confidence: 0.6,
  type: 'assumption',
})

await beliefs.add('GitHub Copilot has the largest market share', {
  confidence: 0.7,
  type: 'assumption',
})

await beliefs.add('Missing breakdown by enterprise vs individual developers', {
  type: 'gap',
})

// ─── Research loop ───────────────────────────────────────────────

const TARGET_CLARITY = 0.7
const MAX_TURNS = 5

for (let turn = 0; turn < MAX_TURNS; turn++) {
  const ctx = await beliefs.before(userMessage)

  console.log(
    `turn ${turn + 1}: clarity ${ctx.clarity.toFixed(2)}, ` +
      `${ctx.beliefs.length} beliefs, ${ctx.gaps.length} gaps`,
  )

  if (ctx.clarity >= TARGET_CLARITY) {
    console.log(` → clarity hit ${TARGET_CLARITY}, stopping`)
    break
  }

  const output = await runAgent(ctx.prompt, turn)
  if (!output) break

  await beliefs.after(output)
}

// ─── Conflicting evidence + reconciliation ───────────────────────

console.log('\nfeeding conflicting tool result...')
await beliefs.after(conflictingTool, {
  tool: 'market_research_db',
  source: 'IDC Q4 2024 Tracker',
})

const afterConflict = await beliefs.read()
console.log(` contradictions: ${afterConflict.contradictions.length}`)

console.log('\ncross-checking and reconciling...')
await beliefs.after(reconciliation, { source: 'Forrester + McKinsey' })

// ─── Report ──────────────────────────────────────────────────────

const final = await beliefs.read()

console.log('\n── Final state ──')
console.log(`clarity: ${final.clarity.toFixed(2)}`)
console.log(`beliefs: ${final.beliefs.length}`)
console.log(`gaps remaining: ${final.gaps.length}`)
console.log(`contradictions: ${final.contradictions.length}`)

console.log('\n── Top beliefs ──')
const top = [...final.beliefs]
  .sort((a, b) => b.confidence - a.confidence)
  .slice(0, 5)
for (const b of top) {
  console.log(` [${b.confidence.toFixed(2)}] ${b.text}`)
}

console.log('\n── What we still don\'t know ──')
for (const gap of final.gaps) console.log(` - ${gap}`)
```

Run it:

```bash
npx tsx agent.ts
```

Expected output (illustrative — specific numbers and extracted texts vary across runs):

```
turn 1: clarity 0.25–0.35, 2 beliefs, 1 gaps
turn 2: clarity 0.45–0.55, 5–7 beliefs, 1 gaps
turn 3: clarity 0.60–0.70, 8–10 beliefs, 1 gaps
turn 4: clarity 0.70–0.80, 11–13 beliefs, 0 gaps
 → clarity hit 0.7, stopping
(or: loop exits when stub outputs are exhausted)

feeding conflicting tool result...
 contradictions: 1

cross-checking and reconciling...

── Final state ──
clarity: 0.75–0.85
beliefs: 13–15
gaps remaining: 0
contradictions: 0

── Top beliefs ──
 [0.85–0.95]
 [0.80–0.90]
 [0.75–0.85]
 ...

── What we still don't know ──
(empty when the agent has filled its declared gaps)
```

That's a complete agent. It investigated, recognized when its assumptions were wrong, reconciled competing sources, and stopped when it had enough to act — all without you writing any tracking code.

---

## 10. Swap in a real LLM

Replace `runAgent` with a call to your model of choice. The rest of the file stays identical.
### Anthropic

```ts
import Anthropic from '@anthropic-ai/sdk'

const client = new Anthropic()

async function runAgent(systemPrompt: string, turn: number): Promise<string> {
  const focus = turn === 0
    ? userMessage
    : `Investigate further: ${(await beliefs.before(userMessage)).moves[0]?.target ?? userMessage}`

  const msg = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    system: systemPrompt,
    messages: [{ role: 'user', content: focus }],
  })

  return msg.content
    .filter((b): b is { type: 'text'; text: string } => b.type === 'text')
    .map((b) => b.text)
    .join('')
}
```

### Vercel AI SDK

```ts
import { generateText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'

async function runAgent(systemPrompt: string, _turn: number): Promise<string> {
  const { text } = await generateText({
    model: anthropic('claude-sonnet-4-20250514'),
    system: systemPrompt,
    prompt: userMessage,
  })
  return text
}
```

### OpenAI

```ts
import OpenAI from 'openai'

const openai = new OpenAI()

async function runAgent(systemPrompt: string, _turn: number): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userMessage },
    ],
  })
  return completion.choices[0]?.message?.content ?? ''
}
```

The belief layer is provider-agnostic. Anything that takes a system prompt and returns text plugs in here.

→ Reference: **[Hack Guide](/dev/tutorial/hack-guide)** — full integration patterns for every major framework.

---

## What you learned

You touched eight concepts in one build:

| Concept | Where it appeared | Reference |
|---|---|---|
| World state | `beliefs.read()` returning the full picture | [World](/dev/core/world) |
| Goals | `add(text, { type: 'goal' })` | [Intent](/dev/core/intent) |
| Beliefs + types | `add` with `type: 'assumption'`, `'gap'` | [Beliefs](/dev/core/beliefs) |
| The loop | `before → act → after` | [Loop Patterns](/dev/sdk/patterns) |
| Clarity | Stopping condition | [Clarity](/dev/core/clarity) |
| Moves | Ranked next actions | [Moves](/dev/core/moves) |
| Contradictions | Auto-detected via `after()` | [World](/dev/core/world) |
| Trace | Audit trail of every transition | [Ledger](/dev/internals/how-it-works) |

## Where to go next

You now have the model. The rest of the docs are reference for the parts you haven't needed yet.

## Hack Guide

Source: https://thinkn.ai/dev/tutorial/hack-guide
Summary: Zero to building with beliefs in 10 minutes. Everything you need for the hackathon.

## Get Your Key

1. Sign in at [thinkn.ai](https://thinkn.ai)
2. Go to [Profile > API Keys](/profile/api-keys)
3. Click **Create Key**, copy the `bel_live_...` value

```bash
export BELIEFS_KEY=bel_live_...
```

## Install

```bash
npm i beliefs
```

Verify the connection:

```bash
node -e "import('beliefs').then(async ({default: B}) => { const b = new B({ apiKey: process.env.BELIEFS_KEY, namespace: 'hack-guide', writeScope: 'space' }); const s = await b.read(); console.log('beliefs:', s.beliefs.length, 'clarity:', s.clarity) })"
```

You should see `beliefs: 0 clarity: 0.25` — an empty belief state with baseline clarity, ready to go.

These examples use `writeScope: 'space'` so they run immediately. For chat apps, keep the SDK default `writeScope: 'thread'` and bind a thread with `thread` or `beliefs.withThread(threadId)`.

Give your agent the SDK reference so it can write correct code on the first try: `https://thinkn.ai/llms.txt`

## The Pattern

Every agent turn follows three steps:

```ts
// 1.
What does the agent believe right now? const context = await beliefs.before(userMessage) // 2. Run your agent with belief context injected const result = await myAgent.run({ system: context.prompt }) // 3. Feed the output — beliefs extracted automatically const delta = await beliefs.after(result.text) ``` That is the entire integration. `before()` gives your agent context about what's already known. `after()` feeds the result back, so the world model learns from the turn — claims extracted, conflicts detected, confidence updated, next moves recomputed. ``` ┌─────────┐ ┌───────────┐ ┌─────────┐ │ before()│────▶│ your agent │────▶│ after() │ │ beliefs │ │ runs here │ │ extract │ │ + moves │ │ │ │ + fuse │ └─────────┘ └───────────┘ └────┬────┘ ▲ │ └──────────── next turn ───────────┘ ``` ### What comes back `before()` gives you a `BeliefContext`: - `prompt` — inject this into your agent's system prompt - `beliefs` — current claims with confidence scores - `gaps` — what the agent doesn't know yet - `clarity` — 0-1 readiness score (higher = more confident) - `moves` — ranked next actions by expected information gain `after()` gives you a `BeliefDelta`: - `changes` — what was created, updated, or removed - `clarity` — updated readiness score - `readiness` — `'low'`, `'medium'`, or `'high'` - `moves` — updated next actions - `state` — full world state after this turn --- ## Framework Recipes ### Vercel AI SDK Best for: streaming, multi-provider model swapping, and TypeScript-first ergonomics. ```ts import { generateText } from 'ai' import { anthropic } from '@ai-sdk/anthropic' import Beliefs from 'beliefs' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'vercel-ai-hack', writeScope: 'space', }) async function research(question: string) { const context = await beliefs.before(question) const { text } = await generateText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, prompt: question, }) const delta = await beliefs.after(text) console.log(`clarity: ${delta.clarity}, changes: ${delta.changes.length}`) return text } ``` With streaming: ```ts import { streamText } from 'ai' const context = await beliefs.before(question) const result = streamText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, prompt: question, }) let fullText = '' for await (const chunk of result.textStream) { process.stdout.write(chunk) fullText += chunk } await beliefs.after(fullText) ``` Call `after()` exactly once per turn, after the stream completes. Do not call it on partial chunks — each call triggers extraction and fusion. Calling per-chunk creates duplicate beliefs from incomplete text. ### Anthropic SDK Best for: direct control over Claude features (tool use, vision, extended thinking), minimal dependencies. ```ts import Anthropic from '@anthropic-ai/sdk' import Beliefs from 'beliefs' const client = new Anthropic() const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'anthropic-hack', writeScope: 'space', }) async function research(question: string) { const context = await beliefs.before(question) const message = await client.messages.create({ model: 'claude-sonnet-4-20250514', max_tokens: 4096, system: context.prompt, messages: [{ role: 'user', content: question }], }) const text = message.content .filter(b => b.type === 'text') .map(b => b.text) .join('') const delta = await beliefs.after(text) return { text, delta } } ``` ### OpenAI SDK Best for: GPT/o-series models, Responses API workflows, OpenAI-native ecosystems. 
```ts
import OpenAI from 'openai'
import Beliefs from 'beliefs'

const openai = new OpenAI()
const beliefs = new Beliefs({
  apiKey: process.env.BELIEFS_KEY,
  namespace: 'openai-hack',
  writeScope: 'space',
})

async function research(question: string) {
  const context = await beliefs.before(question)

  const completion = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: context.prompt },
      { role: 'user', content: question },
    ],
  })

  const text = completion.choices[0]?.message?.content ?? ''
  const delta = await beliefs.after(text)
  return { text, delta }
}
```

> If using an o-series model (o3, o4-mini), change `role: 'system'` to `role: 'developer'`.

### Any LLM / Plain Fetch

Best for: serverless/edge runtimes, custom or self-hosted models, anywhere you don't want a vendor SDK in the dependency tree.

The SDK works with anything that produces text. Call `before`, pass `context.prompt` to your model, call `after` with the output.

```ts
import Beliefs from 'beliefs'

const beliefs = new Beliefs({
  apiKey: process.env.BELIEFS_KEY,
  namespace: 'plain-fetch-hack',
  writeScope: 'space',
})

async function withBeliefs(input: string, runAgent: (prompt: string) => Promise<string>) {
  const context = await beliefs.before(input)
  const output = await runAgent(context.prompt + '\n\nUser: ' + input)
  const delta = await beliefs.after(output)
  return { output, delta }
}
```

---

## Project Ideas

The snippets below use `callLLM(systemPrompt, userMessage)` and `searchWeb(query)` as stand-ins for your model and search tool of choice. Plug in any of the four framework recipes above (Vercel AI, Anthropic, OpenAI, plain fetch) wherever you see `callLLM(...)`, and any search API for `searchWeb(...)`. The point of these examples is the belief flow, not the LLM wiring.

### Research Agent (Beginner)

An agent that researches a topic and tracks what it knows, what conflicts, and what's missing. Use `clarity` to decide when to stop researching and summarize.

```ts
const beliefs = new Beliefs({
  apiKey: process.env.BELIEFS_KEY,
  namespace: 'research-agent',
  writeScope: 'space',
})

async function deepResearch(topic: string) {
  await beliefs.add(`Research: ${topic}`, { type: 'goal' })

  for (let turn = 0; turn < 5; turn++) {
    const context = await beliefs.before(topic)
    if (context.clarity > 0.7) break

    const result = await callLLM(context.prompt, topic)
    const delta = await beliefs.after(result)

    console.log(`Turn ${turn + 1}: clarity ${delta.clarity.toFixed(2)}, ` +
      `${delta.changes.length} new beliefs`)
  }

  const world = await beliefs.read()
  return { beliefs: world.beliefs, gaps: world.gaps, clarity: world.clarity }
}

deepResearch('AI developer tools market').then(console.log)
```

### Multi-Agent Debate (Intermediate)

Two agents with different perspectives contribute to the same namespace. The belief system detects contradictions and tracks which claims survive.

```ts
const optimist = new Beliefs({ apiKey, agent: 'optimist', namespace: 'debate', writeScope: 'space' })
const skeptic = new Beliefs({ apiKey, agent: 'skeptic', namespace: 'debate', writeScope: 'space' })

const bullCase = await callLLM('Make the bull case for AI startups in 2026')
await optimist.after(bullCase)

const bearCase = await callLLM('Make the bear case for AI startups in 2026')
await skeptic.after(bearCase)

const world = await optimist.read()
console.log(`Contradictions: ${world.contradictions.length}`)
console.log(`Beliefs: ${world.beliefs.length}`)
```

### Fact Checker (Intermediate)

Verify claims by gathering evidence.
Watch confidence shift as supporting and refuting evidence arrives. ```ts const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'fact-check', writeScope: 'space', }) async function checkClaim(claim: string) { await beliefs.add(claim, { confidence: 0.5 }) const sources = await searchWeb(claim) for (const source of sources) { await beliefs.after(source.text, { tool: 'web_search' }) } const page = await beliefs.list({ query: claim }) return page.beliefs.map(b => ({ text: b.text, confidence: b.confidence })) } checkClaim('Global AI market is worth $200B by 2030').then(console.log) ``` ### Decision Support (Advanced) Use `moves` and `clarity` to build a system that tells you when you have enough information to make a decision, and what you should investigate next. ```ts const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'decision-support', writeScope: 'space', }) async function decisionLoop(question: string) { await beliefs.add(question, { type: 'goal' }) while (true) { const context = await beliefs.before(question) if (context.clarity > 0.8) { return { recommendation: context.beliefs, confidence: context.clarity } } const nextMove = context.moves[0] if (!nextMove) break console.log(`Investigating: ${nextMove.target} (value: ${nextMove.value})`) const result = await callLLM( `${context.prompt}\n\nInvestigate: ${nextMove.target}` ) await beliefs.after(result) } } decisionLoop('Should we enter the European market?').then(console.log) ``` --- ## API Cheatsheet | Method | What it does | Returns | |--------|-------------|---------| | `before(input?)` | Get current beliefs + next moves | `BeliefContext` | | `after(text, { tool? })` | Feed agent output, extract beliefs | `BeliefDelta` | | `add(text, opts?)` | Assert a belief, goal, or gap | `BeliefDelta` | | `add([...items])` | Assert multiple in one request | `BeliefDelta` | | `resolve(text)` | Mark a gap as resolved (exact text match) | `BeliefDelta` | | `retract(id, reason?)` | Retract a belief (stays in graph) | `BeliefDelta` | | `remove(id)` | Delete a belief entirely | `BeliefDelta` | | `reset()` | Clear all state in this scope | `{ removed }` | | `read()` | Full world state with clarity + moves | `WorldState` | | `snapshot()` | Lightweight state without clarity/moves | `BeliefSnapshot` | | `list({ query, filter, limit })` | Paged search by keyword + filters | `BeliefList` | | `trace(beliefId?)` | Audit trail of belief changes | `TraceEntry[]` | Belief types: `claim`, `assumption`, `evidence`, `risk`, `gap`, `goal` --- ## Troubleshooting **`BetaAccessError: beliefs is in private beta…`** Either your API key is missing from the environment, or it's not on the beta allowlist. Check that `BELIEFS_KEY` is exported in the shell you're running from, then verify the key at [Profile > API Keys](/profile/api-keys). New keys start with `bel_live_`. **`resolve()` didn't remove my gap** `resolve(text)` matches gap text exactly. Pass the same string you originally added, or call `read()` and copy the gap text from `state.gaps`. **HTTP 429 — Rate limit exceeded** The API allows 60 requests/minute per key. Add a small delay between calls in loops, or batch your work into fewer turns. **Empty beliefs after `after()`** The text you passed might be too short or not contain extractable claims. Try passing a longer, more substantive output. The extraction works best with paragraphs of analysis, not single sentences. **`before()` returns empty state** This is expected on a fresh namespace. 
Beliefs accumulate as you call `after()` and `add()`. The first `before()` will always have zero beliefs. **Different agents not seeing each other's beliefs** Make sure both agents use the same `namespace` and a shared write scope. The `agent` parameter identifies who contributed, but `namespace` plus `writeScope: 'space'` determine the shared state. ```ts const a = new Beliefs({ apiKey, agent: 'agent-a', namespace: 'shared', writeScope: 'space' }) const b = new Beliefs({ apiKey, agent: 'agent-b', namespace: 'shared', writeScope: 'space' }) ``` **Need to see what the SDK is doing?** Enable debug mode to log every request and response: ```ts const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'debug-example', writeScope: 'space', debug: true, }) ``` --- # Reference ## Core API Source: https://thinkn.ai/dev/sdk/core-api Summary: The beliefs API — what each method does and what comes back. The SDK is a thin client over the belief engine. The core loop is `before`, `after`, and either `add` or `resolve` for explicit edits — that covers most use cases. Everything else (streaming, multi-agent, trust overrides, forecasting) is convenience for specific workflows. The SDK does not modify your agent, decide what it does, or sit in the critical path of your LLM calls. It observes, extracts, and surfaces. When you call `before` and `after`, the engine handles extraction, linking (supports/contradicts/derives), deduplication, fusion across multiple agents, and provenance recording — and surfaces decision aids you can route on (`clarity`, `moves`). ``` Your Agent Loop │ ├── beliefs.before() ←── get context + moves │ ├── agent.run() ←── your agent, unchanged │ └── beliefs.after() ←── feed observation → extract → fuse ``` ## Setup ```ts import Beliefs from 'beliefs' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'project-alpha', writeScope: 'space', }) ``` That's it. The only required option is `apiKey`. For multi-agent systems, add `agent` and `namespace`: ```ts const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'research-agent', namespace: 'project-alpha', writeScope: 'space', }) ``` | Option | Default | What it does | |--------|---------|-------------| | `apiKey` | — | **Required.** Your API key. | | `agent` | `'agent'` | Who is contributing beliefs. Use different names for different agents sharing a namespace. | | `namespace` | `'default'` | Developer-facing isolation boundary. Each namespace maps to its own backing workspace. | | `thread` | — | Bind a thread for `writeScope: 'thread'`. Use this for per-conversation or per-task memory. | | `writeScope` | `'thread'` | Which layer is authoritative: `'thread'`, `'agent'`, or `'space'`. See [Scoping](/dev/sdk/scoping). | | `contextLayers` | Depends on `writeScope` | Which layers `before()` and `read()` merge. Thread defaults to `['self', 'agent', 'space']`. | | `baseUrl` | `'https://www.thinkn.ai'` | Override the API origin for local or self-hosted environments. | | `timeout` | `120000` | Request timeout in ms. | | `maxRetries` | `2` | Auto-retries on 429/5xx with exponential backoff. | | `debug` | `false` | Logs every request and response to console. | For copy-paste examples, `writeScope: 'space'` is the simplest setup. The SDK default is `writeScope: 'thread'`, which is ideal for chat and session memory but requires a bound thread. 
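For chat-style apps, the thread scope can be bound at construction time. A minimal sketch, assuming `conversationId` is an id supplied by your own application rather than the SDK:

```ts
import Beliefs from 'beliefs'

// Thread-scoped setup: each conversation gets its own authoritative state.
// `conversationId` is a hypothetical id from your chat application.
const beliefs = new Beliefs({
  apiKey: process.env.BELIEFS_KEY,
  namespace: 'support',
  writeScope: 'thread',
  thread: conversationId,
})
```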
### `beliefs.withThread(threadId)` Bind a thread later while preserving the rest of the client config: ```ts const baseBeliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'support', writeScope: 'thread', }) const beliefs = baseBeliefs.withThread(conversationId) ``` --- ## `beliefs.before(input?)` Get the agent's current understanding before it acts. ```ts const context = await beliefs.before('Research the AI tools market') ``` Returns: ```json { "prompt": "{\"state\":{\"goals\":[\"Determine total addressable market\"],\"claims\":[{\"text\":\"AI tools market is valued at $4.2B\",\"confidence\":0.85},{\"text\":\"GitHub Copilot has dominant market share\",\"confidence\":0.85}],\"phase\":\"researching\",\"uncertainty\":0.58},\"gaps\":[\"Missing APAC market data\"],\"contradictions\":[]}", "beliefs": [ { "id": "xK9mR2vL3pT4nW8q", "text": "AI tools market is valued at $4.2B", "confidence": 0.80, "type": "claim" }, { "id": "pT4nW8qJ5mR7vL2x", "text": "GitHub Copilot has dominant market share", "confidence": 0.85, "type": "claim" } ], "goals": ["Determine total addressable market"], "gaps": ["Missing APAC market data"], "clarity": 0.42, "moves": [ { "action": "research", "target": "xK9mR2vL3pT4nW8q", "reason": "Market size has one source — verify with a second", "value": 0.7 } ] } ``` **What you use:** Inject `context.prompt` into your agent's system prompt. Check `context.clarity` to decide whether to keep investigating or act. Follow `context.moves[0]` for the highest-value next step. `context.prompt` is a JSON-serialized belief brief. Drop it directly into your system prompt — don't `JSON.parse` it, don't try to template fields out of it. The string itself is the artifact your agent reads. --- ## `beliefs.after(text, options?)` Feed the agent's output. Beliefs are extracted, conflicts detected, and the world state updated automatically. ```ts const delta = await beliefs.after(agentOutput) ``` For tool results, tag the tool name so the system knows the source: ```ts await beliefs.after(JSON.stringify(searchResults), { tool: 'web_search' }) ``` You can also label the source explicitly — this gets stored on each extracted belief and appears in the trace: ```ts await beliefs.after(agentOutput, { source: 'quarterly-earnings-call' }) ``` Returns: ```json { "changes": [ { "action": "created", "beliefId": "hV7bQ3kN9yU6wE4r", "text": "European market is 28% of global revenue" }, { "action": "updated", "beliefId": "xK9mR2vL3pT4nW8q", "text": "AI tools market is valued at $4.2B", "confidence": { "before": 0.80, "after": 0.75 }, "reason": "New regional data suggests original estimate may exclude segments" } ], "clarity": 0.58, "readiness": "medium", "moves": [ { "action": "gather_evidence", "target": "xK9mR2vL3pT4nW8q", "reason": "Market size estimate weakened — need authoritative source", "value": 0.8 } ], "state": { "beliefs": ["..."], "goals": ["..."], "gaps": ["..."], "clarity": 0.58, "..." : "..." } } ``` **What you use:** Check `delta.readiness` to route — `'high'` means act, `'low'` means keep investigating. `delta.changes` tells you exactly what the system learned. `delta.state` is the full world state if you need it. --- ## `beliefs.observe(envelope)` Structured-input primitive for non-agent surfaces — UI events, document edits, manual ingest, hooks, integration webhooks. 
Where `before()` / `after()` model an agent loop, `observe()` carries the engine's full provenance vocabulary (`surface`, `kind`, `tags`, `agentId`, `actor`) directly so non-agent consumers don't smuggle metadata through a `source` string. ```ts await beliefs.observe({ content: 'User dragged "Pricing" onto the Decisions frame.', surface: 'canvas', kind: 'block_moved', actor: 'user', tags: [`block:${blockId}`, `frame:${frameId}`], }) ``` Envelope fields: `content` (required, non-empty), `actor` (default `'assistant'`), `surface` (e.g. `'chat'`, `'canvas'`, `'document'`, `'tool'`, `'integration'`), `kind` (sub-event tag like `'block_created'`, `'doc_written'`), `tags` (free-form provenance tags), `agentId`, `depth`, `signalFocused`. Returns: ```json { "success": true, "applied": true, "extractionStatus": "ok", "beliefsExtracted": 3, "edgesCreated": 1, "contradictionsDetected": 0, "gapsResolved": 0 } ``` `extractionStatus` is `'ok'`, `'empty'` (ran but produced no beliefs), or `'error'` (with `extractionError` set). `applied` is `true` only when a non-empty delta hit the world state. --- ## `beliefs.add(text, options?)` Assert something the agent knows. Use this to seed beliefs, set goals, or flag gaps. ```ts await beliefs.add('The market is $4.2B', { confidence: 0.85, source: 'IDC Q4 2025 Tracker', }) await beliefs.add('Determine total addressable market', { type: 'goal' }) await beliefs.add('Missing APAC market data', { type: 'gap' }) ``` Options: `confidence` (0–1, default 0.5), `type` (`'claim'`, `'assumption'`, `'evidence'`, `'risk'`, `'gap'`, `'goal'`), `source` (where this belief came from — document name, URL, tool, etc.), `evidence` (source text), `supersedes` (text of a belief this replaces), `mode` (see below). ### `mode` option Three ingest paths are available; the default behavior is unchanged. | `mode` | Engine route | When to use | |--------|-------------|-------------| | omitted | `/ingest` | Default. The engine dispatches based on body shape. | | `'claims'` | `/ingest/claims` | Deterministic hot path. No LLM invocation; faster when you already know the structured shape. | | `'output'` | `/ingest/output` | LLM-gated extraction from raw text. Equivalent to `after(text)`, exposed on `add()` for symmetry. | ```ts // Default — backwards-compatible, same behavior as v0.6.0: await beliefs.add('Market is $4.2B', { confidence: 0.85 }) // Explicit deterministic path (fewer LLM calls): await beliefs.add('Market is $4.2B', { mode: 'claims', confidence: 0.85 }) // Extract beliefs from raw output: await beliefs.add(longTranscript, { mode: 'output', source: 'agent-trace' }) ``` `mode: 'output'` only works with the single-text overload; passing it to `add(items[], { mode: 'output' })` throws `TypeError`. For batch ingestion of structured items, use `mode: 'claims'` or omit `mode`. Returns `BeliefDelta` — same shape as `after()`. --- ## `beliefs.add(items)` Assert multiple items in a single request. All items are processed as one atomic delta. ```ts await beliefs.add([ { text: 'Market is $4.2B', confidence: 0.8, source: 'IDC Q4 2025 Tracker' }, { text: 'Missing APAC data', type: 'gap' }, { text: 'Determine TAM', type: 'goal' }, ]) ``` Returns `BeliefDelta` — same shape as `after()`. --- ## `beliefs.resolve(text)` Mark a gap as resolved. ```ts const delta = await beliefs.resolve('Missing APAC market data') ``` Returns `BeliefDelta`. --- ## `beliefs.retract(beliefId, reason?)` Retract a belief. The belief stays in the graph with `lifecycle: 'retracted'` so the audit trail is preserved. 
Use this when the agent no longer believes something. ```ts await beliefs.retract('xK9mR2vL3pT4nW8q', 'Superseded by updated market data') ``` The retracted belief remains visible in `read()` and `snapshot()` with `lifecycle: 'retracted'`. The reason appears in `trace()` as the `reason` field. Returns `BeliefDelta`. --- ## `beliefs.remove(beliefId)` Delete a belief from the graph entirely. A final ledger entry is recorded for traceability. Use this for cleanup of garbage or accidental beliefs. ```ts await beliefs.remove('xK9mR2vL3pT4nW8q') ``` Unlike `retract()`, the belief is gone from state after removal. Use `trace()` to see the removal in the audit trail. Returns `BeliefDelta`. --- ## `beliefs.removeWhere(filter)` Bulk-remove every belief that originated from a specific provenance reference. Used when an upstream block, document, or message is deleted and its derived beliefs should go with it. ```ts const { removed } = await beliefs.removeWhere({ source: `block:${blockId}` }) console.log(`Retracted ${removed} beliefs`) ``` The `source` filter is a `'kind:id'` string. **Currently only `'block:'` is supported** — the call throws `BeliefsError('remove_where/unsupported_source')` for any other kind. Engine support for additional kinds (`agent:`, `thread:`, `source:`) is in flight; until it lands, retract individual beliefs with `retract()` or `remove()`. Returns `{ success: true, removed: number, source: string }` — `removed` is the count of beliefs retracted (zero when nothing matched). --- ## `beliefs.reset()` Remove all beliefs, goals, gaps, and intents in this scope. Every removal is recorded in the ledger. ```ts const { removed } = await beliefs.reset() console.log(`Cleared ${removed} items`) ``` Returns `{ removed: number }` — the count of items removed. Reset clears everything in the current authoritative scope. For `writeScope: 'thread'` that means one thread. For `writeScope: 'agent'` it means one agent's durable memory. For `writeScope: 'space'` it clears the shared namespace-wide state. The audit trail is preserved in the ledger, but the state itself is wiped clean. --- ## `beliefs.read()` Full world state with clarity, moves, and a serialized prompt. ```ts const world = await beliefs.read() ``` Returns: ```json { "beliefs": [ { "id": "xK9mR2vL3pT4nW8q", "text": "AI tools market is valued at $6.8B", "confidence": 0.95, "type": "claim" }, { "id": "pT4nW8qJ5mR7vL2x", "text": "GitHub Copilot market share has declined to 32%", "confidence": 0.90, "type": "claim" } ], "goals": ["Determine total addressable market"], "gaps": ["Missing APAC market data"], "edges": [ { "type": "contradicts", "source": "xK9mR2vL3pT4nW8q", "target": "hV7bQ3kN9yU6wE4r", "confidence": 0.8 } ], "contradictions": ["AI tools market is valued at $4.2B vs AI tools market is valued at $6.8B"], "clarity": 0.72, "moves": [ { "action": "research", "target": "xK9mR2vL3pT4nW8q", "reason": "APAC data would complete the picture", "value": 0.6 } ], "prompt": "{\"state\":{\"goals\":[...],\"claims\":[...],\"phase\":\"researching\"},\"gaps\":[...],\"contradictions\":[...]}" } ``` --- ## `beliefs.snapshot()` Same as `read()` but faster — skips computing clarity, moves, and prompt. Use when you only need the raw state. ```ts const snap = await beliefs.snapshot() console.log(`${snap.beliefs.length} beliefs, ${snap.gaps.length} gaps`) ``` Returns beliefs, goals, gaps, edges, and contradictions. No clarity, moves, or prompt. --- ## `beliefs.stateAt(options?)` Replay belief state at a specific point in time.
Use one of `step`, `traceId`, or `asOf` to select the replay window; pass `beliefId` and/or `agentId` to narrow the deltas considered. Returned state is the same shape as `snapshot()`. ```ts // State as of 24 hours ago: const yesterday = new Date(Date.now() - 24 * 3600_000).toISOString() const { state, appliedDeltas } = await beliefs.stateAt({ asOf: yesterday }) console.log(`Replayed ${appliedDeltas} deltas to reconstruct state`) // State after a specific trace: const replay = await beliefs.stateAt({ traceId: 'trace-abc-123' }) // State for one belief's history: const beliefHistory = await beliefs.stateAt({ beliefId: 'b-market-size' }) ``` Options: | Option | Type | What it does | |--------|------|--------------| | `beliefId` | `string` | Replay only this belief's deltas. | | `agentId` | `string` | Restrict to one agent's contributions. | | `step` | `number` | Replay every delta up to (and including) this seq number. | | `traceId` | `string` | Replay deltas in this trace plus everything before. | | `asOf` | `string` | ISO timestamp; replay deltas with `appliedAt ≤ this time`. | Returns: ```ts { state: BeliefSnapshot // same shape as snapshot() appliedDeltas: number // count of archived deltas replayed } ``` A workspace with no archive activity returns an empty state with `appliedDeltas: 0` rather than erroring — the cold-start case is honest, not a 404. --- ## `beliefs.graph(options?)` Render-focused projection of the belief graph — nodes, edges, contradictions, and aggregate stats. Use this when you need to draw the graph in a UI or analyze its shape. ```ts const projection = await beliefs.graph({ filter: { kinds: ['claim', 'goal'], minConfidence: 0.4, limit: 200 }, }) for (const node of projection.nodes) renderNode(node) for (const edge of projection.edges) renderEdge(edge) ``` Options: | Option | Type | What it does | |--------|------|--------------| | `filter.limit` | `number` | Cap on returned nodes (engine default applies if omitted). | | `filter.kinds` | `string[]` | Restrict to specific node kinds (`'claim'`, `'goal'`, `'gap'`, etc.). | | `filter.minConfidence` | `number` | Drop edges below this confidence (0–1). | | `filter.maxContradictions` | `number` | Cap on returned contradiction pairs. | | `scope` | `{ spaceId?, studioId?, sessionId? }` | Optional scope override; defaults to the scope bound at construction. | Returns: ```ts { nodes: GraphNode[] edges: GraphEdge[] contradictions: ContradictionPair[] stats: { nodeCount: number edgeCount: number contradictionCount: number edgesByLayer?: { explicit: number, ledger: number, contradiction: number, similarity: number, domain: number } } } ``` The `stats.edgesByLayer` breakdown tells you which kind of edge each one is: `explicit` are user-asserted, `ledger` are causal-history derived, `contradiction` and `similarity` are detected, `domain` are extension-supplied. --- ## `beliefs.list(options?)` Paged search over beliefs by query and filters. Supersedes the older `search(query)` method, which is now deprecated and will be removed in a future minor. ```ts const page = await beliefs.list({ query: 'market size', filter: { type: ['claim', 'goal'] }, limit: 25, }) for (const b of page.beliefs) render(b) if (page.nextCursor) // ...fetch the next page ``` Options: | Option | Type | What it does | |--------|------|--------------| | `query` | `string` | Full-text search. Falls back to a plain snapshot scan when omitted. | | `filter.type` | `string \| string[]` | Restrict to specific belief types (`'claim'`, `'goal'`, `'gap'`, etc.). 
| | `filter.source` | `string \| string[]` | Restrict to one or more sources. | | `filter.lifecycle` | `string \| string[]` | Restrict to specific lifecycle states. | | `limit` | `number` | Max items returned. Forwarded to the engine and re-enforced after client-side filtering. | | `cursor` | `string` | Forward-paginate via `nextCursor` from a prior response. | Returns `{ beliefs: Belief[], nextCursor?: string }`. --- ## `beliefs.get(beliefId)` Detail page for one belief — the belief itself plus inline supporting and contradicting relations, cross-belief links, history timeline, recommended next move, and an optional clarity sidebar. One method, one round-trip from the consumer's perspective; internally the SDK fans out to three engine endpoints in parallel. ```ts const detail = await beliefs.get('belief-abc123') renderHeader(detail.belief.text, detail.belief.clarity) for (const n of detail.relations.supporting) renderSupport(n) for (const n of detail.relations.contradicting) renderContradiction(n) if (detail.thinkingMove) renderRecommendedMove(detail.thinkingMove) ``` Returns: ```ts { success: boolean belief: { id, title?, text, category, kind, clarity, source, provenanceType?, updatedAt?, sourceId?, childEvidence, } relations: { supporting: GraphNeighbor[] contradicting: GraphNeighbor[] } links: BeliefDetailLink[] // cross-belief links shown in "Linked beliefs" history: BeliefDetailHistoryItem[] // most-recent-first thinkingMove: ThinkingMove | null // engine's recommended next move clarity?: BeliefDetailClarity // optional sidebar with insights + readiness durationMs: number } ``` Designed so a detail page can render entirely from one response — no follow-up calls needed for evidence, contradictions, or history. --- ## `beliefs.trace(beliefId?)` Audit trail. See every transition — what changed, when, why, and who changed it. ```ts const history = await beliefs.trace() ``` Returns: ```json [ { "action": "updated", "beliefId": "xK9mR2vL3pT4nW8q", "confidence": { "before": 0.80, "after": 0.95 }, "agent": "research-agent", "source": "IDC Q4 2025 Tracker", "timestamp": "2026-04-08T14:23:01Z", "reason": "IDC Q4 2025 tracker provided authoritative $6.8B figure" }, { "action": "created", "beliefId": "hV7bQ3kN9yU6wE4r", "agent": "research-agent", "source": "agent-output", "timestamp": "2026-04-08T14:22:45Z", "reason": "Extracted from European market analysis" } ] ``` Trace a single belief's history: ```ts const oneBeliefHistory = await beliefs.trace('xK9mR2vL3pT4nW8q') ``` --- ## Errors ```ts import Beliefs, { BetaAccessError, BeliefsError } from 'beliefs' ``` **`BetaAccessError`** — API key missing, invalid, or account lacks access (401/403). ```ts try { await beliefs.before(input) } catch (err) { if (err instanceof BetaAccessError) { console.log(err.signupUrl) // 'https://thinkn.ai/waitlist' } } ``` **`BeliefsError`** — server errors with structured codes and retry guidance. The SDK auto-retries transient errors (429, 5xx) with exponential backoff, so you only see these after retries are exhausted. 
```ts try { await beliefs.after(result.text) } catch (err) { if (err instanceof BeliefsError) { console.log(err.code) // 'rate_limit/exceeded' console.log(err.retryable) // true } } ``` | HTTP | Error | Retryable | Example codes | |------|-------|-----------|---------------| | 400 | `BeliefsError` | No | `validation/invalid_json` | | 401/403 | `BetaAccessError` | No | `auth/missing_key` | | 429 | `BeliefsError` | Yes | `rate_limit/exceeded` | | 5xx | `BeliefsError` | Yes | `internal/error` | --- ## Types ### Belief The core unit. Every claim, assumption, and risk is a belief with a confidence score. ```ts { id: string text: string confidence: number // 0–1 type: string // 'claim', 'assumption', 'evidence', 'risk', 'gap', 'goal' createdAt: string } ```
**Additional fields** — these are present when the server provides richer data. You don't need them to get started. | Field | Type | What it tells you | |-------|------|-------------------| | `label` | `string` | Semantic label: `'limiting-belief'`, `'load-bearing'`, etc. | | `evidenceWeight` | `number` | How much evidence backs this belief. `0` = uninvestigated prior. | | `distribution` | `string` | `'claim'` (true/false), `'category'` (multinomial), `'measurement'` (numeric) | | `lifecycle` | `string` | `'active'`, `'retracted'`, `'invalidated'`, `'expired'`, `'resolved'` | | `provenance` | `string` | `'user-created'`, `'research-discovered'`, `'chat-extracted'`, `'agent-generated'` | | `source` | `string` | Where this belief came from — document name, URL, tool, agent output label. | | `updatedAt` | `string` | Last modification timestamp | `confidence` alone doesn't tell you how well-founded a belief is. Confidence `0.5` with `evidenceWeight: 0` means no one has looked. Confidence `0.5` with `evidenceWeight: 40` means extensive evidence but genuine uncertainty. Use `evidenceWeight` to distinguish "unknown" from "uncertain."
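A minimal sketch of that distinction in practice — this assumes the server returned `evidenceWeight` on these beliefs, and the thresholds are illustrative, not SDK guidance:

```ts
// Separate "no one has looked" from "we looked and it's genuinely uncertain".
// Thresholds (confidence 0.6, evidenceWeight 10) are example values only.
const world = await beliefs.read()

const uninvestigated = world.beliefs.filter(
  b => b.confidence < 0.6 && (b.evidenceWeight ?? 0) === 0,
)
const contested = world.beliefs.filter(
  b => b.confidence < 0.6 && (b.evidenceWeight ?? 0) >= 10,
)

// Uninvestigated priors are research targets; contested beliefs need a judgment call, not more data.
for (const b of uninvestigated) console.log(`needs evidence: ${b.text}`)
for (const b of contested) console.log(`genuinely uncertain: ${b.text}`)
```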
### Move A suggested next action, ranked by expected information gain. ```ts { action: string // 'research', 'gather_evidence', 'clarify', 'validate_assumption', etc. target: string // which belief this move targets reason: string // why this is the best next step value: number // expected information gain (0–1) executor?: string // 'agent', 'user', or 'both' } ``` ### Edge A relationship between two beliefs. ```ts { type: string // 'supports', 'contradicts', 'supersedes', 'derived_from', 'depends_on' source: string target: string confidence: number } ``` ### DeltaChange What happened to a single belief during a mutation. ```ts { action: string // 'created', 'updated', 'removed', 'resolved' beliefId: string text: string confidence?: { before?: number, after?: number } reason?: string source?: string // where this change originated from } ```
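As a quick illustration, here's a sketch of consuming `delta.changes` using the fields above — `agentOutput` is a stand-in for your model's output:

```ts
const delta = await beliefs.after(agentOutput)

for (const change of delta.changes) {
  switch (change.action) {
    case 'created':
      console.log(`+ ${change.text}`)
      break
    case 'updated':
      console.log(
        `~ ${change.text}: ${change.confidence?.before} → ${change.confidence?.after}` +
          (change.reason ? ` (${change.reason})` : ''),
      )
      break
    case 'removed':
    case 'resolved':
      console.log(`- ${change.text}`)
      break
  }
}
```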
### Clarity channels When available, clarity breaks down into four dimensions you can access via `channels` on `BeliefContext`, `BeliefDelta`, or `WorldState`: ```ts const context = await beliefs.before(input) if (context.channels) { console.log(context.channels.knowledgeCertainty) // how confident in current knowledge console.log(context.channels.coverage) // how much of the goal space is covered console.log(context.channels.coherence) // consistency across beliefs console.log(context.channels.decisionResolution) // how well decisions are resolved } ```
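One illustrative way to route on individual channels rather than the blended score — the thresholds and the choice of follow-ups below are examples, not SDK guidance:

```ts
const context = await beliefs.before(input) // `input` is a stand-in for the user's request

if (context.channels) {
  const { coverage, coherence } = context.channels

  if (coverage < 0.4) {
    // Much of the goal space is unexplored — open gaps are the bottleneck.
    const gaps = await beliefs.gaps({ priority: 'high' })
    console.log('Investigate first:', gaps.map(g => g.summary))
  } else if (coherence < 0.4) {
    // Beliefs conflict with each other — look at contradictions before acting.
    const conflicts = await beliefs.contradictions({ severity: 'high' })
    console.log('Resolve first:', conflicts.map(c => c.summary))
  }
}
```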
## Patterns Source: https://thinkn.ai/dev/sdk/patterns Summary: How to structure your agent loop with beliefs — single-turn, multi-turn, streaming, tool-aware, multi-agent, and the smaller patterns that compose with them. Every agent using beliefs follows the same `before → act → after` cycle. The difference is how you arrange that cycle for your use case. These patterns assume you already chose an appropriate scope. For copy-paste examples, `writeScope: 'space'` is the simplest starting point. For chat apps, bind `writeScope: 'thread'` with `thread` or `beliefs.withThread(threadId)`. See [Scoping](/dev/sdk/scoping). Snippets below use `callLLM(systemPrompt, userMessage)` as a stand-in for your model. Replace it with whichever framework you ship on (Vercel AI, Anthropic SDK, OpenAI, plain fetch — see the [Hack Guide](/dev/tutorial/hack-guide) for working recipes). The point of these examples is the belief flow. ## Choosing a loop pattern ``` ┌─ Is this a single request/response? ──→ Single-turn │ ├─ Does the agent use tools? ──→ Tool-aware │ ├─ Does the agent stream output? ──→ Streaming │ ├─ Should the agent loop until confident? ──→ Multi-turn │ └─ Do multiple agents collaborate? ──→ Multi-agent ``` Most production agents combine patterns — a multi-turn loop with streaming and tool use. Start with the simplest pattern that fits, then layer in complexity. --- ## Single-Turn The simplest integration. One `before`, one agent call, one `after`. ```ts async function answer(question: string) { const context = await beliefs.before(question) const result = await callLLM(context.prompt, question) const delta = await beliefs.after(result) return result } ``` **When to use:** Chatbots, Q&A, any request/response flow where you want to accumulate knowledge across interactions but don't need to loop within a single request. **What you get:** Beliefs accumulate across calls within the same scope (the same `thread` if you're thread-scoped, or the same `namespace` if you're space-scoped). The second time the user asks about a topic, `before()` returns richer context with existing beliefs, gaps, and moves. --- ## Multi-Turn (Clarity-Driven) Loop until the agent has enough confidence to act. Use `clarity` as the stopping condition. ```ts async function research(question: string) { await beliefs.add(question, { type: 'goal' }) for (let turn = 0; turn < 10; turn++) { const context = await beliefs.before(question) // Stop when clarity is high enough if (context.clarity > 0.7) { return { beliefs: context.beliefs, clarity: context.clarity, gaps: context.gaps, } } // Follow the highest-value move const focus = context.moves[0]?.target ?? question const result = await callLLM(context.prompt, focus) const delta = await beliefs.after(result) console.log( `Turn ${turn + 1}: clarity ${delta.clarity.toFixed(2)}, ` + `${delta.changes.length} changes` ) } // Hit turn limit — return what we have return await beliefs.read() } ``` **When to use:** Research agents, fact-checkers, decision support — any task where the agent should investigate until it has enough information. **Key decisions:** - **Clarity threshold** — `0.7` is a good starting point. Lower for exploratory tasks, higher for critical decisions. - **Turn limit** — Always set a hard cap to prevent infinite loops. - **Move routing** — Use `context.moves[0]` to direct the next investigation. The move with the highest `value` has the most expected information gain. --- ## Streaming Accumulate the full response, then call `after()` once when the stream completes. 
```ts import { streamText } from 'ai' import { anthropic } from '@ai-sdk/anthropic' async function researchStream(question: string) { const context = await beliefs.before(question) const result = streamText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, prompt: question, }) let fullText = '' for await (const chunk of result.textStream) { process.stdout.write(chunk) fullText += chunk } // Call after() once with the complete text const delta = await beliefs.after(fullText) return { text: fullText, delta } } ``` In a Next.js route handler, use `onFinish`: ```ts export async function POST(req: Request) { const { messages } = await req.json() const lastMessage = messages[messages.length - 1]?.content ?? '' const context = await beliefs.before(lastMessage) const result = streamText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, messages, onFinish: async ({ text }) => { await beliefs.after(text) }, }) return result.toDataStreamResponse() } ``` Call `after()` exactly once per turn, after the stream completes. Do not call it on partial chunks. Each `after()` triggers full extraction and fusion against incomplete text, which produces duplicate beliefs, spurious contradictions, and ledger churn that's hard to clean up later. If you need live UI feedback during a stream, use `subscribe()` for projection updates instead — let the final `after()` do the actual extraction. --- ## Tool-Aware When your agent uses tools, feed each tool result separately so beliefs update as evidence arrives mid-turn. ```ts const context = await beliefs.before(question) const message = await client.messages.create({ model: 'claude-sonnet-4-20250514', system: context.prompt, messages: [{ role: 'user', content: question }], tools: myTools, }) // Feed each tool result — source is tracked per-belief for traceability for (const block of message.content) { if (block.type === 'tool_use') { const result = await executeTool(block.name, block.input) await beliefs.after(JSON.stringify(result), { tool: block.name, source: `tool:${block.name}`, }) } else if (block.type === 'text') { await beliefs.after(block.text) } } ``` With the Vercel AI SDK and `maxSteps`: ```ts const { text, toolResults } = await generateText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, prompt: question, tools: myTools, maxSteps: 5, }) // Feed tool results individually for (const result of toolResults) { await beliefs.after(JSON.stringify(result.result), { tool: result.toolName }) } // Then feed the final text await beliefs.after(text) ``` **When to use:** Agents that call external APIs, search the web, query databases, or use any tools that return factual data. **Why per-tool?** Each tool result is a distinct observation with its own provenance. Feeding them individually lets the system attribute claims to the right tool, detect when a tool result *contradicts* an existing belief, and notice when a tool result *resolves* a gap. If you concatenate everything into one `after()` call, those per-source contradictions and gap-resolutions get smeared together and the relationships are missed. --- ## Multi-Agent Multiple agents contribute to the same shared belief state. They share a `namespace` and `writeScope: 'space'`, but use different `agent` identifiers so contributions are attributed. 
```ts const researcher = new Beliefs({ apiKey, agent: 'researcher', namespace: 'market-analysis', writeScope: 'space', }) const critic = new Beliefs({ apiKey, agent: 'critic', namespace: 'market-analysis', writeScope: 'space', }) // Researcher gathers evidence const researchContext = await researcher.before('AI tools market size') const findings = await callLLM(researchContext.prompt, 'Research AI tools market') await researcher.after(findings) // Critic challenges the findings const criticContext = await critic.before('Challenge these market findings') const critique = await callLLM(criticContext.prompt, 'Find weaknesses') await critic.after(critique) // Both see the same world state const world = await researcher.read() console.log(`Contradictions: ${world.contradictions.length}`) console.log(`Total beliefs: ${world.beliefs.length}`) ``` **When to use:** Debate systems, red-team/blue-team, supervisor/worker patterns, any architecture with multiple agents reasoning about the same domain. **How it works:** All agents in the same namespace with `writeScope: 'space'` share one authoritative state. When the critic adds beliefs that contradict the researcher's findings, the system detects the contradiction automatically. If you want private agent memory plus shared background, switch to `writeScope: 'agent'`. --- ## Combining Patterns Most production agents combine patterns. Here's a multi-turn streaming agent with tool use: ```ts async function deepResearch(question: string) { await beliefs.add(question, { type: 'goal' }) for (let turn = 0; turn < 5; turn++) { const context = await beliefs.before(question) if (context.clarity > 0.8) break const { text, toolResults } = await generateText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, prompt: context.moves[0]?.target ?? question, tools: myTools, maxSteps: 3, }) for (const result of toolResults) { await beliefs.after(JSON.stringify(result.result), { tool: result.toolName }) } await beliefs.after(text) } return await beliefs.read() } ``` --- ## Smaller patterns Once the loop is in place, these are the moves and accessors you'll reach for most. ### Clarity-driven routing Branch on `context.clarity` to decide what to do next: ```ts const context = await beliefs.before(input) if (context.clarity < 0.3) { await runResearch(context.gaps) } else if (context.clarity > 0.7) { await draftRecommendations(context.beliefs) } else { await investigateGaps(context.gaps) } ``` For coarser routing, `delta.readiness` returns `'low' | 'medium' | 'high'` — a categorical projection of the underlying 0–1 clarity score, useful when you want simple branching without picking your own thresholds. 
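A sketch of that coarser branch, reusing the stand-in helpers from the clarity example above:

```ts
const delta = await beliefs.after(agentOutput)

switch (delta.readiness) {
  case 'high':
    await draftRecommendations(delta.state.beliefs) // enough clarity — act
    break
  case 'medium':
    await investigateGaps(delta.state.gaps)         // narrow the open questions
    break
  case 'low':
    await runResearch(delta.state.gaps)             // keep gathering evidence
    break
}
```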
### Confidence gating Only act on beliefs above a confidence threshold: ```ts const world = await beliefs.read() const strong = world.beliefs.filter(b => b.confidence > 0.7) const weak = world.beliefs.filter(b => b.confidence <= 0.7) // Use strong beliefs in the response // Flag weak beliefs for further investigation ``` ### Gap-driven research Read open gaps and use them to drive the next research action — the agent's next move is shaped by what it doesn't know, not just what the user asked: ```ts const context = await beliefs.before(input) for (const gap of context.gaps) { const result = await searchTool.run(gap) await beliefs.after(result, { tool: 'search' }) } ``` ### Custom assertion with evidence When you have domain-specific knowledge, assert it directly with evidence and supersession: ```ts await beliefs.add('Market is $6.8B', { confidence: 0.92, evidence: 'IDC Q4 2025 tracker, 2400 enterprise survey', supersedes: 'Market is $4.2B', }) ``` Explicit assertions take precedence over auto-extracted beliefs when they conflict. ### Inspecting the trace Use the trace to debug belief transitions: ```ts const history = await beliefs.trace() for (const entry of history) { console.log(`${entry.timestamp} | ${entry.action}`) if (entry.confidence) { console.log(` ${entry.confidence.before} → ${entry.confidence.after}`) } if (entry.reason) console.log(` reason: ${entry.reason}`) } ``` For replay-shaped reads — "what did the world look like at time T?" — use [`beliefs.stateAt({ asOf })`](/dev/sdk/core-api) instead. ## Scope reads Source: https://thinkn.ai/dev/sdk/reads Summary: Plain-English summaries of gaps, decisions, goals, risks, insights, evidence, intents, and contradictions. Eight top-level methods give you focused projections of the current belief state. Each returns a flat `Summary[]` — pre-shaped lists ready to render in a UI without further processing or normalization. Raw underlying scores live under each summary's optional `internals` field for power users; the default shape stays free of engine jargon. All eight share the same option base — `agentId`, `limit`, `since` — plus per-method filters where relevant. | Method | Returns | Per-method filter | |--------|---------|-------------------| | `beliefs.gaps(opts?)` | `GapSummary[]` | `priority?: 'low' \| 'medium' \| 'high'` | | `beliefs.decisions(opts?)` | `DecisionSummary[]` | — | | `beliefs.goals(opts?)` | `GoalSummary[]` | — | | `beliefs.risks(opts?)` | `RiskSummary[]` | — | | `beliefs.insights(opts?)` | `InsightSummary[]` | — | | `beliefs.evidence(opts?)` | `EvidenceSummary[]` | `beliefId?: string` | | `beliefs.intents(opts?)` | `IntentSummary[]` | — | | `beliefs.contradictions(opts?)` | `ContradictionSummary[]` | `severity?: 'low' \| 'medium' \| 'high'` | ## Quickstart ```ts const gaps = await beliefs.gaps({ priority: 'high', limit: 5 }) const risks = await beliefs.risks() const recentEvidence = await beliefs.evidence({ since: '2026-04-01T00:00:00Z' }) for (const gap of gaps) { console.log(`[${gap.priority}] ${gap.summary}`) if (gap.suggestion) console.log(` → ${gap.suggestion}`) } ``` ## Common options | Option | Type | What it does | |--------|------|--------------| | `agentId` | `string` | Restrict to a single agent's contributions. Defaults to the SDK's bound `agent`. | | `limit` | `number` | Cap returned items. Server applies its own cap if you don't specify. | | `since` | `string` | ISO timestamp. Filter items at-or-after this time. | | `signal` | `AbortSignal` | Abort the request mid-flight. 
| ## Summary shapes Every summary follows the `Summary` template: `id`, plain-English `summary`, optional `suggestion`, optional `relatedBeliefs`, plus type-specific fields. Internals are opt-in. ### `GapSummary` ```ts { id: string summary: string priority: 'low' | 'medium' | 'high' openSince: string // ISO timestamp suggestion?: string relatedBeliefs?: string[] internals?: { rawConfidence?: number; evidenceWeight?: number } } ``` The priority inversion is intentional: a *low*-confidence gap means the agent has little evidence either way, so the question is wide open and worth investigating — that gets `priority: 'high'`. A *high*-confidence gap means the agent is close to resolving it on its own, so the priority drops. ### `DecisionSummary` ```ts { id: string summary: string status: 'tentative' | 'committed' | 'reversed' decidedAt: string commitment: 'loose' | 'firm' | 'revoked' suggestion?: string relatedGoals?: string[] relatedBeliefs?: string[] } ``` Distinct from `moves.list()` (recommendations) and `trace()` (audit log). Decisions are committed intent: "I will do X." ### `EvidenceSummary` ```ts { id: string summary: string observedAt: string source: string // agent or source identifier direction: 'supports' | 'contradicts' | 'neutral' strength: 'weak' | 'medium' | 'strong' relatedBeliefs?: string[] } ``` `strength` is derived from how much this observation shifted the belief — the engine's reading of "how meaningful was this observation." ### `IntentSummary` ```ts { id: string summary: string kind: 'goal' | 'decision' | 'constraint' status: 'active' | 'completed' | 'abandoned' | 'tentative' | 'committed' | 'reversed' | 'relaxed' | 'removed' activeSince: string progress?: number // 0–1, mostly for goals confidence?: 'low' | 'medium' | 'high' // mostly for decisions relatedBeliefs?: string[] relatedGoals?: string[] } ``` The unified shape across all three intent kinds. Filter by `kind` client-side, or use `goals()`/`decisions()` for kind-specific shapes. ### `GoalSummary` ```ts { id: string summary: string status: 'active' | 'completed' | 'abandoned' activeSince: string progress?: number // 1 if completed, 0 if abandoned relatedBeliefs?: string[] } ``` ### `RiskSummary` ```ts { id: string summary: string severity: 'low' | 'medium' | 'high' // impact if it occurs likelihood: 'low' | 'medium' | 'high' | 'certain' identifiedAt: string suggestion?: string relatedBeliefs?: string[] } ``` For an expected-impact ranking, map both labels to numbers (e.g. `low: 1, medium: 2, high: 3, certain: 4`) and sort by their product. Both fields are categorical labels in the summary shape; the `internals` field carries the underlying numeric scores if you need them. ### `InsightSummary` ```ts { id: string summary: string kind: 'contradiction' | 'missing_evidence' | 'ambiguity' | 'leap' status: 'active' | 'acknowledged' | 'dismissed' relatedBeliefs: string[] severity: 'low' | 'medium' | 'high' createdAt: string suggestion?: string } ``` Output from the clarity detector — these are *meta*-observations about the belief state itself, not beliefs. ### `ContradictionSummary` ```ts { id: string summary: string severity: 'low' | 'medium' | 'high' beliefs: [string, string] // pair of belief IDs in conflict suggestion?: string detectedAt?: string } ``` The pair is unordered semantically — `beliefs[0]` and `beliefs[1]` are both implicated. All eight methods require `apiKey` or `scopeToken` auth. `serviceToken` callers are rejected. See [Auth](/dev/sdk/auth). 
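For example, the expected-impact ranking suggested under `RiskSummary` above — a minimal sketch:

```ts
// Map the categorical labels to numbers and sort by their product, as suggested above.
const score: Record<string, number> = { low: 1, medium: 2, high: 3, certain: 4 }

const risks = await beliefs.risks()
const ranked = [...risks].sort(
  (a, b) => score[b.severity] * score[b.likelihood] - score[a.severity] * score[a.likelihood],
)

for (const r of ranked) {
  console.log(`[${r.severity} impact × ${r.likelihood} likelihood] ${r.summary}`)
}
```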
## Moves: SDK Source: https://thinkn.ai/dev/sdk/moves Summary: List, generate, and act on recommended next moves. A move is an engine-recommended next action — the answer to "given what the agent currently believes, what should it investigate next?" — ranked by expected information gain. `beliefs.moves.*` wraps the engine's recommender. See [Moves (concept)](/dev/core/moves) for the model behind the surface; this page covers the SDK methods. `forecast(action)` and `cascade(action)` look similar but answer different questions. `forecast` projects the action's value on **the current agent's** belief state — "how much will this clarify *my* picture?" `cascade` projects the same action across **other agents' beliefs** via the influence matrix — "if I do this, how much churn does it create for the rest of the swarm?" Use forecast for self-directed planning; use cascade when you're coordinating multi-agent work. ## `beliefs.moves.list(options?)` Get the currently-ranked moves for the bound scope. Moves come back highest-priority first. ```ts const moves = await beliefs.moves.list({ topN: 3 }) for (const m of moves) { console.log(m.action, m.rationale, m.expectedDeltaH) } ``` Options: | Option | Type | What it does | |--------|------|--------------| | `topN` | `number` | Cap the returned slice. Server-side ranking is unchanged; this is a client-side trim. | Returns `ThinkingMove[]`. ## `beliefs.moves.generate(options)` Ask the recommender for a fresh move targeting a specific belief. Use this when you want a move *now* (e.g., the user just opened a belief detail page) rather than waiting for one to appear in `list()`. ```ts const result = await beliefs.moves.generate({ beliefId: 'belief-abc123', includeJustification: true, }) if (result.move) { showRecommendation(result.move, result.move.justification) } else if (result.reason === 'belief_complete') { showDoneState() } ``` Options: | Option | Type | What it does | |--------|------|--------------| | `beliefId` | `string` | **Required.** Belief to target. | | `targetId` | `string` | Alias for `beliefId`. `beliefId` wins if both are set. | | `includeJustification` | `boolean` | Attach the engine's full justification payload to the move. | | `sessionId` | `string` | Bind to a specific session for analytics. | Returns: ```ts { success: true move: ThinkingMoveWithJustification | null target: ResolvedCanonicalTarget reason?: 'belief_complete' // present when no move was generated durationMs: number } ``` `move: null` with `reason: 'belief_complete'` is the engine signaling "this belief is in good shape — nothing to recommend right now." ## `beliefs.moves.act(moveId, action, options?)` Record a user action on a move. The engine learns from accept/snooze/dismiss signals to improve future ranking. ```ts await beliefs.moves.act(move.id, 'accept') await beliefs.moves.act(move.id, 'snooze') await beliefs.moves.act(move.id, 'dismiss') ``` `action` is one of `'accept'`, `'snooze'`, `'dismiss'`. Any other value throws `TypeError`. Returns: ```ts { success: true move: ThinkingMove // updated with new status / resolvedAt durationMs: number } ``` ## `beliefs.moves.rank(options?)` Engine-ranked next-best moves over the current scope. Each entry surfaces the composite ranking score, expected info-gain, cost, and a cost-normalized ratio so callers can budget-cap on any axis. 
```ts const ranked = await beliefs.moves.rank({ topN: 3, budget: 0.05 }) for (const m of ranked) { console.log(`${m.action}/${m.subType} → q=${m.qValue} cost=${m.cost} voi=${m.valueOfInformation}`) } ``` Options: `topN?` (default 5, max 50), `budget?` (filters out moves whose `cost` exceeds budget before ranking), `agentId?`, `signal?`. Returns `MoveRankingSummary[]`: ```ts { id, summary, targetId targetKind: 'claim' | 'goal' | 'gap' action: string // 'gather_evidence', 'clarify', ... subType: string // 'design_test', 'tradeoff_mapping', ... qValue: number // composite ranking score (higher = better) expectedInfoGain: number // expected info-gain from executing this move cost: number // USD / tokens / effort units valueOfInformation: number // info-gain / max(cost, 0.01) executor: 'agent' | 'user' | 'both' confidence: 'low' | 'medium' | 'high' } ``` ## `beliefs.moves.forecast(action, options?)` Project the expected value of a candidate action on the current belief state. The engine runs its predictive model forward from the current state and returns one summary. ```ts const summary = await beliefs.moves.forecast('gather_evidence', { depth: 3, rollouts: 50 }) console.log(`score=${summary.score} confidence=${summary.confidence}`) console.log(`will sharpen: ${summary.willAnswer.join(', ')}`) ``` Options: `depth?` (max 5), `rollouts?` (max 200), `maxTopics?`, `agentId?`, `signal?`. Returns `ForecastSummary` — same shape as `beliefs.forecast.predict` documented below. ## `beliefs.moves.cascade(action, options?)` Predict how a candidate action will ripple through *other* agents' beliefs via the fit influence matrix. Use this for multi-agent coordination — knowing whether your move will cause downstream churn before you make it. ```ts const cascade = await beliefs.moves.cascade('gather_evidence', { targetBeliefId: 'b-market-size', magnitude: 0.3, }) for (const shift of cascade.willShift) { if (shift.severity !== 'none') { console.warn(`Agent ${shift.agent}: ${shift.summary}`) } } ``` Options: `targetBeliefId?` (defaults to most-uncertain active belief), `magnitude?` (0–1, default 0.2), `maxAgents?`, `agentId?`, `signal?`. Returns `CascadeSummary`: ```ts { id, summary /** Aggregate cascade risk: 0 = isolated, 1 = every known agent feels it. */ score: number willShift: Array<{ agent: string summary: string severity: 'none' | 'low' | 'medium' | 'high' affectedBeliefs?: string[] }> confidence: 'low' | 'medium' | 'high' why: string } ``` Cold-start workspaces return `score: 0` with `confidence: 'low'` — the influence matrix has no co-observation evidence yet. --- ## `beliefs.forecast.predict(actions, options?)` Free-form action forecasting. Where `moves.forecast(action)` evaluates one action against the engine's recommended-move vocabulary, `forecast.predict(actions[])` runs the same predictive model against an arbitrary list of caller-supplied actions and returns one summary per input action, in input order. ```ts const forecasts = await beliefs.forecast.predict( ['gather_evidence_apac', 'design_test_market_size', 'reframe_question'], { horizon: 3, rollouts: 50 }, ) const ranked = [...forecasts].sort((a, b) => b.score - a.score) console.log(`Best action: ${ranked[0].summary} (score=${ranked[0].score})`) ``` Options: | Option | Default | What it does | |--------|---------|--------------| | `horizon` | `1` | Rollout depth per action (max 5). | | `rollouts` | `30` | Independent rollouts per action (max 200). | | `maxTopics` | — | Cap on belief topics surfaced in `willAnswer`. 
| | `agentId` | bound agent | Run the forecast as a different agent. | | `signal` | — | `AbortSignal` for cancellation. | Returns `ForecastSummary[]`: ```ts { id: string summary: string /** 0–1 expected value. Higher = more useful. */ score: number /** Plain-English belief topics most likely to sharpen under this action. */ willAnswer: string[] /** Confidence in the forecast itself, not the action. */ confidence: 'low' | 'medium' | 'high' /** Short human explanation. */ why: string suggestion?: string relatedBeliefs?: string[] } ``` `confidence` reflects how much evidence the engine's predictive model has accumulated for similar actions in this workspace — distinct from `score`. A high-`score` action with `confidence: 'low'` means "this looks great, but we haven't seen this action before, so the score is extrapolation rather than a track record." On a fresh workspace with no archived deltas, every forecast comes back with `confidence: 'low'` and a low `score`. That's the honest answer — the model has no evidence yet. Forecasts typically reach `confidence: 'medium'` after roughly 5–10 `after()` calls in the workspace, and `'high'` once dozens of similar actions have been observed. --- ## `ThinkingMove` shape ```ts { id: string targetId: string targetEntityType?: 'claim' | 'goal' | 'gap' | 'risk' | string targetEntityId?: string action: 'clarify' | 'gather_evidence' | 'resolve_uncertainty' | 'compare_paths' | string rationale: string expectedDeltaH: number // expected uncertainty reduction from acting on this move status: 'suggested' | 'accepted' | 'snoozed' | 'dismissed' | string suggestedModality?: string // hint about how to surface (e.g., 'inline', 'banner') qValue?: number // recommender's internal score, when available executor?: 'agent' | 'user' | 'both' createdAt: string updatedAt?: string resolvedAt?: string } ``` `expectedDeltaH` is the recommender's estimate of how much uncertainty this move reduces if accepted — it's the `value` field in the [concept doc](/dev/core/moves). The moves namespace requires `apiKey` or `scopeToken` auth. `serviceToken` callers cannot invoke `moves.*`. See [Auth](/dev/sdk/auth). ## Trust & tool reliability Source: https://thinkn.ai/dev/sdk/trust Summary: Override agent and source trust at runtime, and track which tools produce useful evidence. `beliefs.trust.*` lets the user adjust how much weight an agent or evidence source carries during fusion. Every override is a stated rating `(confidence, strength)` that the engine applies at fusion time — see [behavioral contracts](/dev/internals/contracts) for the predictability guarantee. ## When to use it - A user disables an agent: set `confidence: 0, strength: 100, lock: true`. - A user trusts a domain expert agent above the default: `confidence: 0.95, strength: 50`. - A source category (e.g. social media) should attenuate weight: set on `{ kind: 'source', id: 'social' }`. Without an override, agents start from the engine's calibrated prior. Overrides replace that prior for the targeted entity only — every other agent and source is unaffected. ## `beliefs.trust.set(target, options)` Idempotent upsert. 
```ts await beliefs.trust.set( { kind: 'agent', id: 'risk-bot' }, { confidence: 0.4, strength: 25 }, ) // Hard-disable an unreliable source (locked overrides never drift): await beliefs.trust.set( { kind: 'source', id: 'rumor-mill' }, { confidence: 0.0, strength: 100, lock: true }, ) ``` Parameters: | Field | Type | What it does | |-------|------|--------------| | `target.kind` | `'agent' \| 'source'` | Which entity type. | | `target.id` | `string` | Entity identifier. | | `options.confidence` | `number` | Mean of the user prior, in `[0, 1]`. | | `options.strength` | `number` | How sure you are in this rating. Higher = harder for learned data to drift the override. Use **10** for a weak preference (the engine can still adjust based on evidence), **100** for a confident rating, **500** for an immovable position you don't want learning to override. | | `options.lock` | `boolean` | When `true`, the engine never drifts this rating with newly-learned data. | Returns the persisted `TrustOverride`. ## `beliefs.trust.list(options?)` ```ts const all = await beliefs.trust.list() const agentsOnly = await beliefs.trust.list({ kind: 'agent' }) ``` Returns `TrustOverride[]`. ## `beliefs.trust.get(target)` ```ts const override = await beliefs.trust.get({ kind: 'agent', id: 'risk-bot' }) if (override) console.log(override.confidence, override.strength) ``` Returns `TrustOverride | null` (null when no override exists). ## `beliefs.trust.unset(target)` Remove an override. The entity reverts to the engine's calibrated prior at the next fusion step. ```ts const { removed } = await beliefs.trust.unset({ kind: 'agent', id: 'risk-bot' }) ``` Returns `{ removed: boolean }`. ## `TrustEntity` and `TrustOverride` shapes ```ts interface TrustEntity { kind: 'agent' | 'source' id: string } interface TrustOverride { entity: TrustEntity confidence: number // [0, 1] strength: number // ≥ 0 locked: boolean updatedAt: string } ``` The trust namespace requires `apiKey` or `scopeToken` auth. `serviceToken` callers cannot mutate user-scoped trust. See [Auth](/dev/sdk/auth). `set()` validates inputs synchronously — `confidence` must be in `[0, 1]`, `strength` must be non-negative, and `target.kind` must be `'agent'` or `'source'`. Invalid inputs throw `TypeError` before any network call. --- ## Tool reliability priors `beliefs.tools.*` records and reads running estimates of *tool* reliability — distinct from the agent/source trust above. Where trust overrides are user-stated ratings the engine applies at fusion, tool priors are *learned* estimates: the engine tracks, per `(tool, contextClass)` pair, how often each tool produces useful evidence, so the agent can pick the right tool for the job. `beliefs.tools.observe(envelope)` is **not** the same as the top-level `beliefs.observe(envelope)`. The top-level method runs the full belief-extraction pipeline on free-form content. `tools.observe` records a single success/failure outcome — orders of magnitude lighter, and only for tool-reliability tracking. ### `beliefs.tools.observe(envelope)` Record a single tool outcome. Updates the running estimate in place and returns the new summary. ```ts const prior = await beliefs.tools.observe({ tool: 'web_search', success: true, contextClass: 'market-research', weight: 1.0, }) console.log(`web_search rate now ${prior.rate} (${prior.confidence})`) ``` Envelope: | Field | Type | What it does | |-------|------|--------------| | `tool` | `string` | **Required.** Tool identifier. 
| | `success` | `boolean` | **Required.** Did the tool produce useful evidence? | | `contextClass` | `string` | Optional context label (e.g. `'exploratory-research'`). | | `weight` | `number` | Optional weight (default 1.0). | | `agentId` | `string` | Override the bound agent. | | `signal` | `AbortSignal` | Cancellation. | Returns `ToolPriorSummary`. ### `beliefs.tools.priors(options?)` List current priors in scope. Filter to narrow. ```ts // Every prior in scope: const all = await beliefs.tools.priors() // Just one tool: const search = await beliefs.tools.priors({ tool: 'web_search' }) // Tool + context combo: const filtered = await beliefs.tools.priors({ tool: 'github_search', contextClass: 'code-review', }) ``` Options: `tool?`, `contextClass?`, `limit?`, `agentId?`, `signal?`. Returns `ToolPriorSummary[]`. ### `ToolPriorSummary` ```ts { id: string summary: string tool: string contextClass: string // empty string when uncategorized /** Mean success rate, 0–1. */ rate: number confidence: 'low' | 'medium' | 'high' | 'certain' /** 90% uncertainty interval on the mean. */ credibleInterval: { low: number; high: number } /** Total observations accumulated. */ observations: number suggestion?: string } ``` `rate` is "on average, this tool produces useful evidence `rate × 100`% of the time." `confidence` reflects how many observations back the estimate — `low` below 5 observations, `medium` 5–20, `high` 20+. `credibleInterval` narrows as observations accumulate. A common pattern: before calling a tool, fetch its prior. If `confidence === 'low'` and `rate < 0.3`, consider an alternative or attach a fallback. After the call, record the outcome with `tools.observe()` so the prior keeps learning. ## Streaming Source: https://thinkn.ai/dev/sdk/streaming Summary: SSE-based live updates for belief state and extraction pipelines. The SDK exposes two streaming primitives over Server-Sent Events: a state stream (`subscribe` / `events`) and a per-request extraction stream (`streamExtraction`). Both share one transport — retries, abort propagation, and frame validation live in `HttpTransport`, so consumer code only sees parsed event objects. ## `beliefs.subscribe(handler, options?)` Push state changes into a callback. Returns a `Subscription` with `unsubscribe()` and a `done` promise that resolves when the stream closes. ```ts const sub = beliefs.subscribe( (event) => { if (event.type === 'belief_records_updated') { renderRecords(event.beliefRecords) } else if (event.type === 'belief_records_stale') { requestFullRefresh(event.reason) } }, { onError: (err) => console.error(err), onClose: () => console.log('stream closed'), }, ) // Tear down when the React effect / job ends: sub.unsubscribe() await sub.done ``` Options: | Option | Default | What it does | |--------|---------|--------------| | `onError` | `console.error` | Called when the underlying stream errors. | | `onClose` | — | Called once after a clean close. | | `dropHeartbeats` | `true` | Filter heartbeat frames before invoking `handler`. | | `signal` | — | `AbortSignal`; aborting cancels the SSE connection and resolves `done`. | The `signal` option pairs cleanly with React effect cleanup or React Query's abort handling — pass the same controller and you don't need a `finally` block. ## `beliefs.events(options?)` Same stream, async-iterable face. Use this when you prefer `for await` or RxJS-style pipelines over the callback shape. 
```ts const ac = new AbortController() for await (const event of beliefs.events({ signal: ac.signal })) { if (event.type === 'belief_records_updated') { renderRecords(event.beliefRecords) } } ``` Aborting the signal ends the iteration cleanly. ## `BeliefStreamEvent` Three frame types come off the state stream: ```ts type BeliefStreamEvent = | { type: 'belief_records_updated' sessionId: string spaceId: string viewId: string beliefRecords: BeliefRecord[] groupIndex: number totalGroups: number timestamp: string } | { type: 'belief_records_stale' sessionId: string spaceId: string reason: string timestamp: string } | { type: 'heartbeat'; timestamp: string } ``` Heartbeats are dropped by default — set `dropHeartbeats: false` if you need keep-alive visibility for connection health UIs. `belief_records_stale` is a hint to refresh from `read()` or `snapshot()` — the engine has detected something it cannot incrementally project. ## `beliefs.streamExtraction(request, handler?, options?)` Per-request extraction of beliefs from a content payload. Yields `BeliefExtractionStreamChunk` frames as the engine extracts; ends with a `complete` chunk or an `error` chunk. ```ts const stream = beliefs.streamExtraction({ content: longTranscript, surface: 'voice', }) for await (const chunk of stream) { if (chunk.type === 'belief_event') { appendIncrementalBelief(chunk.event) } else if (chunk.type === 'complete') { finalize(chunk.eventCount) } else if (chunk.type === 'error') { showError(chunk.message) } } // Cancel mid-extraction: stream.cancel() ``` You can pass an optional `handler` callback in addition to iterating — both run on every frame. Useful when one consumer wants the iterable for control flow and another wants a fire-and-forget callback (e.g., an analytics hook). ## `BeliefExtractionStreamChunk` ```ts type BeliefExtractionStreamChunk = | { type: 'belief_event' index: number event: { id: string baseEvent: string semanticLabel: string text: string actor: string sourceMessageIds: string[] } } | { type: 'complete'; lastProcessedMessageId: string; eventCount: number } | { type: 'error'; message: string } ``` Stream lifecycle: zero or more `belief_event` chunks → exactly one `complete` *or* one `error` chunk → stream ends. ## `beliefs.drift.watch(handler, options?)` SSE stream of per-agent reliability drift events. The engine snapshots a baseline at stream start, then emits a `DriftEvent` per (agent, evidence type) on each polling tick, with a `driftDetected` boolean derived from a drift threshold scaled to the baseline's own uncertainty. ```ts const sub = beliefs.drift.watch( (event) => { if (event.type === 'drift' && event.driftDetected) { alert(`${event.agentId} drift on ${event.evidenceType}: shift=${event.meanShift.toFixed(3)} > ci=${event.ciHalfWidth.toFixed(3)}`) } }, { targetAgentId: 'researcher', pollIntervalMs: 30_000 }, ) // Tear down when done: sub.unsubscribe() await sub.done ``` Options: | Option | Default | What it does | |--------|---------|--------------| | `targetAgentId` | — | Stream only this agent's events. | | `pollIntervalMs` | `10000` | Polling interval (min 1000, max 300000). | | `zThreshold` | `1.645` | Drift-threshold z-score (1.645 = 95% one-sided). | | `dropHeartbeats` | `true` | Filter heartbeat frames before invoking `handler`. | | `onError` | `console.error` | Stream-error callback. | | `onClose` | — | Called once on clean close. | | `signal` | — | `AbortSignal`. | ## `beliefs.drift.events(options?)` Same stream, async-iterable face. 
Same options minus `onError`/`onClose`/`handler`. ```ts for await (const event of beliefs.drift.events()) { if (event.type === 'drift' && event.driftDetected) { handleDrift(event) } } ``` ## `DriftStreamEvent` ```ts type DriftStreamEvent = | { type: 'drift' agentId: string evidenceType: string timestamp: string /** Engine-rated reliability at baseline (0–1). */ baselineMean: number /** Engine-rated reliability now (0–1). */ currentMean: number /** |currentMean - baselineMean|. */ meanShift: number /** Engine-computed 95% confidence interval half-width on the baseline. */ ciHalfWidth: number /** Engine-internal divergence metric (opaque scalar, see note below). */ klDivergence: number /** True when meanShift > ciHalfWidth — drift past baseline noise. */ driftDetected: boolean observationCount: number } | { type: 'heartbeat'; timestamp: string } ``` `driftDetected` scales the threshold to the baseline's own uncertainty, so you don't have to pick scalars yourself. Use `meanShift` for magnitude (most interpretable), `driftDetected` for routing, and `klDivergence` only for cross-agent comparison — it's an engine-internal scalar with no fixed unit. Both streams require `apiKey` or `scopeToken` auth. See [Auth](/dev/sdk/auth) for setup details. ## Scoping & Isolation Source: https://thinkn.ai/dev/sdk/scoping Summary: How namespace, writeScope, thread, and contextLayers shape belief state. The beliefs SDK has four scoping controls that determine how memory is isolated and shared: - `namespace` — the developer-facing workspace boundary - `writeScope` — the authoritative layer you mutate - `thread` — the bound thread ID for thread-scoped memory - `agent` — who contributed the mutation `contextLayers` then controls what `before()` and `read()` merge back into the prompt context. ### Quick decision tree - **Single app or prototype?** → `writeScope: 'space'`, share one namespace. - **Multi-agent collaborating on the same problem?** → `writeScope: 'space'`, distinct `agent` values, shared `namespace`. - **Per-conversation chat memory?** → `writeScope: 'thread'` (the default), bind `thread: conversationId`. - **Background worker with its own scratchpad?** → `writeScope: 'agent'`, distinct `agent` per worker. ## Namespace Namespaces are your top-level isolation boundary. Beliefs in different namespaces never interact. ```ts const projectA = new Beliefs({ apiKey, namespace: 'project-alpha', writeScope: 'space', }) const projectB = new Beliefs({ apiKey, namespace: 'project-beta', writeScope: 'space', }) ``` **Default:** `'default'` **Use for:** per-customer isolation, per-project separation, or per-environment separation. ## Authoritative Write Scopes ### `thread` Per-conversation or per-task memory. This is the SDK default. ```ts const beliefs = new Beliefs({ apiKey, namespace: 'support', thread: 'conv-a', writeScope: 'thread', }) ``` - Requires a bound `thread` - Best for chat apps, workflow runs, and task-specific reasoning - Default read layers: `['self', 'agent', 'space']` — `'self'` is the current thread, `'agent'` is this agent's durable memory, `'space'` is the shared namespace state. Reads merge all three. ### `agent` Durable per-agent memory inside a namespace. ```ts const researcher = new Beliefs({ apiKey, namespace: 'market-map', agent: 'researcher', writeScope: 'agent', }) ``` - Best for long-lived worker identity or agent-specific scratchpads - Keeps one agent's memory separate from another's - Default read layers: `['self', 'space']` ### `space` One shared memory for the whole namespace. 
```ts const beliefs = new Beliefs({ apiKey, namespace: 'team-alpha', writeScope: 'space', }) ``` - Best for the simplest prototype or shared-team state - All callers in the namespace read and write the same authoritative layer - Default read layers: `['self']` ## Thread Binding If you use `writeScope: 'thread'`, bind a thread either in the constructor or later with `withThread()`. ### Bind in the constructor ```ts const beliefs = new Beliefs({ apiKey, namespace: 'support', thread: conversationId, writeScope: 'thread', }) ``` ### Bind later with `withThread()` ```ts const baseBeliefs = new Beliefs({ apiKey, namespace: 'support', writeScope: 'thread', }) const beliefs = baseBeliefs.withThread(conversationId) ``` This is useful when the framework gives you the thread or session ID at request time. ## Agent Identity `agent` answers "who said this?" It affects attribution and how contributions are weighted when sources disagree. It does **not** by itself decide whether memory is shared. ```ts const researcher = new Beliefs({ apiKey, namespace: 'team-alpha', agent: 'researcher', writeScope: 'space', }) const reviewer = new Beliefs({ apiKey, namespace: 'team-alpha', agent: 'reviewer', writeScope: 'space', }) ``` These two agents share the same authoritative state because they share the same `namespace` and `writeScope: 'space'`. ## Context Layers `contextLayers` controls what `before()` and `read()` merge together. | Write scope | Default read layers | What gets merged | |-------------|---------------------|------------------| | `thread` | `['self', 'agent', 'space']` | Current thread + this agent's memory + namespace-wide state | | `agent` | `['self', 'space']` | This agent's durable memory + namespace-wide state | | `space` | `['self']` | Just the namespace-wide shared state | The available layer values are `'self'`, `'agent'`, `'space'`, `'studio'`, and `'org'` — `'studio'` and `'org'` are reserved for hosted-platform integrations and don't apply to most app-builder consumers. You can override the defaults when you need a narrower or wider context: ```ts const beliefs = new Beliefs({ apiKey, namespace: 'support', thread: conversationId, writeScope: 'thread', contextLayers: ['self', 'space'], }) ``` ## Design Patterns ### Fastest prototype Use one shared namespace-wide state. ```ts const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'prototype', writeScope: 'space', }) ``` ### Chat application Use per-conversation memory. ```ts function createBeliefs(userId: string, conversationId: string) { return new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: userId, thread: conversationId, writeScope: 'thread', }) } ``` ### Durable per-agent memory with shared background ```ts const beliefs = new Beliefs({ apiKey, namespace: 'research-team', agent: 'analyst', writeScope: 'agent', }) ``` This keeps one agent's working memory separate while still reading shared namespace context. ### Shared workspace or debate ```ts const optimist = new Beliefs({ apiKey, namespace: 'market-debate', agent: 'optimist', writeScope: 'space', }) const skeptic = new Beliefs({ apiKey, namespace: 'market-debate', agent: 'skeptic', writeScope: 'space', }) ``` All participants write into the same shared state, so contradictions and supports are visible to everyone. ### Environment isolation ```ts const ENV = process.env.NODE_ENV ?? 
'development' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: `${ENV}-${projectId}`, writeScope: 'space', }) ``` ## Rules of Thumb | Question | Recommendation | |----------|----------------| | Do I want separate memory per conversation? | Use `writeScope: 'thread'` and bind a thread | | Do I want one shared state for a whole project or team? | Use `writeScope: 'space'` | | Do I want each agent to keep its own durable memory? | Use `writeScope: 'agent'` with distinct `agent` values | | Do I need to scope one project away from another? | Use different `namespace` values | | Do I need broader background context than the current write scope? | Override `contextLayers` | ## Auth Source: https://thinkn.ai/dev/sdk/auth Summary: API keys and short-lived scope tokens. The SDK supports two authentication modes for app builders: long-lived `apiKey` (the default) and short-lived `scopeToken` (per-request HS256 JWT, intended for browser and edge runtimes). ## `apiKey` — server-side Use this from any trusted server runtime — Node, Workers, container backends, agent runtimes. The key is a `bel_live_…` token tied to your account. ```ts import Beliefs from 'beliefs' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'project-alpha', writeScope: 'space', }) ``` The key is sent as a `Bearer` header on every request. Get it from **Profile > API Keys** in the Studio dashboard. See [Install](/dev/start/install) for the full setup walkthrough. Treat `apiKey` like any account credential: never embed in client bundles, never commit to source control, rotate immediately if leaked. Use `scopeToken` for browser/edge contexts. ## `scopeToken` — browser, edge, untrusted runtimes When you cannot put an `apiKey` on the device — browsers, edge functions, third-party plugins — mint a short-lived HS256 JWT on your server and hand it to the client. The SDK signs a fresh token from the configured claims on every request, so you never need to refresh tokens manually. ```ts import Beliefs from 'beliefs' const beliefs = new Beliefs({ scopeToken: { secret: process.env.BELIEFS_SCOPE_TOKEN_SECRET!, claims: { scopeType: 'space', scopeId: currentWorkspace.id, actorUserId: currentUser.id, // optional — who is acting sessionId: currentSession.id, // optional — for session pinning }, }, namespace: 'project-alpha', }) await beliefs.before('What did the user just say?') ``` Claims: | Field | Required | What it does | |-------|----------|--------------| | `scopeType` | yes | `'space'`, `'studio'`, `'org'`, or `'user'` — the scope kind. | | `scopeId` | yes | The id of that scope (e.g. workspace id when `scopeType: 'space'`). | | `actorUserId` | no | Who is acting. Pass when you want every change attributed to a specific user. | | `sessionId` | no | Pin to a session for analytics and cross-session isolation. | | `visibleSpaceIds` | no | Array of additional space ids the actor can read across. | | `exp` | no | Per-token expiry override (Unix seconds). The SDK applies a sensible default if omitted. | Two optional fields on the outer `scopeToken` config tune token lifetime: - **`audience`** — restricts the minted token to a specific API endpoint. The engine rejects the token if the audience doesn't match. Use this to narrow tokens by deployment. - **`ttlSeconds`** — overrides the default token lifetime (the engine sets a sensible default if omitted). Shorten this for high-sensitivity sessions; lengthen if you're seeing churn from repeated mints. How it works: 1. 
Your server provisions a `secret` (32+ bytes of random) and stores it alongside any per-user/per-session claims. 2. The SDK accepts the secret and claims at construction time. 3. On each request, the SDK mints a fresh HS256 JWT from those claims and sends it as a `Bearer` token. 4. The engine verifies the signature against the shared secret and applies the claims as the request's scope. The secret only stays safe when the SDK client is constructed on your server. If you instantiate the client in a browser, the secret is embedded in the bundle and effectively public — provision a session-scoped secret you can revoke on logout, or proxy SDK calls through a server endpoint that holds the long-lived secret. **Mode switching is automatic.** When you provide `scopeToken`, the SDK ignores any `apiKey` for that instance. ## When to use which | Runtime | Mode | Why | |---------|------|-----| | Node server, agent worker, container | `apiKey` | Long-lived credential, simplest. | | Next.js Route Handler, Cloudflare Worker, Vercel Function | `apiKey` | Same as above; the runtime is trusted. | | Browser (React, Vue, Svelte) | `scopeToken` | Avoids embedding a long-lived account credential. | | Edge functions invoked by an untrusted client | `scopeToken` | Per-request scope narrowing via claims. | | Third-party plugin / extension | `scopeToken` | Scope and revoke per session. | ## Errors `BetaAccessError` (HTTP 401/403) — key missing, invalid, or revoked. The SDK surfaces a `signupUrl` for self-service requests: ```ts import Beliefs, { BetaAccessError } from 'beliefs' try { await beliefs.before(input) } catch (err) { if (err instanceof BetaAccessError) { console.log(err.signupUrl) } } ``` `BeliefsError` with code `auth/missing_key` — the SDK was constructed without any auth at all. The SDK has a third internal mode (`serviceToken`) used by Studio's BFF. It is not part of the public app-builder surface and should not be used by external integrators. --- # Adapters ## Claude Agent SDK Source: https://thinkn.ai/dev/adapters/claude-agent-sdk Summary: Use beliefs with the Anthropic Claude Agent SDK. Automatic belief extraction from agent turns. ## Hooks Adapter (recommended) ```bash npm i beliefs ``` The adapter integrates with the Claude Agent SDK's hook system. It captures tool results via `PostToolUse` hooks and injects belief context at `SessionStart`: ```ts import { query } from '@anthropic-ai/claude-agent-sdk' import Beliefs from 'beliefs' import { beliefsHooks } from 'beliefs/claude-agent-sdk' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'research-agent', namespace: 'claude-sdk', writeScope: 'thread', }) // Pass hooks to the Claude Agent SDK query options const result = await query({ prompt: 'Research the competitive landscape for AI dev tools', options: { hooks: beliefsHooks(beliefs), }, }) ``` `beliefsHooks` registers: - **`SessionStart`** — calls `beliefs.before()` and injects context via `additionalContext` - **`PostToolUse`** — calls `beliefs.after(toolResult, {tool})` for each tool invocation If the client is thread-scoped, `beliefsHooks()` resolves the thread automatically from Claude's `session_id` by default. 
### Capture Modes ```ts beliefsHooks(beliefs, { capture: 'tools' }) // each tool call result (default) beliefsHooks(beliefs, { capture: 'all' }) // tool results + text responses ``` ### Configuration ```ts beliefsHooks(beliefs, { capture: 'all', includeContext: true, toolFilter: 'search|Read', // regex to filter which tools trigger extraction }) ``` | Option | Default | Description | |--------|---------|-------------| | `capture` | `'tools'` | What to extract beliefs from | | `includeContext` | `true` | Inject belief context at session start | | `toolFilter` | — | Regex matched against tool names. Example: `'search\|Read'` extracts only from search and Read; internal tools like `Bash` would be skipped. | | `resolveThreadId` | `input.session_id` | Override how thread IDs are derived. Pass a `(input) => threadId` function when your thread keying differs from Claude's session id (e.g., per-user threads). | When you use `beliefsHooks(...)`, the adapter owns the lifecycle: `SessionStart` calls `before()`, `PostToolUse` calls `after()`. Calling `beliefs.before()` or `beliefs.after()` yourself in the same query produces duplicate extraction and double-counts evidence. Use one path or the other, not both. Use `writeScope: 'thread'` when each Claude session should keep its own memory. Use `writeScope: 'space'` when all sessions in a namespace should share one state. --- ## Without the adapter (manual `before`/`after`) If you're not using the Claude Agent SDK's hook system — for example, you're using `@anthropic-ai/sdk` directly — wrap your calls manually: ```ts import Anthropic from '@anthropic-ai/sdk' import Beliefs from 'beliefs' const client = new Anthropic() const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'research-agent', namespace: 'claude-sdk', writeScope: 'space', }) async function research(question: string) { const context = await beliefs.before(question) const message = await client.messages.create({ model: 'claude-sonnet-4-20250514', max_tokens: 4096, system: context.prompt, messages: [{ role: 'user', content: question }], }) const text = message.content .filter(b => b.type === 'text') .map(b => b.text) .join('') const delta = await beliefs.after(text) return { text, delta } } ``` ### With tool results Feed each tool result separately so beliefs update mid-turn: ```ts const context = await beliefs.before(question) const message = await client.messages.create({ model: 'claude-sonnet-4-20250514', max_tokens: 4096, system: context.prompt, messages: [{ role: 'user', content: question }], tools: myTools, }) for (const block of message.content) { if (block.type === 'tool_use') { const result = await executeTool(block.name, block.input) await beliefs.after(JSON.stringify(result), { tool: block.name }) } else if (block.type === 'text') { await beliefs.after(block.text) } } ``` ### Multi-turn loop Run multiple turns and let clarity drive when to stop: ```ts async function deepResearch(question: string) { for (let turn = 0; turn < 10; turn++) { const context = await beliefs.before(question) if (context.clarity > 0.8) { return context.beliefs } const message = await client.messages.create({ model: 'claude-sonnet-4-20250514', max_tokens: 4096, system: context.prompt, messages: [{ role: 'user', content: question }], }) const text = message.content .filter(b => b.type === 'text') .map(b => b.text) .join('') await beliefs.after(text) } } ``` ## Vercel AI SDK Source: https://thinkn.ai/dev/adapters/vercel-ai Summary: Use beliefs with the Vercel AI SDK. 
Middleware-based integration for streamText and generateText. ## Use Today with the Core SDK The core `beliefs` package works with the Vercel AI SDK right now. Wrap your `generateText` or `streamText` calls with `before`/`after`: ```bash npm i beliefs ai @ai-sdk/anthropic ``` ### With generateText ```ts import { generateText } from 'ai' import { anthropic } from '@ai-sdk/anthropic' import Beliefs from 'beliefs' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'research-agent', namespace: 'vercel-ai', writeScope: 'space', }) async function research(question: string) { const context = await beliefs.before(question) const { text } = await generateText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, prompt: question, }) const delta = await beliefs.after(text) console.log(`clarity: ${delta.clarity}, changes: ${delta.changes.length}`) return text } ``` ### With streamText ```ts import { streamText } from 'ai' import { anthropic } from '@ai-sdk/anthropic' import Beliefs from 'beliefs' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'research-agent', namespace: 'vercel-ai', writeScope: 'space', }) async function researchStream(question: string) { const context = await beliefs.before(question) const result = streamText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, prompt: question, }) let fullText = '' for await (const chunk of result.textStream) { process.stdout.write(chunk) fullText += chunk } const delta = await beliefs.after(fullText) return { text: fullText, delta } } ``` Call `after()` exactly once per turn, after the stream completes. Do not call it on partial chunks — each call triggers extraction and fusion. Calling per-chunk creates duplicate beliefs from incomplete text. For Next.js route handlers, use the `onFinish` callback shown below. ### With Tool Results Feed tool results individually so beliefs update as evidence arrives: ```ts import { generateText, tool } from 'ai' import { anthropic } from '@ai-sdk/anthropic' import { z } from 'zod' const context = await beliefs.before(question) const { text, toolResults } = await generateText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, prompt: question, tools: { search: tool({ description: 'Search the web', parameters: z.object({ query: z.string() }), execute: async ({ query }) => searchWeb(query), }), }, maxSteps: 5, }) for (const result of toolResults) { await beliefs.after(JSON.stringify(result.result), { tool: result.toolName }) } await beliefs.after(text) ``` ### In a Next.js Route Handler ```ts import { streamText } from 'ai' import { anthropic } from '@ai-sdk/anthropic' import Beliefs from 'beliefs' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'chat-agent', namespace: 'chat', writeScope: 'space', }) export async function POST(req: Request) { const { messages } = await req.json() const lastMessage = messages[messages.length - 1]?.content ?? '' const context = await beliefs.before(lastMessage) const result = streamText({ model: anthropic('claude-sonnet-4-20250514'), system: context.prompt, messages, onFinish: async ({ text }) => { await beliefs.after(text) }, }) return result.toDataStreamResponse() } ``` --- ## Middleware Adapter ```bash npm i beliefs ``` The adapter integrates through the Vercel AI SDK's middleware system. 
Wrap your model with `beliefsMiddleware` for automatic belief extraction: ```ts import { generateText, wrapLanguageModel } from 'ai' import { anthropic } from '@ai-sdk/anthropic' import Beliefs from 'beliefs' import { beliefsMiddleware } from 'beliefs/vercel-ai' const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'research-agent', namespace: 'vercel-ai', writeScope: 'space', }) const { text } = await generateText({ model: wrapLanguageModel({ model: anthropic('claude-sonnet-4-20250514'), middleware: beliefsMiddleware(beliefs), }), prompt: 'Research the competitive landscape for AI dev tools', }) ``` ### Capture Modes The `capture` option controls what the middleware feeds back to `after()`: - **`'response'`** (default) — extract beliefs only from the model's final text response. Lightest mode; one `after()` call per turn. - **`'tools'`** — extract beliefs from each tool result as it returns. Best when tools fetch external data (web search, DB queries) you want tracked individually. - **`'all'`** — extract from both tool results *and* the final response. Most comprehensive; one `after()` call per tool result plus one for the final text. ```ts beliefsMiddleware(beliefs, { capture: 'response' }) // final response (default) beliefsMiddleware(beliefs, { capture: 'tools' }) // each tool call result beliefsMiddleware(beliefs, { capture: 'all' }) // both ``` ### Configuration ```ts beliefsMiddleware(beliefs, { capture: 'all', includeContext: true, }) ``` | Option | Default | Description | |--------|---------|-------------| | `capture` | `'response'` | What to extract beliefs from: `'response'`, `'tools'`, or `'all'` | | `includeContext` | `true` | Inject belief context into system prompt via `before()` | | `resolveThreadId` | — | Required when `beliefs` uses `writeScope: 'thread'` and no thread is already bound | If you keep the SDK default `writeScope: 'thread'`, either bind the thread ahead of time with `beliefs.withThread(threadId)` or pass `resolveThreadId` so the middleware can scope each invocation correctly. ```ts import { streamText, wrapLanguageModel } from 'ai' import { anthropic } from '@ai-sdk/anthropic' import Beliefs from 'beliefs' import { beliefsMiddleware } from 'beliefs/vercel-ai' export async function POST(req: Request) { const { messages, conversationId } = await req.json() const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, namespace: 'support', writeScope: 'thread', }) const model = wrapLanguageModel({ model: anthropic('claude-sonnet-4-20250514'), middleware: beliefsMiddleware(beliefs, { resolveThreadId: () => conversationId, }), }) const result = streamText({ model, messages }) return result.toDataStreamResponse() } ``` ## React Source: https://thinkn.ai/dev/adapters/react Summary: React hooks for building belief-aware interfaces. @beliefs/react is in development. This page describes the planned API. ## What It Provides React hooks for reading belief state in your components. Subscribe to claims, track clarity, and display confidence, with real-time updates as beliefs change. ## Planned Hooks ### useBeliefs Returns the current belief snapshot. Auto-updates when beliefs change. ```tsx function Dashboard() { const { claims, gaps, clarity } = useBeliefs() return (

    <div>
      <p>Clarity: {(clarity * 100).toFixed(0)}%</p>
      <p>{claims.length} claims, {gaps.length} gaps</p>
    </div>
  )
}
```

### useClaim

Subscribe to a specific claim by ID. Returns confidence, evidence count, and last updated timestamp.

```tsx
function ClaimBadge({ claimId }: { claimId: string }) {
  const { text, confidence, evidenceCount } = useClaim(claimId)
  return (
    <span>
      {text} - {(confidence * 100).toFixed(0)}%
    </span>
  )
}
```

### useClarity

Returns the current clarity score and its components.

```tsx
function ClarityIndicator() {
  const { score, readiness } = useClarity()
  return (
    <div>
      Clarity: {(score * 100).toFixed(0)}% Readiness: {readiness}
    </div>
) } ``` ## Request Early Access If you are building a belief-aware UI and want early access to the React hooks, [request access](/dev/beta). ## DevTools Source: https://thinkn.ai/dev/adapters/devtools Summary: A visual inspector for your agent's belief state. @beliefs/devtools is in development. This page describes the planned tool. ## What It Provides A browser-based inspector for your agent's belief state. See claims, confidence, evidence, contradictions, the ledger, and clarity in real time as your agent runs. ## Planned Features - **Claim timeline.** Watch claims appear and update across turns. - **Confidence history.** See how confidence changes as evidence accumulates. - **Ledger viewer.** Trace any belief back to its origin. - **Clarity breakdown.** Inspect each component of the clarity score. - **Contradiction highlighter.** See conflicts as they are detected. ## Integration One line in your development setup: ```ts import { BeliefDevTools } from '@beliefs/devtools' // Wrap your beliefs instance const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'research-agent', namespace: 'devtools-demo', writeScope: 'space', }) BeliefDevTools.attach(beliefs) ``` The inspector opens in a separate browser panel. It does not affect your agent's runtime. ## Request Early Access If you want to try DevTools before the public release, [request access](/dev/beta). --- # Use Cases ## Finance Source: https://thinkn.ai/dev/cases/finance Summary: Investment theses, risk assessment, and market analysis, where stale beliefs cost real money. ## What Is at Stake In finance, the cost of a stale assumption is capital allocated against a thesis that the evidence no longer supports. It is a risk that was flagged in one analysis but buried in another. It is a market shift that no one saw because the models were still operating on last quarter's assumptions. Financial agents process vast amounts of data: earnings reports, SEC filings, market signals, analyst commentary, macroeconomic indicators. The challenge is maintaining a coherent, current view of what that information means. ## What Beliefs Make Visible ### Thesis evolution under new evidence An investment thesis starts as a belief: "This sector will outperform based on margin expansion." As data arrives, quarterly earnings, competitor moves, regulatory changes, the thesis should evolve. Without beliefs, the thesis lives in a static memo. With beliefs, every piece of evidence either strengthens, weakens, or contradicts it. ``` ┌──────────────────────────────────────────────────────────────┐ │ INVESTMENT THESIS OVER TIME │ │ │ │ Q1: "SaaS margins will expand" 88% conf │ 2 sources│ │ Q2: Competitor cuts prices → 71% conf (-17%) │ │ Q3: Input costs rise unexpectedly → 58% conf (-13%) │ │ Q4: New regulation favors incumbents → 67% conf (+9%) │ │ │ │ Each step records *which* evidence shifted confidence and │ │ by how much. A portfolio manager can trace any position │ │ back to the specific events that moved it — not just the │ │ original pitch. │ └──────────────────────────────────────────────────────────────┘ ``` ### Contradictions across sources An equity research agent reads a bullish analyst report and a bearish SEC filing on the same company in the same week. Without beliefs, both sit in context with equal weight. With beliefs, the contradiction is flagged. The system knows the sources disagree and can surface the specific claims that conflict. 
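A minimal sketch of the pattern, reusing the `add()` options shown in the client example at the end of this use case — the claims, sources, and dates are illustrative:

```ts
// Two sources disagree on the same question in the same week.
await beliefs.add('FY25 revenue guidance is achievable', {
  confidence: 0.8,
  evidence: 'Sell-side analyst report, 2024-03-11',
})

await beliefs.add('FY25 revenue guidance is at risk from customer concentration', {
  confidence: 0.7,
  evidence: 'Risk factors, 10-K filing, 2024-03-14',
})

// Both claims land in the same workspace, so the engine can link them and
// flag the contradiction instead of letting whichever claim entered the
// context first win by default.
```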
### Risk factors that decay A risk assessment from six months ago is not the same as one from yesterday. Temporal decay ensures that old risk evaluations lose certainty over time, creating pressure to reassess rather than silently carrying forward stale ratings. ### Cross-asset pattern detection When beliefs are explicit across asset classes, an agent can detect patterns that span equities, credit, and macro: contradictions between what the bond market believes and what the equity market prices in. These cross-domain tensions are invisible to systems that treat each analysis as an isolated document. ## What Agents Can See That We Cannot A human analyst tracks 15-20 positions deeply. A belief-aware agent can maintain structured beliefs across hundreds of securities simultaneously, tracking confidence, evidence chains, contradictions, and decay across a universe too large for any individual to hold. The agent does not just recall that it read a filing. It knows that filing contradicted the analyst consensus, that the contradiction has not been resolved, and that the claim's evidence has decayed since Q2. ```ts const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'equity-research', namespace: 'tech-sector-review', writeScope: 'space', }) await beliefs.add('ACME gross margins contracted 200bps', { confidence: 0.92, evidence: 'ACME Q3 10-Q filing', }) await beliefs.add('Sector-wide margin expansion thesis weakening', { confidence: 0.65, evidence: 'ACME Q3 10-Q filing', }) ``` ## Health Source: https://thinkn.ai/dev/cases/health Summary: Clinical reasoning, diagnosis, and research, where making assumptions visible saves lives. ## What Is at Stake In healthcare, the cost of a hidden assumption is a missed diagnosis. A drug interaction no one tracked. A treatment plan built on evidence that was superseded by a newer study. A clinical trial where contradictory findings were buried across different reports. Medical reasoning is fundamentally epistemic. It is about what a clinician believes to be true about a patient, how confident they are, and what evidence supports or contradicts that belief. When this reasoning is implicit, critical information falls through the gaps. ## What Beliefs Make Visible ### Differential diagnosis as belief state A diagnostic agent does not produce a single answer. It maintains a set of competing hypotheses, each with confidence, supporting evidence, and contradicting observations. As test results arrive, some hypotheses strengthen and others weaken. ``` ┌──────────────────────────────────────────────────────────────┐ │ DIFFERENTIAL DIAGNOSIS │ │ │ │ ● Hypothesis A: Type 2 diabetes 72% │ 4 supporting │ │ ├─ Elevated HbA1c (measurement) │ 1 contradicting │ │ ├─ Family history (user-assertion) │ │ │ ├─ BMI > 30 (measurement) │ │ │ └─ Normal fasting glucose (contradicts)│ │ │ │ │ ● Hypothesis B: Pre-diabetes 61% │ 3 supporting │ │ ├─ Borderline HbA1c │ │ │ ├─ Normal fasting glucose (supports) │ │ │ └─ Age and risk factors │ │ │ │ │ ● Hypothesis C: Stress response 28% │ 1 supporting │ │ └─ Recent life event │ 2 contradicting │ │ │ │ Gap: "Oral glucose tolerance test not performed" │ │ This gap would resolve the A vs B distinction. │ └──────────────────────────────────────────────────────────────┘ ``` Without explicit belief tracking, the agent collapses early to a single best guess and reasons forward from it. 
With beliefs, every competing hypothesis stays live, every piece of evidence is linked to the hypotheses it supports or refutes, and the next-best test surfaces automatically — because the system knows which gap would most reduce uncertainty between the leading candidates. (The percentages above are independent confidence scores per hypothesis, not a partition over a single distribution — multiple hypotheses can be partially supported at once.) ### Drug interaction monitoring A patient takes five medications. A new study suggests an interaction between two of them. An agent with beliefs can cross-reference the patient's medication list against an evolving evidence base, where each interaction claim carries a confidence score, evidence chain, and timestamp. A three-year-old interaction warning superseded by newer research carries less weight through temporal decay. ### Clinical trial reconciliation Multiple trials studying the same intervention can produce conflicting results. Belief state infrastructure lets an agent maintain structured beliefs across trials, tracking where results corroborate, where they conflict, and what methodological differences explain the divergence. The system can surface: "Trial A (n=500) and Trial B (n=2000) disagree on efficacy. Trial B has higher evidence weight due to sample size and methodology." ## What Agents Can See That We Cannot A physician holds a mental model of perhaps 20-30 patients in depth. A belief-aware clinical agent can maintain structured beliefs across an entire patient population, tracking how evidence evolves for each patient, detecting patterns across cohorts that no individual clinician could observe. When beliefs are explicit, the agent can detect that a subtle lab value pattern across 200 patients correlates with an outcome that would be invisible in any single chart review. The agent maintains structured beliefs across more data than any human can hold, with uncertainty tracked at every step. ### The is/ought firewall in clinical context The is/ought boundary is especially critical in healthcare. A patient saying "I want to avoid surgery" is a preference. It should not increase the agent's confidence that surgery is unnecessary. A diagnostic finding is evidence. It should update the clinical picture. The SDK enforces this distinction automatically. ```ts const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'clinical-agent', namespace: 'patient-123', writeScope: 'space', }) await beliefs.add('HbA1c is 6.8%', { confidence: 0.95, evidence: 'Lab panel 2024-03-15', }) await beliefs.add('Fasting glucose is normal', { confidence: 0.92, evidence: 'Lab panel 2024-03-15', }) ``` ## Engineering Source: https://thinkn.ai/dev/cases/engineering Summary: Security analysis and system design, where hidden assumptions cause vulnerabilities. ## What Is at Stake In engineering, every codebase carries an implicit assumption: there are no security vulnerabilities. That assumption is rarely stated, almost never tracked, and rarely verified against current reality. A dependency picks up a new CVE. An API endpoint loses its authorization check during a refactor. An input is sanitized in one path but not another. These are beliefs about the system's security posture, and when they go unexamined, breaches follow. Security decisions rest on assumptions: about input validation, about access control, about dependency integrity, about how components interact at trust boundaries. When these assumptions are implicit, they compound. 
When they are explicit, agents can gather evidence to prove or refute them. ## What Beliefs Make Visible ### Security assumptions as trackable beliefs Every system carries implicit security beliefs: "All user inputs are sanitized." "No dependency has a critical CVE." "The auth middleware covers every state-changing endpoint." These are assumptions with varying levels of evidence. ``` ┌──────────────────────────────────────────────────────────────┐ │ SECURITY POSTURE │ │ │ │ ● "No critical CVEs in dependencies" 74% │ last audit 30d │ │ ● "All API routes require auth" 81% │ middleware scan │ │ ● "SQL injection mitigated" 92% │ parameterized │ │ ● "No secrets in source control" 65% │ 3mo old scan │ │ └─ ⚠ Decayed -- last scanned 90 days ago │ │ │ │ Gap: "No SSRF analysis on new webhook handler" │ │ Gap: "Rate limiting untested on file upload endpoint" │ │ │ │ The security assumptions are explicit, measurable, │ │ and tracked over time. When a 3-month-old scan │ │ decays, the system flags it for re-verification. │ └──────────────────────────────────────────────────────────────┘ ``` ### Vulnerability investigation A security agent starts with the assumption "there are no vulnerabilities in the codebase" and then gathers evidence to challenge it. A dependency scan finds a high-severity CVE in a transitive dependency. A static analysis tool flags an endpoint missing authorization. Each finding is evidence that refutes the original assumption, and the system tracks exactly how confidence shifted. Post-incident, the ledger shows when "no critical CVEs in dependencies" was last validated, which scans supported it, and which advisory contradicted it before the exploit. ### Cross-boundary contradiction detection A microservice architecture has dozens of implicit trust boundaries. Service A assumes Service B validates its inputs. Service B assumes callers are already authenticated upstream. These assumptions live in different teams' heads. When both turn out to be false at the same boundary, untrusted user input reaches the database without anyone validating it — the gap is invisible in either codebase alone, and it's exactly what attackers find. Belief state infrastructure makes these cross-boundary assumptions explicit and detectable. An agent reviewing code, configs, and security policies can surface: "Service A's assumption that Service B validates input conflicts with Service B's reliance on upstream authentication." ## What Agents Can See That We Cannot A single security engineer holds the context for their area. A belief-aware agent can maintain structured assumptions across an entire codebase, tracking dependency claims, access control beliefs, input validation hypotheses, and secret management posture across every service, with evidence and decay. When a dependency is updated, the agent can trace which downstream security assumptions might be invalidated. When a new CVE advisory contradicts a previous "no known vulnerabilities" claim, the conflict is flagged across every service that depends on that package. These cross-cutting concerns are precisely what humans miss because no individual holds the full picture. ### Temporal decay in security Security assumptions decay faster than most. A dependency audit from last week is relevant. A penetration test from a year ago, before three major refactors, is nearly worthless. Temporal decay models this naturally. 
Dependency scans and secret audits decay faster than architectural invariants like "we use parameterized queries," and the system reflects this. ```ts const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'security-agent', namespace: 'security-review', writeScope: 'space', }) await beliefs.add('No critical CVEs in production dependencies', { confidence: 0.88, evidence: 'npm audit + Snyk scan 2024-03-10', }) await beliefs.add('All API endpoints enforce authentication', { confidence: 0.81, evidence: 'Middleware coverage scan 2024-03-10', }) // Three months later, the scan evidence has decayed // The agent flags it for re-verification before release ``` ## Science Source: https://thinkn.ai/dev/cases/science Summary: Hypothesis tracking, experimental design, and discovery, where beliefs push the frontier of what is known. ## What Is at Stake Scientific progress is the systematic refinement of belief under uncertainty. In belief infrastructure, the structure mirrors the practice: a hypothesis is a structured claim with confidence; an experiment is evidence that updates the claim; a publication is a citation source whose weight can be compared across studies. But the scale of modern science has outpaced the tools we use to track what is believed. Thousands of papers are published daily. Contradictory findings accumulate across journals. Hypotheses that were disproven in one subfield continue to drive experiments in another. The beliefs of the scientific community are distributed across millions of documents, with no structured model of what is currently supported, what is contested, and what remains unknown. This is where belief state infrastructure changes what is possible. ## What Beliefs Make Visible ### Hypothesis tracking across experiments A research program runs dozens of experiments over months or years. Each experiment produces evidence that updates one or more hypotheses. Without beliefs, the evidence lives in lab notebooks, papers, and slide decks. With beliefs, every hypothesis is a structured claim with confidence, evidence chains, and decay. ``` ┌──────────────────────────────────────────────────────────────┐ │ RESEARCH PROGRAM: Gene X and Disease Y │ │ │ │ ● "Gene X expression correlates with Disease Y" │ │ Confidence: 74% │ 8 experiments │ │ ├─ Exp 1: Cell line study (supports, n=50) │ │ ├─ Exp 3: Mouse model (supports, n=200) │ │ ├─ Exp 5: Human cohort (supports, n=1200) │ │ ├─ Exp 7: Different cell line (contradicts, n=80) │ │ └─ Exp 8: Replication attempt (supports, n=150) │ │ │ │ ● "Mechanism is through Pathway Z" │ │ Confidence: 52% │ 3 experiments │ │ ├─ Exp 2: Pathway inhibition study (supports) │ │ ├─ Exp 4: Proteomic analysis (neutral) │ │ └─ Exp 6: Alternative pathway found (contradicts) │ │ │ │ Gap: "No human tissue validation of pathway mechanism" │ │ Gap: "No longitudinal study beyond 6 months" │ │ │ │ The correlation is strengthening. │ │ The mechanism is genuinely uncertain, and the system │ │ knows the difference. │ └──────────────────────────────────────────────────────────────┘ ``` The two-channel model is especially powerful here. The correlation claim has high knowledge certainty (many experiments) and moderate decision resolution (mostly supporting, one contradiction). The mechanism claim has low knowledge certainty and low decision resolution. It needs fundamentally different next experiments, and the system can surface this distinction. 
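A sketch of routing on the two channels. The `channels` object lives on `BeliefContext`, but the property names and the planning helper below are assumptions for illustration, not the contracted shape — see the clarity docs for the real one:

```ts
const context = await beliefs.before('What mechanism links Gene X to Disease Y?')

// Hypothetical property names — check /dev/core/clarity for the exact shape.
const { knowledgeCertainty, decisionResolution } = context.channels

if (knowledgeCertainty < 0.5) {
  // Barely investigated: almost any well-chosen experiment adds information.
  queueExperiment('human tissue validation of Pathway Z') // hypothetical helper
} else if (decisionResolution < 0.5) {
  // Well investigated but contested: pick the experiment that discriminates
  // between the leading hypotheses rather than adding more of the same data.
  queueExperiment('pathway inhibition study in human tissue') // hypothetical helper
}
```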
### Literature contradiction detection An agent reviewing papers across a field can maintain beliefs about key claims and detect when new publications contradict the established view. The agent tracks the *confidence and evidence weight* behind each claim and detects when a high-quality study contradicts a position that was previously well-supported. This is where agents see what we cannot. A human researcher tracks their subfield deeply. A belief-aware agent can maintain structured beliefs across an entire discipline, detecting cross-domain contradictions that span specialties no individual researcher bridges. ### Experimental design from uncertainty When beliefs are explicit, the next experiment is the action that would reduce the most uncertainty in the most important hypothesis. ``` ┌──────────────────────────────────────────────────────────────┐ │ NEXT EXPERIMENT SELECTION │ │ │ │ Hypothesis with highest uncertainty × importance: │ │ "Mechanism is through Pathway Z" (52% conf, high impact) │ │ │ │ Highest-value gap: │ │ "No human tissue validation of pathway mechanism" │ │ │ │ Recommended: Human tissue pathway inhibition study │ │ Expected info gain: high (would resolve the mechanism │ │ question in either direction) │ │ │ │ This is not guessing what to study next. │ │ It is directing attention toward the unknown that │ │ matters most. │ └──────────────────────────────────────────────────────────────┘ ``` ## What Agents Can See That We Cannot The frontier of knowledge is the edge of what we still believe is possible. A belief-aware agent maintaining structured hypotheses across an entire field can detect: - **Convergent evidence from unrelated subfields.** A finding in materials science that corroborates a hypothesis in bioengineering, a connection that no single researcher would draw because they operate in different communities. - **Systematic blind spots.** An entire class of experiments that has never been run because the field's shared assumptions did not suggest it. The gap is invisible until the assumptions are made explicit. - **Load-bearing assumptions.** A foundational belief that dozens of downstream hypotheses depend on, but that has not been directly tested in a decade. Temporal decay surfaces these: the older the evidence, the more the system highlights the need for re-verification. These are not hypothetical. They are the natural consequence of making beliefs explicit at a scale that no human can maintain. ## Swarm Coherence in Deep Science Imagine 50 agents working on drug discovery: some analyzing molecular structures, some reviewing clinical literature, some modeling protein interactions, some scanning patent filings. Each develops partial beliefs about mechanisms, efficacy, and safety. Without shared epistemic state, you get 50 independent perspectives. Some overlap. Some contradict. Some address gaps that others do not know exist. The human researchers cannot hold the full picture because no individual can read 50 agents' outputs and reconcile them. With belief state infrastructure, every agent contributes to the same structured belief space: ``` ┌──────────────────────────────────────────────────────────────┐ │ SWARM COHERENCE │ │ │ │ 50 agents ──▶ shared belief state ──▶ fused world view │ │ │ │ Each agent: │ │ ● Sees what others have found (fused claims) │ │ ● Sees where others disagree (contradictions) │ │ ● Sees what no one has investigated (gaps) │ │ ● Directs its work toward highest info gain │ │ │ │ The swarm converges through evidence, not consensus. 
│ │ Disagreements are tracked, not suppressed. │ │ The agents themselves identify limiting beliefs: │ │ assumptions that, if wrong, invalidate entire branches │ │ of the research. │ └──────────────────────────────────────────────────────────────┘ ``` The agents do not need to agree. They need a shared substrate where their different perspectives can interact, where contradictions become visible, and where the evidence determines which beliefs strengthen and which weaken. A limiting belief, like "Pathway Z is the mechanism," is automatically identified as load-bearing when dozens of downstream hypotheses depend on it. The system surfaces it: this belief has moderate confidence, high impact, and three contradicting observations. It is the single point of failure in the research program. Resolve it before building further. ## The Deeper Point To truly discover the unknown, to push beyond the current frontier, systems must be able to model what they believe, examine those beliefs, and direct their attention toward what would change their understanding most. This is what it means to be maximally truth-seeking: maintaining a structured, evolving model of what is believed to be true, rigorously updating it as evidence changes, and using uncertainty itself as a compass toward what has not yet been seen. Beliefs that can be named. Assumptions that can be examined. Evidence that can be attached. Confidence that can change. Contradictions that can be surfaced. Unknowns that can be made legible. The beliefs we cannot see are often the ones that limit us most. ```ts const beliefs = new Beliefs({ apiKey: process.env.BELIEFS_KEY, agent: 'research-agent', namespace: 'gene-x-study', writeScope: 'space', }) await beliefs.add('Gene X correlates with Disease Y', { confidence: 0.74, evidence: 'Experiment 8: replication study, n=150', }) await beliefs.add('Replication in independent cohort confirms correlation', { confidence: 0.88, evidence: 'Experiment 8: replication study, n=150', }) // The system knows: the correlation is strong, // the mechanism is uncertain, and the next experiment // that matters most is the one that resolves the mechanism. ``` --- # Internals ## How it works Source: https://thinkn.ai/dev/internals/how-it-works Summary: The lifecycle of a belief — from observation to fused state, with audit and decay along the way. A mental model of what happens when you call `before` and `after`. The behaviors the engine is required to honor — what you build against — live on the [contracts](/dev/internals/contracts) page. ## The lifecycle Every piece of information that enters the system follows the same path: ``` observation ──▶ extraction ──▶ fusion ──▶ persistence + audit │ │ │ structured merged into ledger entry claims out world state for replay ``` You don't manage this lifecycle yourself. Calling `before` and `after` drives it. The runtime mutates state atomically and serially, so every mutation is durable and observable the moment it lands. That's what lets an agent course-correct mid-turn — if the first tool result contradicts a hypothesis, the next call already sees the updated state. The runtime processes updates on two timescales. **Real-time** updates merge as evidence arrives, so later actions in the same turn operate on the newest understanding. **Background** processing runs more thorough analysis between turns: relationship detection, contradiction analysis, reassessment of the overall picture. Both feed the same belief state. 
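A minimal sketch of that mid-turn behavior with the core `before`/`after` calls — the filing-fetch helper is hypothetical, and `read()` is the state read described elsewhere in this reference:

```ts
const question = 'Did ACME gross margins contract this quarter?'
const context = await beliefs.before(question)

// The first tool result is fused in real time as soon as after() resolves.
const filing = await fetchFiling('ACME', '10-Q') // hypothetical tool helper
await beliefs.after(JSON.stringify(filing), { tool: 'fetchFiling' })

// A read later in the same turn already reflects that update, so the agent
// can drop a contradicted hypothesis before it runs the next tool.
const refreshed = await beliefs.read()
```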
## Fusion: combining contributions When multiple agents — or multiple turns of the same agent — submit beliefs about the same claim, the engine merges them by trust weight. Higher-trust contributors move the fused state more; lower-trust contributors still contribute but with proportionally less pull. The fused state sharpens when sources agree and stays uncertain when they disagree. Each agent and source carries a reliability weight. The engine starts with a calibrated baseline based on observed reliability and you override it at runtime via [`beliefs.trust.set()`](/dev/sdk/trust). Trust knobs behave predictably — lowering an agent's weight attenuates its contributions proportionally without affecting any other source. Fusion is order-independent: combining the same set of contributions in any order produces the same result. Retries after a peer's write don't change the outcome. ## Decay: aging evidence Without decay, agents act on stale analyses indefinitely — a six-month-old market estimate would carry the same weight as last week's verified data. Decay closes that gap: every belief's evidence weight shrinks over time, so old claims lose influence unless refreshed. Stale claims surface for re-verification rather than silently dominating. Decay rates are configurable per workspace: - **Fast** — market sentiment, competitive intelligence, security posture: anything where last month's analysis is probably wrong now. - **Standard** (default) — market sizing, product positioning, strategic analyses: slow-moving but not static. - **Slow** — regulatory environments, fundamental research, architectural invariants: evidence stays relevant for quarters or years. - **None** — ground-truth observations and immutable historical facts. Use sparingly. Decay applies on read, so the runtime always works with time-adjusted values. Decayed beliefs aren't deleted — they stay in the snapshot at reduced weight, so a UI can render them as muted/needs-re-verification rather than hiding them outright. ## Evidence: types and the is/ought firewall Different evidence types carry different weight at fusion time, calibrated so quality matters more than volume. A single verified measurement moves confidence more than several inferences. | Type | Typical source path | |------|---------------------| | `measurement` | Tool results from APIs, databases, instrumentation | | `citation` | Tool results with cited sources; explicit `add(text, { source })` | | `user-assertion` | `after(userMessage)` from a user-facing surface | | `expert-judgment` | `add(text, { evidence })` with attributed reasoning | | `inference` | `after(agentOutput)` extraction (default for free-form agent text) | | `assumption` | `add(text, { type: 'assumption' })` | The engine assigns the type based on the source path during extraction. You can override with the `evidence` option on `add()` when you know better. **The is/ought firewall** is the most important design choice in evidence handling. Factual evidence updates beliefs; normative information (preferences, goals, desires) does not. 
| Input | Effect | |-------|--------| | "The TAM is $5B" | Updates the market size belief | | "Customer X reported a SOC2 audit failure on 2025-09-12" | Updates compliance/risk beliefs | | "I want to target enterprise" | Recorded as a goal (intent) | | "We've decided to target SOC2-compliant buyers" | Recorded as a constraint (intent) | | "Gartner reports 34% growth" | Updates the growth-rate belief | Without this separation, a user repeating "I want X" would gradually inflate the agent's confidence that X is *true* — preferences masquerading as evidence. The firewall keeps factual claims and normative intent on separate tracks. See [Intent](/dev/core/intent) for how the normative side is handled. ## Ledger: the audit trail Every belief mutation lands in an append-only ledger. There's no in-place editing, no silent overwrite, no merge that erases history. If a belief exists in any state today, the ledger says how it got there. Each entry captures what changed, who changed it, the state before and after, and a human-readable reason. Supersession is recorded as a new entry referencing the old one; deletions land as tombstone entries rather than erasing history. ```ts // Workspace-wide trail const all = await beliefs.trace() // One belief's history const history = await beliefs.trace('claim_market_size') for (const entry of history) { console.log(`${entry.timestamp} | ${entry.action}`) if (entry.confidence) { console.log(` ${entry.confidence.before} → ${entry.confidence.after}`) } if (entry.reason) console.log(` reason: ${entry.reason}`) } ``` For replay-shaped reads — "what did the world look like at time T?" — use [`beliefs.stateAt({ asOf })`](/dev/sdk/core-api#beliefsstateatoptions). It walks the ledger and rebuilds state for you. The ledger is what makes calibration analysis possible (compare stated confidence with eventual outcomes), what makes debugging confidence shifts tractable ("why did this belief drop from 85% to 72%?"), and what makes audit trails possible without reconstruction. ## Behavioral contracts Source: https://thinkn.ai/dev/internals/contracts Summary: What the engine guarantees — the behaviors you can build against. Eight guarantees about how the engine behaves under your code. Each is testable, audited, and load-bearing for the SDK above. A regression on any of these is release-blocking. The implementation behind these guarantees evolves as the engine improves. The guarantees themselves are stable — the surface you build against doesn't shift underneath you. --- ## 1. Retries and no-ops are safe Replaying the same `(idempotencyKey, scope)` on `add`, `after`, or `observe` produces the same state. Empty inputs (`after('')`, an empty `BeliefDelta`) leave state unchanged. **What this means for you:** at-least-once delivery from queues, webhooks, or flaky networks doesn't double-count evidence. Defensive patterns ("call `after()` every turn even if nothing happened") are zero-cost and zero-risk. --- ## 2. Fusion is order-independent Combining the same set of contributions in any order produces the same result. **What this means for you:** retrying a failed `after()` after a peer's write doesn't change the outcome. Multi-agent pipelines have no hidden ordering bug class. --- ## 3. Trust knobs behave predictably Lowering an agent's or source's trust attenuates its contributions proportionally without affecting any other agent or source. Locked overrides stay where you set them; the engine's learning never drifts them. 
**What this means for you:** `beliefs.trust.set({ kind: 'agent', id: 'unreliable-scout' }, { confidence: 0.1, strength: 50 })` reduces that scout's pull at fusion time without surprising side effects elsewhere. --- ## 4. Older evidence carries less weight Evidence is downweighted by a freshness factor as time passes, scaled to the workspace's configured decay rate. Stale claims surface for re-verification rather than silently dominating fresh ones. **What this means for you:** the system creates pressure to refresh — old analyses lose their grip without being deleted, and new evidence wins on equal footing. --- ## 5. Confidence labels are calibrated When the SDK reports `confidence: 'high'`, those events resolve true at roughly the rate the label implies. Calibration is enforced in CI; regressions don't ship. **What this means for you:** the labels are honest. You can route on `'high' / 'medium' / 'low'` without building your own calibration layer on top. --- ## 6. Supersession is a clean cut When belief B explicitly supersedes belief A, A leaves the active candidate set. `read()` and `list()` no longer return A; `trace()` still surfaces it for audit. **What this means for you:** an agent updating its position on a claim doesn't leave the prior position competing for attention. Audit history is preserved separately from current state. --- ## 7. Belief shapes don't contaminate each other Beliefs of different shapes (binary, categorical, numeric) compose safely. Adding a categorical claim doesn't perturb a binary one. **What this means for you:** multi-modal world models are safe — your numeric measurements aren't at risk from a new yes/no claim landing in the same workspace. --- ## 8. Confidence and evidence count are tracked separately A claim at 70% with 100 supporting observations is a different signal from a claim at 70% with 2 observations. The SDK exposes both — see [Clarity](/dev/core/clarity) for the two-channel model. **What this means for you:** you can distinguish "we haven't investigated yet" from "we've investigated extensively and the answer is genuinely close" — they demand opposite next actions. --- ## What's deliberately not promised - **Extraction model choice.** The model behind `after()` and `observe()` may change between releases. Only the *shape* of the resulting `BeliefDelta` is contracted. - **Absolute confidence numbers across version bumps.** Calibration shifts when models swap; the calibration *quality* is bounded, not the exact numbers. - **Cost or token usage.** Telemetry is intentionally not part of the public SDK contract. - **Implementation details.** How fusion combines contributions, how decay scales evidence, how confidence is computed — these evolve as the engine improves. Build against the guarantees, not the implementation. If you find a case where SDK behavior appears to violate one of these contracts, file it as a P0.