beliefs.tools.* records and reads running estimates of tool reliability. The engine learns, per (tool, contextClass) pair, how often the tool produces useful evidence — so the agent can pick the right tool for the job and downweight unreliable ones.
Disambiguation
beliefs.tools.observe(envelope) is not the same as the top-level beliefs.observe(envelope). The top-level method runs the full belief-extraction pipeline on free-form content. tools.observe records a single Bernoulli outcome (success/failure) — orders of magnitude lighter, and only for tool-reliability tracking.
beliefs.tools.observe(envelope)
Record a single tool outcome. Updates the running estimate in place and returns the new summary.
```typescript
const prior = await beliefs.tools.observe({
  tool: 'web_search',
  success: true,
  contextClass: 'market-research',
  weight: 1.0,
})

console.log(`web_search rate now ${prior.rate} (${prior.confidence})`)
```
Envelope:
| Field | Type | What it does |
|---|---|---|
| tool | string | Required. Tool identifier. |
| success | boolean | Required. Did the tool produce useful evidence? |
| contextClass | string | Optional context label (e.g. 'exploratory-research'). |
| weight | number | Optional weight (default 1.0). |
| agentId | string | Override the bound agent. |
| signal | AbortSignal | Cancellation. |
Returns ToolPriorSummary (see below).
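The signal field can be paired with a standard AbortController to bound how long a recording call may take. The sketch below is illustrative only: observeWithTimeout, ObserveEnvelope, and ObserveClient are hypothetical names introduced here, and how the SDK reacts to an aborted signal (for example, which error it rejects with) is not documented on this page.

```typescript
// Minimal shapes for this sketch; only the envelope fields used here are declared.
interface ObserveEnvelope {
  tool: string
  success: boolean
  contextClass?: string
  weight?: number
  signal?: AbortSignal
}

// Anything with an observe() method of this shape, e.g. beliefs.tools (assumed).
interface ObserveClient<R> {
  observe(envelope: ObserveEnvelope): Promise<R>
}

// Hypothetical helper: abort the observe call if it runs longer than timeoutMs.
async function observeWithTimeout<R>(
  tools: ObserveClient<R>,
  envelope: Omit<ObserveEnvelope, 'signal'>,
  timeoutMs: number,
): Promise<R> {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    return await tools.observe({ ...envelope, signal: controller.signal })
  } finally {
    clearTimeout(timer) // do not leave the timer pending once the call settles
  }
}
```

A call might then look like `await observeWithTimeout(beliefs.tools, { tool: 'web_search', success: true }, 2000)`.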
beliefs.tools.priors(options?)
List the current priors in scope. Pass filter options to narrow the result.
```typescript
// Every prior in scope:
const all = await beliefs.tools.priors()

// Just one tool:
const search = await beliefs.tools.priors({ tool: 'web_search' })

// Tool + context combo:
const filtered = await beliefs.tools.priors({
  tool: 'github_search',
  contextClass: 'code-review',
})
```
Options: tool?, contextClass?, limit?, agentId?, signal?.
Returns ToolPriorSummary[].
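One possible use of the returned list is to rank candidate tools for a context. The sketch below sorts by the lower bound of the credible interval, which favors tools that are both successful and well observed. That ranking rule is an assumption made for illustration and is not something the API prescribes; PriorLike and rankTools are hypothetical names.

```typescript
// Subset of the ToolPriorSummary fields that this sketch needs.
interface PriorLike {
  tool: string
  rate: number
  credibleInterval: { low: number; high: number }
}

// Hypothetical helper: most trustworthy tool first, judged by the
// pessimistic end of its credible interval. Returns a new array.
function rankTools(priors: PriorLike[]): PriorLike[] {
  return [...priors].sort(
    (a, b) => b.credibleInterval.low - a.credibleInterval.low,
  )
}
```

With this rule, a tool with a high rate but a wide interval (few observations) can rank below a tool with a slightly lower rate and a narrow interval.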
ToolPriorSummary
```typescript
{
  id: string
  summary: string
  tool: string
  contextClass: string // empty string when uncategorized
  /** Mean success rate, 0–1. */
  rate: number
  confidence: 'low' | 'medium' | 'high' | 'certain'
  /** 90% uncertainty interval on the mean. */
  credibleInterval: { low: number; high: number }
  /** Total observations accumulated. */
  observations: number
  suggestion?: string
  internals?: { alpha?, beta?, successes?, failures? }
}
```
rate is the mean success rate: on average, the tool produces useful evidence rate × 100% of the time. confidence is derived from how many observations back the estimate: low below 5 observations, medium for 5–20, high for 20+. credibleInterval is a 90% uncertainty interval that narrows as observations accumulate.
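The alpha and beta fields under internals suggest a Beta-Bernoulli style estimate. The sketch below shows how such a running estimate could work in principle. It is not the engine's implementation: the uniform Beta(1, 1) starting point, the way weight is applied, counting each call as one observation, and treating exactly 20 observations as 'high' are all assumptions, and the 'certain' level and the credible interval are left out because this page gives no rule for them.

```typescript
// Hypothetical state for one (tool, contextClass) pair.
interface BetaState {
  alpha: number // pseudo-count of successes
  beta: number // pseudo-count of failures
  observations: number
}

// Assumed starting point: a uniform Beta(1, 1) prior with no observations.
function initialState(): BetaState {
  return { alpha: 1, beta: 1, observations: 0 }
}

// Fold one Bernoulli outcome into the state; weight scales its influence.
function update(state: BetaState, success: boolean, weight = 1.0): BetaState {
  return {
    alpha: state.alpha + (success ? weight : 0),
    beta: state.beta + (success ? 0 : weight),
    observations: state.observations + 1,
  }
}

// Posterior mean of the Beta distribution, the analogue of `rate`.
function meanRate(state: BetaState): number {
  return state.alpha / (state.alpha + state.beta)
}

// Buckets follow the thresholds in the text; the boundary at 20 is an assumption.
function confidenceLabel(observations: number): 'low' | 'medium' | 'high' {
  if (observations < 5) return 'low'
  if (observations < 20) return 'medium'
  return 'high'
}
```

Under these assumptions, three successes and one failure move the estimate from 0.5 to 4 / 6, and the label stays 'low' until the fifth observation.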
Using priors to route
A common pattern: before calling a tool, fetch its prior. If confidence === 'low' and rate < 0.3, consider an alternative or attach a fallback. After the call, record the outcome with tools.observe() so the prior keeps learning.
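The routing pattern described above could be sketched roughly as follows. The ToolsClient interface is a trimmed, assumed view of beliefs.tools, and needsFallback, routedCall, and the run callback are hypothetical names introduced for illustration. Treating a null result or a thrown error as a failure is also an assumption; what counts as "useful evidence" is up to the caller.

```typescript
// Trimmed, assumed shapes: only the fields this sketch reads or writes.
interface RoutingPrior {
  tool: string
  rate: number
  confidence: 'low' | 'medium' | 'high' | 'certain'
}

interface ToolsClient {
  priors(options?: { tool?: string; contextClass?: string }): Promise<RoutingPrior[]>
  observe(envelope: {
    tool: string
    success: boolean
    contextClass?: string
  }): Promise<RoutingPrior>
}

// The rule from the text: low confidence combined with a rate under 0.3.
function needsFallback(prior: RoutingPrior | undefined): boolean {
  if (!prior) return false // no history yet: try the tool and start learning
  return prior.confidence === 'low' && prior.rate < 0.3
}

// Hypothetical wrapper: pick a tool, run it, then record the outcome.
async function routedCall<T>(
  tools: ToolsClient,
  primary: string,
  fallback: string,
  contextClass: string,
  run: (tool: string) => Promise<T | null>,
): Promise<T | null> {
  const [prior] = await tools.priors({ tool: primary, contextClass })
  const chosen = needsFallback(prior) ? fallback : primary

  let result: T | null = null
  try {
    result = await run(chosen)
  } catch {
    result = null // a thrown error is recorded as a failure (assumption)
  }

  // Record the outcome against the tool that actually ran.
  await tools.observe({ tool: chosen, success: result !== null, contextClass })
  return result
}
```

Recording against the tool that actually ran, rather than the one originally requested, keeps each prior tied to its own outcomes.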