
Operations

An ops agent that acts on the line's live state and catches the failure before it trips.

On the plant floor, the world model is the agent's live read on the line: which assets are healthy, where a failure is building, and how sure it is, with every call traceable to evidence a human can correct before a breaker trips. On Press #3, vibration has climbed for a week, and a plant floor will not let an agent act on a hunch. A belief state that lives outside the model, bounded by the safety limits you set, gives the ops agent one current picture of the line and keeps it inside those limits. A log tells you what happened; the world model tells you what is true on the line right now and what to do about it.

The world

The environment is the operation: presses and conveyors, motors and bearings, the spares bin, open work orders, and the SOPs that govern when each asset runs. Entities are the assets and parts. Relations are feeds (line 4 feeds the packer), depends-on (the packer depends on upstream throughput), and is-spare-for (bin B17 is a spare for the press-3 bearing). Ground truth comes off the floor, from sensors, meters, and signed-off work orders, not from anyone's recollection of what they did last shift.
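One way to picture the entities and relations above is as a small typed graph. This is a sketch only: the `Entity` and `Relation` shapes, the ids, and the `sparesFor` helper are assumptions made for illustration, not the Belief SDK's schema. Encoding "depends on upstream throughput" as a packer-to-line-4 edge is likewise just one possible reading.

```typescript
// Hypothetical types and ids for illustration; not the Belief SDK's schema.
type RelationKind = 'feeds' | 'depends-on' | 'is-spare-for'

interface Entity {
  id: string
  kind: 'asset' | 'part'
}

interface Relation {
  kind: RelationKind
  from: string // id of the entity the relation starts at
  to: string   // id of the entity it points to
}

const entities: Entity[] = [
  { id: 'line-4', kind: 'asset' },
  { id: 'packer', kind: 'asset' },
  { id: 'press-3-bearing', kind: 'part' },
  { id: 'bin-B17', kind: 'part' },
]

const relations: Relation[] = [
  { kind: 'feeds', from: 'line-4', to: 'packer' },
  { kind: 'depends-on', from: 'packer', to: 'line-4' },
  { kind: 'is-spare-for', from: 'bin-B17', to: 'press-3-bearing' },
]

// Look up every known entity recorded as a spare for the given part.
function sparesFor(partId: string, all: Entity[], rels: Relation[]): Entity[] {
  const spareIds = rels
    .filter((r) => r.kind === 'is-spare-for' && r.to === partId)
    .map((r) => r.from)
  return all.filter((e) => spareIds.indexOf(e.id) !== -1)
}

const spares = sparesFor('press-3-bearing', entities, relations)
```

Keeping relations as plain edges means a question like "what can replace this bearing?" is a lookup rather than something the model has to recall.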

What streams in

Observations are the signals the line emits as it runs:

  • sensor and telemetry streams: bearing temperature, vibration spectra, cycle throughput
  • work orders and PM (preventive maintenance) records
  • inventory and ERP movements
  • inspection logs and incident reports

Every reading folds in as evidence about the line's condition. A fresh vibration sample sharpens the wear claim; a two-day-old ERP count ages down on its decay schedule until someone confirms it on the floor.
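The page does not spell out the decay schedule itself, so the following is a sketch under a stated assumption: confidence relaxes exponentially toward 0.5 (no information either way on a yes/no claim) with a half-life you pick per source. The function name and parameters are hypothetical.

```typescript
// Assumption: exponential decay toward 0.5, the "no information" point for a
// yes/no claim. The engine's real schedule may differ.
function agedConfidence(
  confidence: number,    // confidence when the evidence was fresh, 0..1
  ageHours: number,      // time since the evidence was recorded
  halfLifeHours: number, // hours for the excess over 0.5 to halve
): number {
  if (halfLifeHours <= 0) {
    throw new RangeError('halfLifeHours must be positive')
  }
  // Fraction of the original excess confidence that survives at this age.
  const remaining = Math.pow(0.5, Math.max(0, ageHours) / halfLifeHours)
  return 0.5 + (confidence - 0.5) * remaining
}

// A count recorded at 0.9 confidence, read two days later with a 48 h half-life.
const twoDayOldCount = agedConfidence(0.9, 48, 48)
```

Under these assumed numbers the two-day-old count reads about 0.7. A confirming floor scan would replace it with fresh evidence rather than patch the old value.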

The belief state

The state of the line is a set of claims. Each one carries an explicit posterior, its provenance, and a decay schedule, and sits alongside the gaps the agent knows are still open.

┌──────────────────────────────────────────────────────────────┐
│  LINE STATE - Press #3                                       │
│                                                              │
│  ● "Bearing temp within spec"           90% │ sensor (live)  │
│  ● "Vibration trending up over 7 days"  77% │ telemetry      │
│    └─ matches pre-failure signature of WO-2231               │
│  ● "Spare bearing in stock"             68% │ ERP, 2d old    │
│  ● "Last PM completed on schedule"      95% │ work order     │
│                                                              │
│  Gap: "No vibration baseline since the Feb rebuild"          │
│  Contradiction: ERP shows 1 spare; floor count says 0        │
└──────────────────────────────────────────────────────────────┘

Vibration climbing into a known pre-failure signature is a belief the agent can act on before the bearing seizes. The agent is 77% sure, not certain, so it knows to confirm rather than halt. And the ERP-versus-floor disagreement shows up here, on the screen, instead of at the moment a tech reaches for a spare that isn't on the shelf.
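The panel above maps naturally onto a small data structure. The sketch below mirrors it with hypothetical field names (`posterior`, `provenance`, `gaps`, `contradictions`) and an assumed threshold rule for "confirm rather than halt". Neither the shapes nor the 0.9 cutoff come from the SDK.

```typescript
// Hypothetical shapes; field names are assumptions for illustration.
interface Claim {
  statement: string
  posterior: number  // probability the claim is true, 0..1
  provenance: string // where the evidence came from
}

interface LineState {
  asset: string
  claims: Claim[]
  gaps: string[]
  contradictions: string[]
}

const press3: LineState = {
  asset: 'Press #3',
  claims: [
    { statement: 'Bearing temp within spec', posterior: 0.9, provenance: 'sensor (live)' },
    { statement: 'Vibration trending up over 7 days', posterior: 0.77, provenance: 'telemetry' },
    { statement: 'Spare bearing in stock', posterior: 0.68, provenance: 'ERP, 2d old' },
    { statement: 'Last PM completed on schedule', posterior: 0.95, provenance: 'work order' },
  ],
  gaps: ['No vibration baseline since the Feb rebuild'],
  contradictions: ['ERP shows 1 spare; floor count says 0'],
}

type Posture = 'act' | 'confirm-first'

// Assumed rule: treat a claim as settled only at or above a threshold you choose.
function postureFor(claim: Claim, settledAt: number = 0.9): Posture {
  return claim.posterior >= settledAt ? 'act' : 'confirm-first'
}

const vibration = press3.claims[1]
const vibrationPosture = postureFor(vibration)
```

Because the posterior is a field rather than a phrase in a prompt, the confirm-or-act decision becomes a comparison you can audit.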

What we're after

The intent is simple to state: keep the line running within safety and spec. Success means uptime stays inside its limits, no safety or regulatory threshold gets breached, and maintenance lands before a part fails rather than after. The limiting factor is whatever unknown about the line's state is most likely to change the next call, which on Press #3 right now is the spare count, not the bearing temperature. This matters because the engine ranks moves against this intent, not against raw uncertainty. A gap the agent could close but that wouldn't change a decision sits below a smaller gap that would.

  • Goal: Keep the line running within safety and spec.
  • Success: Uptime inside limits, no safety or regulatory breach, maintenance done before failure.
  • Limiting factor: The most decision-relevant unknown about the line's state.
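To make "ranked against the intent, not against raw uncertainty" concrete, here is a minimal sketch. The scoring rule (uncertainty multiplied by the chance that closing the gap changes the next call) is an assumption chosen for illustration, not the engine's actual ranking. The gap names and numbers are made up.

```typescript
// Hypothetical shape; the engine's own ranking inputs are not documented here.
interface Gap {
  name: string
  uncertainty: number    // how unsure the agent is, 0..1
  decisionImpact: number // chance that closing the gap changes the next call, 0..1
}

// Assumed score: expected change to the decision from closing the gap.
function score(gap: Gap): number {
  return gap.uncertainty * gap.decisionImpact
}

// Return a new array, highest score first; the input is left untouched.
function rankGaps(gaps: Gap[]): Gap[] {
  return gaps.slice().sort((a, b) => score(b) - score(a))
}

const ranked = rankGaps([
  // Larger unknown, but closing it would rarely change this shift's call.
  { name: 'vibration-baseline', uncertainty: 0.6, decisionImpact: 0.1 },
  // Smaller unknown, but the call hinges on it.
  { name: 'spare-count', uncertainty: 0.32, decisionImpact: 0.9 },
])
```

With these illustrative numbers the spare count ranks first even though the baseline is the bigger unknown, which is the ordering the section describes.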

The policy

Policy is where you tell the agent what it must never trade away for throughput:

| Policy bucket | On a production line |
| --- | --- |
| Invariants | Safety limits and lockout/tagout procedures hold without exception. Stay inside regulatory and environmental thresholds. |
| Source hierarchy | A live sensor weighs heavier than a manual log entry; a completed work order weighs heavier than a verbal "it's done." |
| Conventions | How maintenance windows get scheduled, how anomalies get reported. |
| Avoid | Deferring a flagged safety condition; running past a maintenance interval to make a quota. |
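The source hierarchy can be held as data in your own agent code. The sketch below encodes only the two orderings the table states and leaves every other pair undecided, so an unranked conflict comes back as `null` for a human to resolve. The `Source` names and the `preferred` helper are hypothetical.

```typescript
// Hypothetical source labels for illustration.
type Source = 'live-sensor' | 'manual-log' | 'completed-work-order' | 'verbal'

interface Report<T> {
  source: Source
  value: T
}

// Only the two orderings the policy states; every other pair is left undecided.
const OUTRANKS: Record<Source, Source[]> = {
  'live-sensor': ['manual-log'],
  'completed-work-order': ['verbal'],
  'manual-log': [],
  'verbal': [],
}

// Return the report the hierarchy prefers, or null when the policy is silent.
function preferred<T>(a: Report<T>, b: Report<T>): Report<T> | null {
  if (OUTRANKS[a.source].indexOf(b.source) !== -1) return a
  if (OUTRANKS[b.source].indexOf(a.source) !== -1) return b
  return null
}

const signedOff: Report<string> = { source: 'completed-work-order', value: 'PM complete' }
const hearsay: Report<string> = { source: 'verbal', value: "it's done" }
const winner = preferred(signedOff, hearsay)
```

Returning `null` instead of guessing keeps an unranked disagreement visible as a contradiction rather than silently picking a side.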

The actions

Actions here reach the physical world, so they're banded by what they can move:

| Action | Safety class | Effect |
| --- | --- | --- |
| flag-anomaly, report | info | Read-only. Raises a signal; touches nothing physical. |
| schedule-maintenance, reorder-part | mutates | Moves the plan or the inventory. |
| halt-line, override-interlock | needs-approval | Gated on a qualified human. |

These bands are a modeling lens you encode today, not a rule the engine enforces for you. You express the invariants, the source hierarchy, and these safety classes through how the agent observes and how it chooses to act. thinkⁿ does not yet take a registered policy and refuse the calls that violate it. Read the two tables above as the contract your agent is responsible for keeping. A declarative config surface that makes the engine the enforcer is on the roadmap. Until then the division of labor is clear: the engine supplies the picture and the audit trail, and a qualified human stays on the loop for anything that moves the physical world.
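Since the engine does not enforce these bands yet, the guard has to live in your agent. A minimal sketch of such an agent-side check follows. The `mayDispatch` helper and the `Approval` shape are hypothetical, and treating unknown actions as `needs-approval` is an assumed fail-closed default.

```typescript
type SafetyClass = 'info' | 'mutates' | 'needs-approval'

// The action-to-class mapping from the table above, kept in agent code.
const SAFETY_CLASS: Record<string, SafetyClass> = {
  'flag-anomaly': 'info',
  'report': 'info',
  'schedule-maintenance': 'mutates',
  'reorder-part': 'mutates',
  'halt-line': 'needs-approval',
  'override-interlock': 'needs-approval',
}

// Hypothetical record of a human sign-off.
interface Approval {
  approvedBy: string
}

// Agent-side guard. Unknown actions fail closed: they are treated as
// needs-approval rather than allowed through.
function mayDispatch(action: string, approval?: Approval): boolean {
  const known = Object.prototype.hasOwnProperty.call(SAFETY_CLASS, action)
  const cls: SafetyClass = known ? SAFETY_CLASS[action] : 'needs-approval'
  if (cls === 'needs-approval') {
    return approval !== undefined && approval.approvedBy.trim() !== ''
  }
  return true
}
```

Calling `mayDispatch('halt-line')` without an approval returns `false`, so the halt never leaves the agent until a named human has signed off.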

Plan & act

import Beliefs from 'beliefs'

const beliefs = new Beliefs({
  apiKey: process.env.BELIEFS_KEY,
  agent: 'ops-agent',
  namespace: 'line:press-3',
  writeScope: 'space',
})

// 1. Orient: read the current picture; the ranked next moves ride back on it
const context = await beliefs.before('Decide whether Press #3 needs intervention this shift')
const [move] = context.moves   // top-ranked, already in hand; no extra call
// → { action: 'gather_evidence', subType: 'confirm-spare-count', valueOfInformation: 0.8, ... }

// 2. Act: your agent dispatches the tool the move points to
const count = await runTool(move.subType)   // 'confirm-spare-count' → your WMS bin scan

// 3. Fold the result back in; it becomes the next observation
await beliefs.after(JSON.stringify(count), { tool: 'wms', source: 'floor-scan' })

The verdict is the product. A memory store would hand the agent back the last vibration reading and the last spare count and leave the call to the model. thinkⁿ judges the call: Press #3 needs eyes this shift, the spare count is the thing to confirm first, and here is the confidence on each. Because every belief behind that call traces to the sensor reading or work order it came from, a supervisor can see why before approving. A halt-line recommendation never reaches them without the same evidence underneath it.
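The loop above calls a `runTool` helper that your agent supplies. One possible shape for it is a plain registry keyed by the move's `subType`. Everything here is a hypothetical sketch: the `ToolResult` shape and the stubbed bin-scan payload are placeholders, and a real handler would likely be async and call your WMS.

```typescript
// Hypothetical result shape; use whatever your tools actually return.
interface ToolResult {
  tool: string
  payload: unknown
}

// Kept synchronous for brevity; `await runTool(...)` still works on a plain value.
type ToolHandler = () => ToolResult

const registry: Record<string, ToolHandler> = {
  // Stubbed data standing in for a real floor scan of bin B17.
  'confirm-spare-count': () => ({ tool: 'wms', payload: { bin: 'B17', count: 0 } }),
}

// Dispatch the tool a move points to; refuse subtypes nobody registered.
function runTool(subType: string): ToolResult {
  if (!Object.prototype.hasOwnProperty.call(registry, subType)) {
    throw new Error(`No tool registered for move subtype "${subType}"`)
  }
  return registry[subType]()
}

const scan = runTool('confirm-spare-count')
```

Throwing on an unregistered subtype keeps a new kind of move from being silently ignored, and the error surfaces in the agent's own logs.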

Emerging: plan several moves ahead

As a line builds history, the agent can forecast how an intervention plays out over the next few steps, for example with beliefs.forecast.predict(['schedule-pm', 'run-to-failure', 'reorder-now']). The forecast reports low confidence honestly until it has seen enough action→outcome history to calibrate.
