36 demo/benchmark pairs covering AI, agents, distributed systems, security, and more. Each feature has a -demo command (interactive demonstration) and a -bench command (performance benchmark).
## Running Demos

```
tri <name>-demo    # interactive demonstration
tri <name>-bench   # performance benchmark
```
## Needle Check (Koschei's Immortality Test)

Every benchmark uses the Needle Check — a quality gate based on the golden ratio:

```
threshold = phi^-1 = 0.6180339887...
```
The benchmark computes an improvement rate and compares it to the threshold:
| Status | Condition | Message |
|---|---|---|
| PASS (Immortal) | rate > 0.618 | KOSCHEI BESSMERTEN! Igla ostra. (Koschei is immortal! The needle is sharp.) |
| WARN (Mortal) | 0 < rate ≤ 0.618 | Uluchshenie est', no Igla tupitsya. (Improvement exists, but the needle dulls.) |
| FAIL (Regression) | rate ≤ 0 | REGRESSIYA! Igla slomana. (Regression! The needle is broken.) |
Each benchmark calculates the improvement rate as a weighted composite of scenario metrics:
```
improvement_rate = (metric_1 + metric_2 + baseline_contribution) / 2.0
```
Where metrics are scenario-specific (hit rate, similarity score, agent count, throughput, etc.) and normalized to the [0,1] range.
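The composite and the gate can be sketched in Python. This is an illustrative reconstruction from the formula and table above, not code from the tri source; the function names and the zero `baseline_contribution` in the usage line are assumptions.

```python
# Hypothetical sketch of the Needle Check described above.
PHI_INV = 0.6180339887  # golden-ratio threshold, phi^-1 = phi - 1


def improvement_rate(metric_1: float, metric_2: float,
                     baseline_contribution: float) -> float:
    """Weighted composite of scenario metrics, each normalized to [0, 1]."""
    return (metric_1 + metric_2 + baseline_contribution) / 2.0


def needle_check(rate: float) -> str:
    """Classify a benchmark's improvement rate against the phi^-1 threshold."""
    if rate > PHI_INV:
        return "PASS"  # Koschei is immortal! Needle is sharp.
    if rate > 0.0:
        return "WARN"  # Improvement exists, but needle dulls.
    return "FAIL"      # Regression! Needle is broken.


# Scenario 1 metrics from the example output below (baseline assumed 0 here):
rate = improvement_rate(0.85, 0.92, 0.0)
print(needle_check(rate))  # prints PASS
```

Note that a rate above 1.0 is possible when the baseline contribution is nonzero, which is why each metric is normalized before being combined.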
## Example Benchmark Output

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Scenario 1: Basic query
  Hit rate: 0.85
  Similarity: 0.92
  Improvement: 0.887

Scenario 2: Complex reasoning
  Hit rate: 0.72
  Similarity: 0.88
  Improvement: 0.800

Needle Check (phi^-1 = 0.618):
  Average rate: 0.843
  Status: KOSCHEI IS IMMORTAL! Needle is sharp.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
## Pre-Cycle Demos

Early features, implemented before the numbered cycle system.
| Command | Aliases | Description |
|---|---|---|
| tri tvc-demo | tvc | TVC Distributed Chat — 10,000-entry ternary vector corpus |
| tri tvc-stats | — | TVC corpus configuration and status |
| tri agents-demo | agents | Multi-Agent Coordination — Coordinator, Coder, Chat, Reasoner, Researcher |
| tri agents-bench | — | Multi-Agent System Benchmark — 10 task scenarios |
| tri context-demo | context | Long Context Engine — sliding window (20 messages) + summarization |
| tri context-bench | — | Long Context Benchmark — 24-turn conversation simulation |
| tri rag-demo | rag | RAG — Query → Embed → Retrieve → Augment → Generate |
| tri rag-bench | — | RAG Retrieval Benchmark — similarity scoring |
| tri voice-demo | voice, mic | Voice I/O (TTS + STT) integration |
| tri voice-bench | mic-bench | Voice I/O performance benchmark |
| tri sandbox-demo | sandbox | Code Execution Sandbox — safe execution with 5s timeout |
| tri sandbox-bench | — | Sandbox Benchmark — Zig, Python, JS, Shell execution |
| tri stream-demo | stream, pipeline | Streaming Output — token-by-token with 256-token buffer |
| tri stream-bench | pipeline-bench | Streaming Benchmark — char/token/chunk/SSE modes |
| tri finetune-demo | finetune | Fine-Tuning Engine — local model adaptation |
| tri finetune-bench | — | Fine-Tuning Benchmark — learning-rate convergence |
| tri batched-demo | batched | Batched Work-Stealing — parallel batch scheduler |
| tri batched-bench | — | Batched Stealing Benchmark — throughput/latency |
| tri priority-demo | priority | Priority Queue — task scheduling |
| tri priority-bench | — | Priority Queue Benchmark |
| tri deadline-demo | deadline | Deadline Scheduling — SLA enforcement |
| tri deadline-bench | — | Deadline Scheduling Benchmark |
## Cycle 20: Vision

| Command | Aliases | Description |
|---|---|---|
| tri vision-demo | vision, eye | Local Vision — image → ternary embedding → scene detection → caption |
| tri vision-bench | eye-bench | Vision Benchmark — 80 COCO semantic categories |
## Cycle 26: Multi-Modal

| Command | Aliases | Description |
|---|---|---|
| tri multimodal-demo | multimodal, mm | Multi-Modal Unified Engine — text + vision + voice + code → VSA |
| tri multimodal-bench | mm-bench | Multi-Modal Benchmark — cross-modal fusion |

## Cycle 27: Tool Use

| Command | Aliases | Description |
|---|---|---|
| tri tooluse-demo | tooluse, tools | Multi-Modal Tool Use — function calling with multi-modal inputs |
| tri tooluse-bench | tools-bench | Tool Use Benchmark — invocation and result handling |
## Cycles 30-33: Unified Agents

| Command | Aliases | Cycle | Description |
|---|---|---|---|
| tri unified-demo | unified, agent | 30 | Unified Multi-Modal Agent — all modalities in a single agent |
| tri unified-bench | agent-bench | 30 | Unified Agent Benchmark |
| tri autonomous-demo | auto, autonomous | 31 | Autonomous Agent — self-directed with goal planning |
| tri autonomous-bench | autonomous-bench | 31 | Autonomous Agent Benchmark |
| tri orchestration-demo | orch, orchestrate | 32 | Multi-Agent Orchestration — task distribution |
| tri orchestration-bench | orchestrate-bench | 32 | Orchestration Benchmark |
| tri mm-orch-demo | mmo, mm-orch | 33 | MM Multi-Agent Orchestration — multi-modal coordination |
| tri mm-orch-bench | mm-orch-bench | 33 | MM Orchestration Benchmark |
## Cycles 34-37: Memory & Distribution

| Command | Aliases | Cycle | Description |
|---|---|---|---|
| tri memory-demo | memory, mem | 34 | Agent Memory & Cross-Modal Learning |
| tri memory-bench | mem-bench | 34 | Memory Benchmark |
| tri persist-demo | persist, save | 35 | Persistent Memory & Disk Serialization |
| tri persist-bench | save-bench | 35 | Persistence Benchmark — I/O performance |
| tri spawn-demo | spawn, pool | 36 | Dynamic Agent Spawning & Load Balancing |
| tri spawn-bench | pool-bench | 36 | Spawn Benchmark — spawn latency |
| tri cluster-demo | cluster, nodes | 37 | Distributed Multi-Node Agents |
| tri cluster-bench | nodes-bench | 37 | Cluster Benchmark — network latency/throughput |
## Cycles 39-45: Scheduling & Plugins

| Command | Aliases | Cycle | Description |
|---|---|---|---|
| tri worksteal-demo | worksteal, steal | 39 | Adaptive Work-Stealing Scheduler |
| tri worksteal-bench | steal-bench | 39 | Work-Stealing Benchmark |
| tri plugin-demo | plugin, ext | 40 | Plugin & Extension System |
| tri plugin-bench | ext-bench | 40 | Plugin Benchmark |
| tri comms-demo | comms, msg | 41 | Agent Communication Protocol |
| tri comms-bench | msg-bench | 41 | Communication Benchmark — message throughput |
| tri observe-demo | observe, otel | 42 | Observability & Tracing System |
| tri observe-bench | otel-bench | 42 | Observability Benchmark — tracing overhead |
| tri consensus-demo | consensus, raft | 43 | Consensus & Coordination Protocol (Byzantine-tolerant) |
| tri consensus-bench | raft-bench | 43 | Consensus Benchmark |
| tri specexec-demo | specexec, spec | 44 | Speculative Execution Engine |
| tri specexec-bench | spec-bench | 44 | Speculative Execution Benchmark |
| tri governor-demo | governor, gov | 45 | Adaptive Resource Governor |
| tri governor-bench | gov-bench | 45 | Resource Governor Benchmark |
## Cycles 46-52: Advanced Systems

| Command | Aliases | Cycle | Description |
|---|---|---|---|
| tri fedlearn-demo | fedlearn, fl | 46 | Federated Learning Protocol |
| tri fedlearn-bench | fl-bench | 46 | Federated Learning Benchmark — convergence |
| tri eventsrc-demo | eventsrc, es | 47 | Event Sourcing & CQRS Engine |
| tri eventsrc-bench | es-bench | 47 | Event Sourcing Benchmark — event throughput |
| tri capsec-demo | capsec, sec | 48 | Capability-Based Security Model |
| tri capsec-bench | sec-bench | 48 | Security Benchmark — authorization overhead |
| tri dtxn-demo | dtxn, txn | 49 | Distributed Transaction Coordinator (ACID) |
| tri dtxn-bench | txn-bench | 49 | Transaction Benchmark — throughput |
| tri cache-demo | cache, memo | 50 | Adaptive Caching & Memoization |
| tri cache-bench | memo-bench | 50 | Cache Benchmark — hit rates |
| tri contract-demo | contract, sla | 51 | Contract-Based Agent Negotiation |
| tri contract-bench | sla-bench | 51 | Contract Benchmark — negotiation speed |
| tri workflow-demo | workflow, wf | 52 | Temporal Workflow Engine |
| tri workflow-bench | wf-bench | 52 | Workflow Benchmark — latency |
## Summary

| Category | Pairs | Cycles |
|---|---|---|
| Pre-cycle (AI, scheduling) | 11 | — |
| Vision | 1 | 20 |
| Multi-Modal | 1 | 26 |
| Tool Use | 1 | 27 |
| Unified Agents | 4 | 30-33 |
| Memory & Distribution | 4 | 34-37 |
| Scheduling & Plugins | 7 | 39-45 |
| Advanced Systems | 7 | 46-52 |
| Total | 36 (72 commands) | — |
## See Also

- Pipeline — Golden Chain development cycle (where demos are verified)
- TVC Learning — TVC corpus architecture (used in tvc-demo)
- Sacred Constants — the phi^-1 = 0.618 threshold explained