RFC-0003 — Heartbeat schema
Owner: Serving Lead
What it defines
The recurring liveness and capability advertisement that each operator emits.
Cadence
- Off-chain heartbeat (gateway-bound): every 12 s, over WebSocket from worker-control-plane.
- On-chain heartbeat (liveness anchor): once per epoch (360 blocks ≈ 36 min at the current 6 s block time), as an extrinsic to pallet-operator-stake::heartbeat.
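The two cadences relate by simple arithmetic. The sketch below derives the epoch length and the number of off-chain heartbeats per epoch from the figures stated above; the constant and variable names are illustrative and do not come from the RFC.

```python
# Cadence figures taken from the text; names are illustrative only.
OFFCHAIN_INTERVAL_S = 12   # off-chain heartbeat period
BLOCK_TIME_S = 6           # current block time
EPOCH_BLOCKS = 360         # blocks per epoch
SECONDS_PER_DAY = 86_400

epoch_seconds = EPOCH_BLOCKS * BLOCK_TIME_S                   # 2160 s
epoch_minutes = epoch_seconds / 60                            # 36.0 min
offchain_per_epoch = epoch_seconds // OFFCHAIN_INTERVAL_S     # 180
offchain_per_day = SECONDS_PER_DAY // OFFCHAIN_INTERVAL_S     # 7200
onchain_per_day = SECONDS_PER_DAY / epoch_seconds             # 40.0

print(epoch_minutes, offchain_per_epoch, offchain_per_day, onchain_per_day)
```

Under these figures each on-chain liveness anchor covers 180 off-chain heartbeats. If the block time changes, the epoch duration and that ratio change with it.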
Fields
- Top-level: operator_id, capabilities[], current_load, kv_cache_pressure, last_completed_job_id, attestation_freshness, watchdog_state, version, signature.
- Per-capability: base_model_id, adapter_ids[], quantization, max_context, max_concurrent, deterministic_mode.
- Per-load: active, queue_depth, p50/p99 TTFT/ITL, gpu_memory_used, gpu_util.
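As a rough illustration of how these fields nest, here is a minimal sketch using Python dataclasses. The field names follow the list above. Everything else is an assumption made for the example: the Python types, the units, the split of "p50/p99 TTFT/ITL" into four *_ms fields, and the JSON encoding. The RFC summary does not specify a wire format or signature scheme.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Capability:
    base_model_id: str
    adapter_ids: List[str]
    quantization: str          # assumed to be a label such as "fp8"
    max_context: int
    max_concurrent: int
    deterministic_mode: bool


@dataclass
class Load:
    active: int
    queue_depth: int
    ttft_p50_ms: float         # assumed names and units for p50/p99 TTFT/ITL
    ttft_p99_ms: float
    itl_p50_ms: float
    itl_p99_ms: float
    gpu_memory_used: int       # assumed to be bytes
    gpu_util: float            # assumed to be a fraction in [0, 1]


@dataclass
class Heartbeat:
    operator_id: str
    capabilities: List[Capability]
    current_load: Load
    kv_cache_pressure: float
    last_completed_job_id: str
    attestation_freshness: int  # assumed: seconds since last attestation
    watchdog_state: str
    version: str
    signature: str


def encode_heartbeat(hb: Heartbeat) -> bytes:
    """Hypothetical encoding: compact JSON with sorted keys."""
    return json.dumps(asdict(hb), sort_keys=True, separators=(",", ":")).encode()


example = Heartbeat(
    operator_id="op-0001",
    capabilities=[Capability("base-a", ["adapter-1", "adapter-2"], "fp8", 32768, 16, True)],
    current_load=Load(3, 1, 120.0, 450.0, 18.0, 40.0, 40 * 1024**3, 0.62),
    kv_cache_pressure=0.35,
    last_completed_job_id="job-42",
    attestation_freshness=600,
    watchdog_state="ok",
    version="1",
    signature="",
)
payload = encode_heartbeat(example)
print(len(payload), "bytes")
```

An operator advertising many capabilities or adapters produces a larger payload, so the encoded size is worth checking against the per-heartbeat size assumed in the bandwidth budget.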
Bandwidth budget
Worst case: ~84 GiB/day of total catalog traffic (10 KiB × 1000 operators × 100 gateway replicas), aggregated via per-region pub/sub.
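The sketch below shows how a payload size, operator count, and cadence combine into a daily volume. It assumes every operator sends one 10 KiB heartbeat per 12 s off-chain interval, and it treats the replica count as an optional fan-out multiplier. Both are assumptions made for illustration. The derivation of the ~84 GiB/day figure is not given here, and this sketch does not attempt to reproduce it.

```python
KIB = 1024
GIB = 1024 ** 3
SECONDS_PER_DAY = 86_400


def daily_bytes(payload_bytes: int, operators: int, interval_s: int, fanout: int = 1) -> int:
    """Bytes per day if each operator sends one payload per interval,
    delivered to `fanout` independent consumers."""
    beats_per_day = SECONDS_PER_DAY // interval_s
    return payload_bytes * operators * beats_per_day * fanout


# One copy of the full catalog stream (assumed 10 KiB per heartbeat, 12 s cadence).
one_copy_gib = daily_bytes(10 * KIB, 1000, 12) / GIB

# Naive delivery of that stream to each of 100 gateway replicas separately.
naive_fanout_gib = daily_bytes(10 * KIB, 1000, 12, fanout=100) / GIB

print(f"one copy: {one_copy_gib:.1f} GiB/day")          # one copy: 68.7 GiB/day
print(f"naive fan-out: {naive_fanout_gib:.1f} GiB/day")  # naive fan-out: 6866.5 GiB/day
```

Under these assumptions, naive per-replica delivery costs 100 times one copy of the stream, which is the cost that aggregating through per-region pub/sub is meant to avoid.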
Validator-watcher signals
- Sudden capability churn: an operator drops 10+ adapters within 1 h.
- Load anomaly: p99 TTFT spikes 5× while utilization stays flat.
- Geo region change without re-attestation.
- Stale firmware relative to the CRL.
Signals feed IR playbook §8.2 #11.
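The first two signals can be expressed as simple threshold checks over recent heartbeats. This is a minimal sketch of one possible reading, not the watcher's actual logic. The thresholds of 10 adapters, 1 h, and 5× come from the text. The function names, the snapshot format, the baseline inputs, and the tolerance used to decide that utilization is "flat" are all hypothetical.

```python
from typing import List, Set, Tuple

CHURN_WINDOW_S = 3600        # "in 1h"
CHURN_DROP_THRESHOLD = 10    # "drops 10+ adapters"
TTFT_SPIKE_FACTOR = 5.0      # "p99 TTFT spikes 5x"
FLAT_UTIL_TOLERANCE = 0.10   # assumption: util change within 0.10 counts as flat


def capability_churn(snapshots: List[Tuple[float, Set[str]]], now: float) -> bool:
    """snapshots: (timestamp_s, advertised adapter ids), oldest first.
    Flags when at least CHURN_DROP_THRESHOLD adapters seen in the last
    hour are missing from the most recent snapshot."""
    window = [ids for ts, ids in snapshots if now - ts <= CHURN_WINDOW_S]
    if len(window) < 2:
        return False
    seen = set().union(*window[:-1])   # everything advertised earlier in the window
    dropped = seen - window[-1]        # no longer advertised in the latest heartbeat
    return len(dropped) >= CHURN_DROP_THRESHOLD


def load_anomaly(baseline_p99_ttft: float, current_p99_ttft: float,
                 baseline_util: float, current_util: float) -> bool:
    """Flags a p99 TTFT spike that is not explained by a utilization change."""
    if baseline_p99_ttft <= 0:
        return False
    spiked = current_p99_ttft >= TTFT_SPIKE_FACTOR * baseline_p99_ttft
    flat = abs(current_util - baseline_util) <= FLAT_UTIL_TOLERANCE
    return spiked and flat


before = {f"adapter-{i}" for i in range(12)}
after = {"adapter-0", "adapter-1"}
print(capability_churn([(0.0, before), (1800.0, after)], now=3000.0))  # True
print(load_anomaly(100.0, 500.0, 0.60, 0.62))                          # True
print(load_anomaly(100.0, 500.0, 0.30, 0.90))                          # False
```

The geo-region and firmware signals depend on attestation and CRL data that is not described here, so they are left out of the sketch.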