Trigger.dev splits agent durability into context logs + VM snapshots, drops replay

AI Engineer

Eric Allam argues replay-based durable execution breaks down for long-running agents that clone repos and hold in-memory state. Trigger.dev's Firecracker-based implementation uses an append-only context log for code compatibility and VM snapshots for execution state, achieving sub-second snapshots and 200 ms restores at scale.

Arize escapes context window trap with head-tail truncation and sub-agent delegation

AI Engineer

Naive LLM summarization proved too inconsistent, and full truncation broke reasoning. The working fix: keep the first and last 100 tokens and store the middle in a retrievable memory store, and offload data-heavy tasks such as search to sub-agents so the main conversation stays lightweight. A long-session eval (testing turn 11 after 10 loaded turns) caught context bugs before users hit them.
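
A minimal sketch of the head-tail idea, not Arize's actual code. It assumes the input is already a list of tokens and that the memory store is a simple in-process dictionary; `MemoryStore`, `head_tail_truncate`, and the placeholder format are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical in-memory store for truncated middles, keyed by integer id."""

    _items: dict[int, list[str]] = field(default_factory=dict)

    def put(self, tokens: list[str]) -> int:
        key = len(self._items)
        self._items[key] = list(tokens)
        return key

    def get(self, key: int) -> list[str]:
        return self._items[key]


def head_tail_truncate(
    tokens: list[str], store: MemoryStore, keep: int = 100
) -> list[str]:
    """Keep the first and last `keep` tokens; park the middle in `store`.

    The middle is replaced by one placeholder token that carries the id
    needed to retrieve the omitted content later.
    """
    if keep <= 0:
        raise ValueError("keep must be positive")
    if len(tokens) <= 2 * keep:
        return list(tokens)  # short enough, nothing to cut
    middle = tokens[keep:-keep]
    key = store.put(middle)
    placeholder = f"[{len(middle)} tokens omitted; memory id={key}]"
    return tokens[:keep] + [placeholder] + tokens[-keep:]
```

The placeholder keeps the cut visible to the model, so it can ask for the stored middle by id instead of reasoning over silently missing content.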

In case you missed them

Black Forest Labs trains multimodal generators without external encoders using Self Flow

AI Engineer

Self Flow uses dual noise streams (one heavily noised, one lightly noised) to jointly learn generation and representation in a single model, eliminating external vision encoders. It converges faster, fixes anatomy and text artifacts, and generalizes across images, video, audio, and robot action prediction.

KPIs that hit targets still mislead when aggregation bias and incentive effects go unmeasured

Fabric User Group Switzerland

Yannis organizes dashboard failure modes into three buckets—measurement illusions (Simpson's paradox, mix shift, lagging indicators), behavioral traps (Goodhart's Law, Cobra effect, outcome bias), and system/time traps (local optimization, short-term bias)—then proposes a four-question checklist to run before any metric reaches an executive dashboard.
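
To make the aggregation-bias bucket concrete, here is a small worked example of Simpson's paradox driven by mix shift. The counts are invented for illustration and do not come from the talk; the segment names and helper functions are likewise hypothetical.

```python
# Made-up (conversions, visitors) counts per segment for two periods.
before = {"segment_a": (234, 270), "segment_b": (55, 80)}
after = {"segment_a": (81, 87), "segment_b": (192, 263)}


def rate(conversions: int, visitors: int) -> float:
    return conversions / visitors


def overall(period: dict[str, tuple[int, int]]) -> float:
    """Aggregate conversion rate across all segments of one period."""
    total_conversions = sum(c for c, _ in period.values())
    total_visitors = sum(v for _, v in period.values())
    return rate(total_conversions, total_visitors)


# Did each segment's own rate improve from `before` to `after`?
segment_improved = {
    seg: rate(*after[seg]) > rate(*before[seg]) for seg in before
}
# Did the dashboard-level aggregate improve?
aggregate_improved = overall(after) > overall(before)

for seg in before:
    print(f"{seg}: {rate(*before[seg]):.1%} -> {rate(*after[seg]):.1%}")
print(f"overall: {overall(before):.1%} -> {overall(after):.1%}")
```

With these numbers every segment improves while the aggregate falls, because traffic shifted toward the lower-converting segment. A dashboard that shows only the aggregate would report a decline that no individual segment experienced.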