Full-duplex speech-to-speech models lose to cascaded systems on reliability, tooling, and cost

AI Engineer

Zeghidour draws a line between naturalness and production readiness: full-duplex models like Moshi handle interruptions and paralinguistic cues but lack tool use, observability, and scalable economics. Cascaded STT-LLM-TTS pipelines stay dominant; Gradium's on-device TTS (Phonon) targets the cost problem by eliminating API round-trips.

In case you missed them

Databricks: proprietary data, not model choice, is the moat in agentic financial services

Databricks

Jamin Nakai argues frontier models are commoditizing fast, so financial institutions' edge shifts to data strategy—grounding agents in proprietary context via a governed lakehouse. Covers agentic sprawl risks, RBC Capital Markets and Mastercard as real examples, and a hypothetical Enron fraud-detection case to show what context-aware reasoning unlocks.

Black Forest Labs trains multimodal generators without external encoders using Self Flow

AI Engineer

Self Flow uses dual noise streams—one heavily noised, one lightly noised—to jointly learn generation and representation in a single model, eliminating external vision encoders. Converges faster, fixes anatomy and text artifacts, and generalizes across images, video, audio, and robot action prediction.

Stripe proposes UCP and Machine Payments Protocol to give AI agents safe, authorized purchase flows

Stripe Developers

Universal Commerce Protocol replaces HTML scraping with API-driven checkout and cryptographically enforced spending-limit tokens; Machine Payments Protocol revives HTTP 402 for per-request settlement on digital goods and API calls. Both protocols support fiat and crypto, with panelists from Block and Alchemy pressing for open standards over proprietary silos.

AI red teaming can't eliminate prompt injection — only shrink the blast radius

NDC Conferences

Transformer architecture makes prompt injection structurally unavoidable, so NDC's session shifts focus to creative adversarial testing: jailbreaks, context poisoning, crescendo attacks, and adversarial suffixes. Covers Crop Duster and Tapper for AI-powered red teaming, and argues current vendor tooling misses real business risks like agent misbehavior and data exfiltration.