Databricks drops Kubernetes load balancing for client-side power-of-two-choices, cuts fleet 20%

USENIX

Kubernetes balances connections, not requests. With gRPC over HTTP/2, which multiplexes many requests over a few long-lived connections, some pods received orders of magnitude more traffic than others, causing SLO violations. Databricks built an xDS-based endpoint discovery service that lets clients score endpoints on pending requests, latency, and error rate, achieving even distribution and a 20% fleet size reduction with no proxy overhead.
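To make the power-of-two-choices idea concrete, here is a minimal Python sketch. It is not Databricks' implementation: the `Endpoint` fields mirror the three signals named above, but the `score` weights, the addresses, and the `simulate` helper are assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical per-endpoint stats a client might track locally."""
    address: str
    pending: int = 0          # requests currently in flight
    latency_ms: float = 0.0   # smoothed recent latency
    error_rate: float = 0.0   # fraction of recent requests that failed


def score(ep: Endpoint) -> float:
    """Lower is better. The weights are illustrative assumptions."""
    return ep.pending + ep.latency_ms / 100.0 + ep.error_rate * 10.0


def pick(endpoints: list[Endpoint], rng: random.Random) -> Endpoint:
    """Power of two choices: sample two distinct endpoints, keep the better one."""
    if len(endpoints) == 1:
        return endpoints[0]
    a, b = rng.sample(endpoints, 2)
    return a if score(a) <= score(b) else b


def simulate(n_endpoints: int = 10, n_requests: int = 10_000, seed: int = 0) -> list[int]:
    """Send requests that never complete, so pending counts expose the spread."""
    rng = random.Random(seed)
    endpoints = [Endpoint(f"10.0.0.{i}") for i in range(n_endpoints)]
    for _ in range(n_requests):
        pick(endpoints, rng).pending += 1
    return [ep.pending for ep in endpoints]
```

Calling `simulate()` returns the per-endpoint request counts. Because each choice compares only two candidates, the client needs no global view of the fleet, yet the counts stay close to one another.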

Stripe proposes UCP and Machine Payments Protocol to give AI agents safe, authorized purchase flows

Stripe Developers

Universal Commerce Protocol replaces HTML scraping with API-driven checkout and cryptographically enforced spending-limit tokens; Machine Payments Protocol revives HTTP 402 for per-request settlement on digital goods and API calls. Both protocols support fiat and crypto, with panelists from Block and Alchemy pressing for open standards over proprietary silos.
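The per-request settlement idea can be sketched as a 402 challenge followed by a paid retry. This is an illustration only, not the Machine Payments Protocol wire format: the `X-Price-Cents` and `X-Payment-Token` headers, the price, and the in-memory token set are all hypothetical stand-ins, and the budget check is a plain integer comparison rather than a cryptographically enforced spending-limit token.

```python
import secrets
from http import HTTPStatus

PRICE_CENTS = 5  # hypothetical per-request price


def handle(headers: dict[str, str], paid_tokens: set[str]) -> tuple[int, dict[str, str], str]:
    """Toy server: answer 402 with a price quote unless a paid token is presented."""
    token = headers.get("X-Payment-Token")  # hypothetical header name
    if token is None or token not in paid_tokens:
        quote = {"X-Price-Cents": str(PRICE_CENTS)}  # hypothetical header name
        return int(HTTPStatus.PAYMENT_REQUIRED), quote, "payment required"
    paid_tokens.discard(token)  # each token pays for exactly one request
    return int(HTTPStatus.OK), {}, "resource body"


def agent_fetch(budget_cents: int) -> tuple[int, int]:
    """Toy agent: pay the quoted price only if it fits the remaining budget.

    Returns (final HTTP status, remaining budget in cents).
    """
    issued: set[str] = set()  # stands in for a payment provider's ledger
    status, headers, _ = handle({}, issued)
    if status == HTTPStatus.PAYMENT_REQUIRED:
        price = int(headers["X-Price-Cents"])
        if price > budget_cents:
            return status, budget_cents  # refuse: over the spending limit
        token = secrets.token_hex(8)
        issued.add(token)  # stands in for settling the payment
        budget_cents -= price
        status, _, _ = handle({"X-Payment-Token": token}, issued)
    return status, budget_cents
```

With these assumptions, an agent holding 10 cents completes the request and keeps 5, while an agent holding 3 cents stops at the 402 response without spending anything.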

AI red teaming can't eliminate prompt injection — only shrink the blast radius

NDC Conferences

Transformer architecture makes prompt injection structurally unavoidable, so NDC's session shifts focus to creative adversarial testing: jailbreaks, context poisoning, crescendo attacks, and adversarial suffixes. Covers Crop Duster and Tapper for AI-powered red teaming, and argues current vendor tooling misses real business risks like agent misbehavior and data exfiltration.

In case you missed them

Luma unifies text, image, video, and audio in one transformer backbone to add reasoning to generation

Stanford Online

Amit Jain outlines why video encodes 3D geometry through time, making it richer training data than images, then explains how Luma's single shared-latent-space transformer enables multi-turn dialogue and iterative refinement — capabilities absent from diffusion-only or modality-siloed architectures.

Factory runs software projects for 16 days autonomously via serial agents and validation contracts

AI Engineer

Factory's Missions system chains planner, worker, and validator agents serially, avoiding the conflicts that parallel agents create, with a correctness contract defined before coding begins. Workers inherit clean state from predecessors; validation spans linting, type-checking, and live user-testing. Longest production run: 16 days.
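A rough Python sketch of the serial pattern, under stated assumptions: the contract is fixed before any worker runs, each worker receives a copy of its predecessor's validated state, and a failed check halts the chain. Every name here (`Contract`, `run_mission`, `add_file`, `ContractViolation`) is hypothetical and not taken from Factory's system; real validators would invoke linters, type checkers, or user tests rather than simple predicates.

```python
import copy
from dataclasses import dataclass
from typing import Callable

State = dict
Worker = Callable[[State], State]
Check = Callable[[State], bool]


@dataclass(frozen=True)
class Contract:
    """Named checks agreed on before any work starts."""
    checks: dict[str, Check]

    def failures(self, state: State) -> list[str]:
        return [name for name, check in self.checks.items() if not check(state)]


class ContractViolation(Exception):
    """Raised when a worker's output fails one or more contract checks."""


def run_mission(workers: list[Worker], contract: Contract, initial: State) -> State:
    """Run workers one after another, validating after each step."""
    state = copy.deepcopy(initial)
    for i, worker in enumerate(workers):
        # The worker gets its own copy, so a rejected attempt cannot
        # corrupt the last validated state.
        candidate = worker(copy.deepcopy(state))
        failed = contract.failures(candidate)
        if failed:
            raise ContractViolation(f"worker {i} failed checks: {failed}")
        state = candidate
    return state


def add_file(name: str) -> Worker:
    """Toy worker that adds one file name to the project state."""
    def worker(state: State) -> State:
        state["files"].append(name)
        return state
    return worker
```

For example, with a contract whose single check requires every file name to end in `.py`, a chain of `add_file("a.py")` and `add_file("b.py")` succeeds, while a chain that includes `add_file("notes.txt")` raises `ContractViolation` at that step.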