Stanford Online
Maps the generative AI value chain across semiconductors, infrastructure, and applications, showing $350B in new revenue concentrated at Nvidia despite 10x application growth over two years. Covers why near-zero marginal cost breaks down when serving users burns GPU compute, and what conditions—custom ASICs, inference dominance, hyperscaler integration—could reprice the stack.
Stanford Online
Controlled experiments on facts, syllogisms, and encodings show fine-tuned models fail to reverse relations or compose logical chains, while the same models nearly ace both tasks given the data in context. Three mitigations tested: offline data augmentation, episodic retrieval at inference time, and RL-driven regeneration, each trading training cost for inference cost.
In case you missed them
DeepLearning.AI
Marc Brooker maps agent failures into four quadrants by frequency and severity, argues that only the quadrant where errors are low-frequency and low-consequence has a real enterprise TAM, and outlines AWS investments in correct-by-construction frameworks (Hydro, Cedar), automated reasoning, and deterministic agent steering to get there.
AI Engineer
Britain's No. 10 Data Science Team runs a market-rate fellowship recruiting from labs, big tech, and YC founders—never career civil servants—and embeds them directly in departments. Early deployments include an Extract platform built with DeepMind to automate planning applications, with spin-offs now placing engineers inside prisons and scaling across 400K public-sector workers.
AI Engineer
Ash Prabaker and Andrew Wilson detail three failure modes for long-horizon agents (context limits, poor planning, and self-evaluation bias) and show how a GAN-inspired generator-evaluator pattern with Playwright-driven rubric testing enables runs of five to six hours or more. Concrete example: a retro game maker that solo, single-session runs failed to complete.