Front page · Friday, June 12, 2026

autonomous DevOps agents at scale

Linktree's CI optimization agent merges 80+ PRs, cuts build times 60% · GDG Melbourne
TL;DW
CI optimization agent merged 80+ PRs across 20+ services; fastest win reduced one pipeline from 22 to 8 minutes (64% improvement).
Used specialized agents for different tasks: Haiku for pre-screening and review, Sonnet for implementation. Matching model capability to task scope reduced costs and improved accuracy.
Playbooks are markdown files defining trigger conditions and fix recipes, kept small and targeted (e.g., parallelize tests, cache dependencies, retry flaky steps) so that each one produces a focused PR.
Three-agent funnel: a pre-screener identifies violations, an implementation agent builds fixes, and a reviewer agent validates them, narrowing 120 scanned repos to ~40 merged PRs per cycle.
Verification loop automatically fixes CI failures by parsing logs, then validates success by checking build times and confirming no regressions in latest builds.
Created custom MCP wrappers (backend-for-frontend pattern) instead of generic APIs to reduce token overhead when calling Buildkite, which is critical for agent efficiency.
Auto-research technique: log every agent turn, then run another AI over the logs to find repeated failures and inefficiencies. This revealed that the agent was unintentionally using sub-agents, which was fixed after analysis.
AI cost vs. AWS bill: optimizing CI runtime reduces EC2 spend by far more than the agent calls cost, yielding net savings plus higher developer productivity.
Prefer small, targeted PRs with human review over massive changes; all agent work goes through GitHub PR review to maintain transparency and bounded blast radius.
Initially tried one large agent doing everything; unbounded scope caused token explosion and poor results. Specialization fixed both quality and efficiency.
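
The three-agent funnel described in the list above can be sketched roughly as follows. This is a minimal illustration, not Linktree's implementation: the `Candidate` and `Proposal` types, the `run_funnel` function, the stub stages, and the repo and playbook names are all hypothetical, and the model calls are replaced by plain Python callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    repo: str
    playbook: str  # hypothetical: name of the playbook whose trigger matched


@dataclass
class Proposal:
    candidate: Candidate
    diff: str


def run_funnel(
    repos: List[str],
    pre_screen: Callable[[str], Optional[Candidate]],
    implement: Callable[[Candidate], Optional[Proposal]],
    review: Callable[[Proposal], bool],
) -> List[Proposal]:
    """Return the proposals that survive all three stages."""
    approved = []
    for repo in repos:
        candidate = pre_screen(repo)      # cheap stage: does any playbook apply?
        if candidate is None:
            continue
        proposal = implement(candidate)   # expensive stage: build the fix
        if proposal is None:
            continue
        if review(proposal):              # cheap stage: validate before opening a PR
            approved.append(proposal)
    return approved


# Stub stages standing in for real model calls (assumptions for illustration).
def stub_pre_screen(repo: str) -> Optional[Candidate]:
    return Candidate(repo, "cache-dependencies") if repo.endswith("-api") else None


def stub_implement(candidate: Candidate) -> Optional[Proposal]:
    return Proposal(candidate, f"# hypothetical diff for {candidate.repo}")


def stub_review(proposal: Proposal) -> bool:
    return bool(proposal.diff)


approved = run_funnel(
    ["billing-api", "docs-site", "auth-api"],
    stub_pre_screen, stub_implement, stub_review,
)
print([p.candidate.repo for p in approved])  # ['billing-api', 'auth-api']
```

Keeping each stage a separate callable mirrors the point about specialization: the cheap stages filter early, so the expensive implementation stage only runs on repos that already matched a playbook.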

A multi-agent system autonomously identifies pipeline bottlenecks and submits fixes via GitHub PRs, achieving a 50% acceptance rate across 20+ services. Specialized agents handle pre-screening, implementation, and review; switching from a generalist to a specialized architecture cut token costs, and one pipeline dropped from 22 to 8 minutes.
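
As a rough illustration of the verification step, the sketch below computes the build-time reduction and checks recent builds for regressions. The function names, the `(passed, duration)` tuple shape, and the 5% minimum-gain threshold are assumptions made for this example; the talk summary does not specify how the check is implemented. Only the 22-to-8-minute figure comes from the source.

```python
from statistics import median
from typing import List, Tuple


def improvement_pct(before_min: float, after_min: float) -> float:
    """Percentage reduction in build time."""
    if before_min <= 0:
        raise ValueError("before_min must be positive")
    return (before_min - after_min) / before_min * 100


def verify_fix(
    baseline_min: float,
    recent_builds: List[Tuple[bool, float]],  # (passed, duration in minutes)
    min_gain_pct: float = 5.0,                # hypothetical threshold
) -> bool:
    """Accept a fix only if every recent build passed and the median got faster."""
    if not recent_builds or not all(passed for passed, _ in recent_builds):
        return False  # any failing build is treated as a regression
    typical = median(duration for _, duration in recent_builds)
    return improvement_pct(baseline_min, typical) >= min_gain_pct


print(round(improvement_pct(22, 8)))  # 64
print(verify_fix(22, [(True, 8.0), (True, 8.5), (True, 7.5)]))  # True
print(verify_fix(22, [(True, 8.0), (False, 9.0)]))  # False
```

Using the median of several recent builds rather than a single run is a design choice in this sketch to damp noise from one unusually fast or slow build.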