AgentFlow Daily

The AI signal, filtered from the noise.

The agents, models, and labs that actually matter — one sharp email every morning.

Free forever · No hype, no filler · Unsubscribe anytime

Latest from AgentFlow

The Future Is Domain-Specific Agents: Composition Over Inheritance

Justin Schroeder makes the case that the future of AI is not one giant agent stuffed with every tool, but many tiny domain-specific agents composed together. Piling MCP servers and skills onto a single agent is inheritance: it inflates the context window and gives diminishing returns. Composition, spinning up small isolated agents (a Figma agent, a Gmail agent) that talk to a coordinator in plain English, delivers over 80% token efficiency, runs tasks with cheaper small models, and enforces hard capability limits for security. Here is why composition beats inheritance, the real cost and security math, and why 2027 becomes the year of multi-agent orchestration.

Jul 12, 2026
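
The composition idea in the teaser above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: `TinyAgent`, `Coordinator`, the keyword routing, and the tool names are all hypothetical, and the handlers are stand-ins for real model calls. It shows small agents with hard allow-lists behind a coordinator that receives plain-English requests.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class TinyAgent:
    """A small, isolated agent with a hard allow-list of tools (hypothetical)."""
    name: str
    allowed_tools: FrozenSet[str]
    handler: Callable[[str], str]  # stand-in for a call to a small model

    def run(self, tool: str, request: str) -> str:
        # Hard capability limit: refuse anything outside the allow-list.
        if tool not in self.allowed_tools:
            raise PermissionError(f"{self.name} may not use {tool!r}")
        return self.handler(request)


@dataclass
class Coordinator:
    """Routes plain-English requests to the agent registered for a keyword."""
    routes: Dict[str, TinyAgent] = field(default_factory=dict)

    def register(self, keyword: str, agent: TinyAgent) -> None:
        self.routes[keyword] = agent

    def dispatch(self, request: str, tool: str) -> str:
        # Naive keyword match; a real coordinator would likely use a model here.
        for keyword, agent in self.routes.items():
            if keyword in request.lower():
                return agent.run(tool, request)
        raise LookupError("no agent registered for this request")


# Hypothetical agents, each owning exactly one tool.
gmail = TinyAgent("gmail", frozenset({"send_email"}), lambda r: f"[gmail] handled: {r}")
figma = TinyAgent("figma", frozenset({"export_frame"}), lambda r: f"[figma] handled: {r}")

coordinator = Coordinator()
coordinator.register("email", gmail)
coordinator.register("design", figma)

print(coordinator.dispatch("Send an email to the team", "send_email"))
```

Because each agent carries only its own tools, asking the Gmail agent to use `export_frame` raises `PermissionError` instead of silently widening what it can do.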

AI Infrastructure

How LLMs Actually Generate Text: The Full Inference Pipeline

Every token an LLM writes travels the same sequence of stages: tokenize, embed, run the transformer stack, project to logits, sample, verify, detokenize, stream. The part nobody explains well is that inference splits into two very different phases. Prefill reads your whole prompt in parallel and is compute-bound. Decode writes one token at a time and is memory-bound, gated by a KV cache that grows with every token. Here is the whole pipeline in plain language, why decode is the slow part, and how FlashAttention, quantization, and speculative decoding attack the exact bottleneck that makes it slow.

Jul 7, 2026
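
To make the "KV cache that grows with every token" point concrete, here is a rough back-of-the-envelope sketch. The model shape below (32 layers, 32 KV heads, head dimension 128, 16-bit values) is an assumed example, not a figure from the article. The formula counts one key tensor and one value tensor per layer for every cached token.

```python
def kv_cache_bytes(tokens: int, layers: int, kv_heads: int, head_dim: int,
                   bytes_per_value: int = 2) -> int:
    """Approximate KV cache size for one sequence.

    The leading 2 accounts for storing both keys and values in every layer.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


# Assumed example configuration (hypothetical, chosen for illustration only).
LAYERS, KV_HEADS, HEAD_DIM = 32, 32, 128

for context_len in (1, 1024, 4096):
    size_mib = kv_cache_bytes(context_len, LAYERS, KV_HEADS, HEAD_DIM) / 2**20
    print(f"{context_len:>5} tokens -> {size_mib:,.1f} MiB")
```

Under these assumptions each cached token costs 0.5 MiB, so a 4,096-token context holds 2 GiB of cache that every decode step has to read. That is the memory traffic quantization and related techniques aim to shrink.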

AI Models

Claude Sonnet 5: Almost Opus 4.8, at a Third of the Price

Anthropic shipped Claude Sonnet 5 (codename Fennec) on June 30, 2026, and the pitch is simple: near-Opus-4.8 quality at mid-tier prices. It is the most agentic Sonnet yet, ships a 1M-token context window, scores 82.1% on SWE-bench, and lands at $2/$10 per million tokens through August 31. Here is what actually shipped, the real benchmark gap to Opus 4.8, the new tokenizer that quietly changes your bill, the safety gains, and whether you should make it your default driver.

Jun 30, 2026

Everything that matters in AI, straight to your inbox.