Layered mountain ridges receding into dawn mist

Blog

Notes on engineering, AI agents, and what I learn building things.

Reasoning Bench: Testing GPT-5.5 and Opus 4.8 on Short Reasoning Prompts

I ran frontier models through 94 short reasoning prompts to see where they still trip up. The scores mostly tied, but the misses, cost, and speed told the useful story.

Calvin · Jun 3, 2026 · 10 min read

Abstract layered cards and glowing threads flowing into a knowledge archive, representing an AI agent memory stack.

Building an AI Agent Memory Stack, Part 2: From Recall to Knowledge

Retrieval alone is not memory. After fixing pull-based recall for my AI agent, I added the layers that let knowledge stick.

Calvin · Apr 30, 2026 · 8 min read

Abstract translucent brain-like memory structure forming from digital fragments and glowing retrieval lines.

Building Persistent AI Agent Memory: What I Replaced and Why It's Better

I tried building a structured knowledge base for my AI agent. It went stale, became a maintenance burden, and made things worse. Here's what I replaced it with, and why pull-based memory works better.

Mar 25, 2026 · 9 min read

Abstract dark command-center scene with connected AI agent nodes and subtle lobster claw motifs representing an orchestrated AI agent team.

What Building an AI Agent Team Actually Looks Like

A first-person look at what it's really like to build and operate a team of AI agents day to day: what works, what's still broken, and what surprised me.

Calvin · Mar 13, 2026 · 13 min read