Notes from the atelier.
Field-tested writing on agentic AI, MLOps, cloud-native infrastructure, and the unglamorous engineering that keeps systems alive in production. 13 pieces so far.
- 2026.04.18
On retrieval that survives the on-call rotation.
Why 'RAG' is a research toy until you treat ingestion, eval, and observability as first-class concerns. Field-tested patterns from three production deployments.
9 min read →
- 2026.03.02
Why your agent loops — and the structural fixes.
Most agent loops are a symptom of bad memory contracts, not bad prompts. A short taxonomy and the four interventions that actually help.
12 min read →
- 2026.01.17
Harness, in plain English.
What it actually does, when it's worth the friction, and the three places small teams should not adopt it.
6 min read →
- 2025.11.04
The boring parts of MLOps.
Schemas, artifacts, and the audit trail. The unsexy infrastructure that lets a model retraining pipeline survive a quarterly review.
14 min read →
- 2025.09.21
Three diagrams that pay rent.
The architecture, sequence, and topology drawings I redraw on every engagement. Templates included.
4 min read →
- 2025.03.05
Building production-ready Lambda Extensions
Best practices for Lambda Extensions that survive prod: lifecycle, telemetry, failure modes — from six years of serverless.
3 min read →
- 2025.01.12
Building a production clinical AI pipeline on AWS — raw data to deployed model
End-to-end clinical AI pipeline on AWS: ingest, label, train, deploy. The boring infra decisions that decide whether the model ships.
1 min read →
- 2024.02.11
L1, L2 and L3 CDK constructs — and when to use each
A practical mental model for the three CDK construct levels and when each one is the right tool.
9 min read →
- 2023.07.01
Pick the latest S3 prefix and unzip it inside Databricks
Short ops recipe — find the most recent S3 prefix and decompress on the fly inside a Databricks notebook.
5 min read →
- 2023.03.14
SNS · SQS · Step Functions — when to use what
Decision guide for the three AWS messaging/workflow primitives, with realistic failure scenarios.
7 min read →
- 2022.11.14
AWS AppSync with a Lambda Authorizer via CDK v2 nested stacks
Stand up AppSync with a Lambda authorizer using a nested-stack CDK pattern that scales cleanly.
10 min read →
- 2021.10.02
A least-privilege S3 bucket in CDK TypeScript
No wildcards, no surprises — a tight bucket policy recipe in CDK TypeScript.
5 min read →
- 2021.07.19
Server-side pagination with Node.js, Prisma and Postgres
A no-magic walkthrough of cursor-based server-side pagination using Prisma against Postgres.
4 min read →