Blog

Thoughts on AI agent observability, developer tools, and building in public.

2026-04-09 · 9 min read

Detecting AI Hallucinations in Production with Trace Analysis

Hallucinations are the silent killer of AI agent reliability. Most teams only discover them from user complaints. Here's how to use trace analysis to detect hallucinations before they reach your users — with output verification spans, confidence scoring, and retrieval comparison tracing.

2026-04-09 · 9 min read

How to Choose an AI Observability Tool in 2026

Evaluating AI observability tools? Most comparisons list features without helping you decide. Here's a practical buyer's guide: the five criteria that actually matter, a decision matrix by team size, and common mistakes to avoid.

2026-04-09 · 8 min read

How Much Does It Cost to Run AI Agents? A Token Economics Guide

Running AI agents in production costs more than most teams expect. Token costs compound quickly across retries, context overflows, and unnecessary tool calls. Here's how to calculate realistic costs, identify hidden cost patterns, and use tracing to keep your bill predictable.

2026-04-09 · 9 min read

OpenTelemetry for AI Agents: Why Standard APM Falls Short

OpenTelemetry is great at instrumenting web services. But AI agents fail in ways that standard spans and metrics were never designed to capture. Here's what OTEL gets right, five things it misses, and how purpose-built agent observability fills the gaps.

2026-04-09 · 8 min read

5 Metrics Every AI Agent Team Should Track

Most teams monitoring AI agents track the wrong things. Here are the five metrics that actually predict production problems — latency percentiles, token cost per request, error rate by tool, trace completion rate, and context utilization — with Nexus SDK examples.

2026-04-09 · 11 min read

AI Observability Tools Compared: The 2026 Guide

Langfuse, LangSmith, Helicone, Braintrust, Arize Phoenix, AgentOps, or Nexus? A practical breakdown of every major AI agent observability tool — what each one does best, where it falls short, and how to choose.

2026-04-07 · 9 min read

How to Debug AI Agents in Production

AI agents fail in non-obvious ways: tool call errors that cascade silently, context windows that overflow mid-task, loops that spin without terminating. Here's a practical debugging playbook with trace-first strategies and Nexus SDK examples.

2026-04-07 · 6 min read

How to Monitor Your AI Agents in Production

AI agents fail in production in ways that are invisible without observability. Silent retries, cascading tool errors, runaway token usage — here's how to instrument your agents before those failures cost you.
