Observability for Vercel AI SDK: Tracing streamText, generateObject, and AI Agents
The Vercel AI SDK makes it easy to add streamText, generateObject, and multi-step tool calls to Next.js apps — but streaming errors mid-stream, invisible tool call failures, and accumulating token costs are hard to debug without trace visibility. Here's how to instrument Vercel AI SDK apps with Nexus.
The Vercel AI SDK’s three core primitives
The Vercel AI SDK gives you three main generation functions: generateText (one-shot text generation), streamText (streaming generation to the browser), and generateObject (structured JSON output). All three support tool calls, which is where agentic behavior lives.
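The three calls differ mainly in how you consume the result. A minimal sketch (model names and prompts here are illustrative, not from any particular app):

```typescript
import { generateText, streamText, generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

// One-shot: await the full completion.
const { text } = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Summarize our refund policy in one sentence.',
});

// Streaming: the result exposes an async iterable of text deltas.
const stream = streamText({
  model: openai('gpt-4o-mini'),
  prompt: 'Write a short product announcement.',
});
for await (const delta of stream.textStream) {
  process.stdout.write(delta);
}

// Structured output: validated against a Zod schema before it returns.
const { object } = await generateObject({
  model: openai('gpt-4o-mini'),
  schema: z.object({ sentiment: z.enum(['positive', 'negative', 'neutral']) }),
  prompt: 'Classify: "Love the new dashboard!"',
});
```

Note that generateText and generateObject are awaited, while streamText returns immediately and resolves its metadata only once the stream finishes; that asymmetry drives most of the instrumentation decisions below.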
Adding AI features to a Next.js app creates observability challenges that don't exist in traditional web apps:
- Streaming errors are hard to surface: streamText errors can occur mid-stream, after the HTTP response has started, so they never show up in standard server error logs.
- Tool call failures are invisible: when a tool call throws, the SDK retries or surfaces the error as a message rather than a hard failure. Without tracing, you don't know which tool calls failed.
- Token costs accumulate invisibly: streamText usage data is only available after the stream completes, and it's easy to lose track of costs per user or per feature.
- Multi-step tool loops are opaque: when using maxSteps for multi-step tool execution, you can't tell how many steps ran or which step was the bottleneck.
Instrumenting generateText and streamText
Both generateText and streamText return result objects with usage and finishReason. Wrap both in a Nexus trace:
// app/api/chat/route.ts
import { generateText, streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { NexusClient } from 'nexus-sdk';

const nexus = new NexusClient({ apiKey: process.env.NEXUS_API_KEY! });

export async function POST(req: Request) {
  const { messages, userId } = await req.json();

  const trace = await nexus.startTrace({
    agentId: 'nextjs-chat-agent',
    name: `chat: ${messages[messages.length - 1]?.content?.slice(0, 60) ?? 'message'}`,
    status: 'running',
    startedAt: new Date().toISOString(),
    metadata: {
      userId,
      messageCount: messages.length,
      environment: process.env.NODE_ENV ?? 'development',
    },
  });

  const t0 = Date.now();
  try {
    const result = streamText({
      model: openai('gpt-4o'),
      messages,
      tools: { /* your tools */ },
      maxSteps: 5,
      // onFinish fires after the stream completes; it is the only point
      // where usage data is available for a streaming call.
      onFinish: async ({ text, finishReason, usage, steps }) => {
        await nexus.endTrace(trace.traceId, {
          status: finishReason === 'stop' ? 'success' : 'error',
          latencyMs: Date.now() - t0,
          metadata: {
            finishReason,
            promptTokens: usage.promptTokens,
            completionTokens: usage.completionTokens,
            totalTokens: usage.totalTokens,
            steps: steps.length,
          },
        });
      },
    });

    return result.toDataStreamResponse();
  } catch (error) {
    // Only errors thrown before streaming starts land here; mid-stream
    // failures are delivered through the stream itself and never reach
    // this catch, which is why onFinish carries the trace status.
    await nexus.endTrace(trace.traceId, {
      status: 'error',
      latencyMs: Date.now() - t0,
      error: error instanceof Error ? error.message : String(error),
    });
    throw error;
  }
}
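The non-streaming generateText variant is simpler: the call resolves with usage already attached, so the trace can be ended inline rather than in onFinish. A sketch, reusing the nexus client, trace, messages, and t0 from the handler above:

```typescript
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

// Non-streaming: usage and finishReason are on the resolved result.
const result = await generateText({
  model: openai('gpt-4o'),
  messages,
});
await nexus.endTrace(trace.traceId, {
  status: result.finishReason === 'stop' ? 'success' : 'error',
  latencyMs: Date.now() - t0,
  metadata: {
    finishReason: result.finishReason,
    promptTokens: result.usage.promptTokens,
    completionTokens: result.usage.completionTokens,
    totalTokens: result.usage.totalTokens,
  },
});
```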
Tracing generateObject for structured outputs
generateObject is commonly used for extraction and classification tasks. Wrap each call in a trace to track schema validation success rates and per-call latency:
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const classificationSchema = z.object({
  category: z.enum(['billing', 'technical', 'feature-request', 'other']),
  urgency: z.enum(['low', 'medium', 'high']),
  summary: z.string(),
});

async function classifyTicket(content: string, userId: string) {
  const trace = await nexus.startTrace({
    agentId: 'ticket-classifier',
    name: `classify: ${content.slice(0, 60)}`,
    status: 'running',
    startedAt: new Date().toISOString(),
    metadata: { userId, inputLength: content.length },
  });

  const t0 = Date.now();
  try {
    const { object, usage } = await generateObject({
      model: openai('gpt-4o-mini'),
      schema: classificationSchema,
      prompt: `Classify this support ticket: ${content}`,
    });

    await nexus.endTrace(trace.traceId, {
      status: 'success',
      latencyMs: Date.now() - t0,
      metadata: {
        category: object.category,
        urgency: object.urgency,
        promptTokens: usage.promptTokens,
        completionTokens: usage.completionTokens,
      },
    });

    return object;
  } catch (error) {
    await nexus.endTrace(trace.traceId, {
      status: 'error',
      latencyMs: Date.now() - t0,
      error: error instanceof Error ? error.message : String(error),
    });
    throw error;
  }
}
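A useful refinement, assuming AI SDK v4 or later (which exports NoObjectGeneratedError), is to tag schema-validation failures separately from transport errors in the catch block, so trace metadata distinguishes "model returned output that failed the schema" from "request failed":

```typescript
import { NoObjectGeneratedError } from 'ai';

// Replacement body for the catch block in classifyTicket above.
const isSchemaFailure = NoObjectGeneratedError.isInstance(error);
await nexus.endTrace(trace.traceId, {
  status: 'error',
  latencyMs: Date.now() - t0,
  error: error instanceof Error ? error.message : String(error),
  metadata: {
    errorKind: isSchemaFailure ? 'schema-validation' : 'request',
  },
});
throw error;
```

Filtering traces on errorKind then gives you a schema-validation failure rate per model, which is the main signal for deciding whether a schema is too strict for the model you've chosen.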
Tracing multi-step tool calls
When using maxSteps, each tool call triggers another LLM call. Emit a span for each step to see how many iterations ran and where latency accumulated:
// trace and t0 are created with nexus.startTrace(...) before this call,
// exactly as in the route handler above; traceId is trace.traceId.
const result = streamText({
  model: openai('gpt-4o'),
  messages,
  tools: {
    searchWeb: {
      description: 'Search the web',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        const t0 = Date.now(); // per-tool timer, shadows the trace timer
        try {
          const result = await performSearch(query);
          await nexus.addSpan(traceId, {
            name: 'tool:searchWeb',
            status: 'success',
            latencyMs: Date.now() - t0,
            metadata: { query: query.slice(0, 80), resultCount: result.length },
          });
          return result;
        } catch (error) {
          await nexus.addSpan(traceId, {
            name: 'tool:searchWeb',
            status: 'error',
            latencyMs: Date.now() - t0,
            error: error instanceof Error ? error.message : String(error),
          });
          throw error;
        }
      },
    },
  },
  maxSteps: 5,
  onFinish: async ({ steps, usage, finishReason }) => {
    await nexus.endTrace(traceId, {
      status: finishReason === 'stop' ? 'success' : 'error',
      latencyMs: Date.now() - t0,
      metadata: {
        totalSteps: steps.length,
        toolCallCount: steps.filter(s => s.toolCalls?.length).length,
        totalTokens: usage.totalTokens,
        finishReason,
      },
    });
  },
});
Cost visibility per user and feature
The most actionable metadata for Vercel AI SDK apps is cost attribution — knowing which users and which features are consuming the most tokens. Add this to your trace metadata:
onFinish: async ({ usage }) => {
  // Estimate cost based on model pricing
  const COST_PER_1K_INPUT = 0.0025; // gpt-4o input
  const COST_PER_1K_OUTPUT = 0.01; // gpt-4o output
  const estimatedCostUsd =
    (usage.promptTokens / 1000) * COST_PER_1K_INPUT +
    (usage.completionTokens / 1000) * COST_PER_1K_OUTPUT;

  await nexus.endTrace(trace.traceId, {
    status: 'success',
    latencyMs: Date.now() - t0,
    metadata: {
      userId,
      feature: 'chat', // 'chat' | 'summarize' | 'classify'
      model: 'gpt-4o',
      promptTokens: usage.promptTokens,
      completionTokens: usage.completionTokens,
      estimatedCostUsd: Math.round(estimatedCostUsd * 10000) / 10000,
    },
  });
}
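Hard-coding one model's rates breaks as soon as you route between models. A small lookup keeps the estimate in one place; the rates below are illustrative, so verify them against current provider pricing before relying on the numbers:

```typescript
// Per-1K-token USD rates. Illustrative values; check current pricing.
const PRICING: Record<string, { input: number; output: number }> = {
  'gpt-4o': { input: 0.0025, output: 0.01 },
  'gpt-4o-mini': { input: 0.00015, output: 0.0006 },
};

function estimateCostUsd(
  model: string,
  promptTokens: number,
  completionTokens: number,
): number | undefined {
  const rates = PRICING[model];
  if (!rates) return undefined; // unknown model: omit the field, don't guess
  const cost =
    (promptTokens / 1000) * rates.input +
    (completionTokens / 1000) * rates.output;
  return Math.round(cost * 10000) / 10000; // round to 4 decimal places
}
```

Returning undefined for unknown models means the metadata field is simply absent rather than silently wrong, which is easier to catch in a dashboard than a zero.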
Next steps
With Vercel AI SDK traces in Nexus, you can see token costs per user, latency by feature, and error rates per route. The Nexus Pro plan adds webhook alerts for error rate spikes and integrates with Slack — useful for catching streaming errors that never reach your Next.js error boundary.
Sign up for a free Nexus account or read our Vercel AI SDK integration guide.