2026-04-19 · 8 min read

Observability for Vercel AI SDK: Tracing streamText, generateObject, and AI Agents

The Vercel AI SDK makes it easy to add streamText, generateObject, and multi-step tool calls to Next.js apps, but errors that surface mid-stream, silent tool call failures, and accumulating token costs are hard to debug without trace visibility. Here's how to instrument Vercel AI SDK apps with Nexus.

The Vercel AI SDK’s three core primitives

The Vercel AI SDK gives you three main generation functions: generateText (one-shot text generation), streamText (streaming generation to the browser), and generateObject (structured JSON output). All three support tool calls, which is where agentic behavior lives.

Adding AI features to a Next.js app creates observability challenges that don't exist in traditional web apps: errors can surface mid-stream after the response has already started, tool call failures are invisible to the client, and token costs accumulate silently across users and features.

Instrumenting generateText and streamText

Both generateText and streamText return result objects with usage and finishReason. Wrap both in a Nexus trace:

// app/api/chat/route.ts
import { generateText, streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { NexusClient } from 'nexus-sdk';

const nexus = new NexusClient({ apiKey: process.env.NEXUS_API_KEY! });

export async function POST(req: Request) {
  const { messages, userId } = await req.json();

  const trace = await nexus.startTrace({
    agentId: 'nextjs-chat-agent',
    name: `chat: ${messages[messages.length - 1]?.content?.slice(0, 60) ?? 'message'}`,
    status: 'running',
    startedAt: new Date().toISOString(),
    metadata: {
      userId,
      messageCount: messages.length,
      environment: process.env.NODE_ENV ?? 'development',
    },
  });

  const t0 = Date.now();

  try {
    const result = streamText({
      model: openai('gpt-4o'),
      messages,
      tools: { /* your tools */ },
      maxSteps: 5,
      onFinish: async ({ text, finishReason, usage, steps }) => {
        await nexus.endTrace(trace.traceId, {
          status: finishReason === 'stop' ? 'success' : 'error',
          latencyMs: Date.now() - t0,
          metadata: {
            finishReason,
            promptTokens: usage.promptTokens,
            completionTokens: usage.completionTokens,
            totalTokens: usage.totalTokens,
            steps: steps.length,
          },
        });
      },
    });

    return result.toDataStreamResponse();
  } catch (error) {
    // Only setup errors land here; once streaming has started, failures
    // are delivered through the stream rather than this catch block.
    await nexus.endTrace(trace.traceId, {
      status: 'error',
      latencyMs: Date.now() - t0,
      error: error instanceof Error ? error.message : String(error),
    });
    throw error;
  }
}

Tracing generateObject for structured outputs

generateObject is commonly used for extraction and classification tasks. Add spans to track schema validation success rates and per-call latency:

import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const classificationSchema = z.object({
  category: z.enum(['billing', 'technical', 'feature-request', 'other']),
  urgency: z.enum(['low', 'medium', 'high']),
  summary: z.string(),
});

async function classifyTicket(content: string, userId: string) {
  const trace = await nexus.startTrace({
    agentId: 'ticket-classifier',
    name: `classify: ${content.slice(0, 60)}`,
    status: 'running',
    startedAt: new Date().toISOString(),
    metadata: { userId, inputLength: content.length },
  });

  const t0 = Date.now();
  try {
    const { object, usage } = await generateObject({
      model: openai('gpt-4o-mini'),
      schema: classificationSchema,
      prompt: `Classify this support ticket: ${content}`,
    });

    await nexus.endTrace(trace.traceId, {
      status: 'success',
      latencyMs: Date.now() - t0,
      metadata: {
        category: object.category,
        urgency: object.urgency,
        promptTokens: usage.promptTokens,
        completionTokens: usage.completionTokens,
      },
    });

    return object;
  } catch (error) {
    await nexus.endTrace(trace.traceId, {
      status: 'error',
      latencyMs: Date.now() - t0,
      error: error instanceof Error ? error.message : String(error),
    });
    throw error;
  }
}
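If you log raw model outputs alongside traces, the schema validation success rate mentioned above can be computed offline with a small pure helper. This is a sketch; validationSuccessRate is a hypothetical name, and the predicate would typically wrap classificationSchema.safeParse:

```typescript
// Hypothetical helper: share of raw outputs that pass validation.
// `isValid` would typically wrap schema.safeParse(o).success.
function validationSuccessRate(
  outputs: unknown[],
  isValid: (o: unknown) => boolean,
): number {
  if (outputs.length === 0) return 1; // no data: treat as fully valid
  return outputs.filter(isValid).length / outputs.length;
}
```

For the ticket classifier, the call would look like validationSuccessRate(loggedOutputs, (o) => classificationSchema.safeParse(o).success).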

Tracing multi-step tool calls

When using maxSteps, each tool call triggers another LLM call. Emit a span for each step to see how many iterations ran and where latency accumulated:

// `traceId` and `t0` come from the enclosing route handler,
// as in the streamText example above.
const result = streamText({
  model: openai('gpt-4o'),
  messages,
  tools: {
    searchWeb: {
      description: 'Search the web',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        const t0 = Date.now();
        try {
          const result = await performSearch(query);
          await nexus.addSpan(traceId, {
            name: 'tool:searchWeb',
            status: 'success',
            latencyMs: Date.now() - t0,
            metadata: { query: query.slice(0, 80), resultCount: result.length },
          });
          return result;
        } catch (error) {
          await nexus.addSpan(traceId, {
            name: 'tool:searchWeb',
            status: 'error',
            latencyMs: Date.now() - t0,
            error: error instanceof Error ? error.message : String(error),
          });
          throw error;
        }
      },
    },
  },
  maxSteps: 5,
  onFinish: async ({ steps, usage, finishReason }) => {
    await nexus.endTrace(traceId, {
      status: finishReason === 'stop' ? 'success' : 'error',
      latencyMs: Date.now() - t0,
      metadata: {
        totalSteps: steps.length,
        toolCallCount: steps.filter(s => s.toolCalls?.length).length,
        totalTokens: usage.totalTokens,
        finishReason,
      },
    });
  },
});
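The step accounting in onFinish can be factored into a small pure helper so the same summary logic is reused across routes. A sketch, assuming each step exposes an optional toolCalls array along the lines of the AI SDK's step results:

```typescript
interface StepLike {
  toolCalls?: { toolName: string }[];
}

// Summarize a multi-step run: total steps, how many steps made
// tool calls, and which distinct tools were used.
function summarizeSteps(steps: StepLike[]) {
  return {
    totalSteps: steps.length,
    toolCallCount: steps.filter((s) => s.toolCalls?.length).length,
    toolNames: [
      ...new Set(steps.flatMap((s) => (s.toolCalls ?? []).map((c) => c.toolName))),
    ],
  };
}
```

The toolNames list is handy trace metadata when debugging which tools an agent actually reached for.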

Cost visibility per user and feature

The most actionable metadata for Vercel AI SDK apps is cost attribution — knowing which users and which features are consuming the most tokens. Add this to your trace metadata:

// `trace`, `t0`, and `userId` come from the enclosing route handler.
onFinish: async ({ usage }) => {
  // Estimate cost based on model pricing
  const COST_PER_1K_INPUT = 0.0025;   // gpt-4o input
  const COST_PER_1K_OUTPUT = 0.01;    // gpt-4o output

  const estimatedCostUsd =
    (usage.promptTokens / 1000) * COST_PER_1K_INPUT +
    (usage.completionTokens / 1000) * COST_PER_1K_OUTPUT;

  await nexus.endTrace(trace.traceId, {
    status: 'success',
    latencyMs: Date.now() - t0,
    metadata: {
      userId,
      feature: 'chat',          // 'chat' | 'summarize' | 'classify'
      model: 'gpt-4o',
      promptTokens: usage.promptTokens,
      completionTokens: usage.completionTokens,
      estimatedCostUsd: Math.round(estimatedCostUsd * 10000) / 10000,
    },
  });
}
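Hardcoding one model's rates breaks as soon as a second model ships. A small lookup table keeps cost attribution consistent across features; the prices below are approximate and should be checked against current provider pricing:

```typescript
// Approximate USD prices per 1K tokens; verify against current provider pricing.
const MODEL_PRICES: Record<string, { input: number; output: number }> = {
  'gpt-4o': { input: 0.0025, output: 0.01 },
  'gpt-4o-mini': { input: 0.00015, output: 0.0006 },
};

function estimateCostUsd(
  model: string,
  usage: { promptTokens: number; completionTokens: number },
): number {
  const price = MODEL_PRICES[model];
  if (!price) return 0; // unknown model: report zero rather than guess
  const cost =
    (usage.promptTokens / 1000) * price.input +
    (usage.completionTokens / 1000) * price.output;
  return Math.round(cost * 10000) / 10000; // round to 4 decimal places
}
```

With this in place, the onFinish callback only needs the model name and usage object, and new models become a one-line addition to the table.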

Next steps

With Vercel AI SDK traces in Nexus, you can see token costs per user, latency by feature, and error rates per route. The Nexus Pro plan adds webhook alerts for error rate spikes and integrates with Slack — useful for catching streaming errors that never reach your Next.js error boundary.

Sign up for a free Nexus account or read our Vercel AI SDK integration guide.

Trace your Vercel AI SDK app

Free tier, no credit card required. TypeScript SDK, works in Next.js App Router.