Observability for Vercel AI SDK: Tracing streamText, generateObject, and AI Agents
The Vercel AI SDK makes it easy to add streamText, generateObject, and multi-step tool calls to Next.js apps — but streaming errors mid-stream, invisible tool call failures, and accumulating token costs are hard to debug without trace visibility. Here's how to instrument Vercel AI SDK apps with Nexus.
The Vercel AI SDK’s three core primitives
The Vercel AI SDK gives you three main generation functions: generateText (one-shot text generation), streamText (streaming generation to the browser), and generateObject (structured JSON output). All three support tool calls, which is where agentic behavior lives.
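The three calls differ mainly in how you consume the result. A minimal sketch (model names and prompts here are illustrative, not from any particular app):

```typescript
import { generateText, streamText, generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

// One-shot: await the full completion.
const { text } = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Summarize our refund policy in one sentence.',
});

// Streaming: the result exposes an async iterable of text deltas.
const stream = streamText({
  model: openai('gpt-4o-mini'),
  prompt: 'Write a short product announcement.',
});
for await (const delta of stream.textStream) {
  process.stdout.write(delta);
}

// Structured output: validated against a Zod schema before it returns.
const { object } = await generateObject({
  model: openai('gpt-4o-mini'),
  schema: z.object({ sentiment: z.enum(['positive', 'negative', 'neutral']) }),
  prompt: 'Classify: "Love the new dashboard!"',
});
```

Note that generateText and generateObject are awaited, while streamText returns immediately and resolves its metadata only once the stream finishes; that asymmetry drives most of the instrumentation decisions below.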
Adding AI features to a Next.js app creates observability challenges that don't exist in traditional web apps:
- Streaming errors are hard to surface: streamText errors can occur mid-stream, after the HTTP response has started, so they never show up in standard server error logs.
- Tool call failures are invisible: when a tool call throws, the SDK retries or surfaces the error as a message rather than a hard failure. Without tracing, you don't know which tool calls failed.
- Token costs accumulate invisibly: streamText usage data is only available after the stream completes, and it's easy to lose track of costs per user or per feature.
- Multi-step tool loops are opaque: when using maxSteps for multi-step tool execution, you can't tell how many steps ran or which step was the bottleneck.
Instrumenting generateText and streamText
Both generateText and streamText return result objects with usage and finishReason. Wrap both in a Nexus trace:
// app/api/chat/route.ts
import { generateText, streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { NexusClient } from 'nexus-sdk';

const nexus = new NexusClient({ apiKey: process.env.NEXUS_API_KEY! });

export async function POST(req: Request) {
  const { messages, userId } = await req.json();

  const trace = await nexus.startTrace({
    agentId: 'nextjs-chat-agent',
    name: `chat: ${messages[messages.length - 1]?.content?.slice(0, 60) ?? 'message'}`,
    status: 'running',
    startedAt: new Date().toISOString(),
    metadata: {
      userId,
      messageCount: messages.length,
      environment: process.env.NODE_ENV ?? 'development',
    },
  });

  const t0 = Date.now();
  try {
    const result = streamText({
      model: openai('gpt-4o'),
      messages,
      tools: { /* your tools */ },
      maxSteps: 5,
      // onFinish fires after the stream completes; it is the only point
      // where usage data is available for a streaming call.
      onFinish: async ({ text, finishReason, usage, steps }) => {
        await nexus.endTrace(trace.traceId, {
          status: finishReason === 'stop' ? 'success' : 'error',
          latencyMs: Date.now() - t0,
          metadata: {
            finishReason,
            promptTokens: usage.promptTokens,
            completionTokens: usage.completionTokens,
            totalTokens: usage.totalTokens,
            steps: steps.length,
          },
        });
      },
    });

    return result.toDataStreamResponse();
  } catch (error) {
    // Only errors thrown before streaming starts land here; mid-stream
    // failures are delivered through the stream itself and never reach
    // this catch, which is why onFinish carries the trace status.
    await nexus.endTrace(trace.traceId, {
      status: 'error',
      latencyMs: Date.now() - t0,
      error: error instanceof Error ? error.message : String(error),
    });
    throw error;
  }
}
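The non-streaming generateText variant is simpler: the call resolves with usage already attached, so the trace can be ended inline rather than in onFinish. A sketch, reusing the nexus client, trace, messages, and t0 from the handler above:

```typescript
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

// Non-streaming: usage and finishReason are on the resolved result.
const result = await generateText({
  model: openai('gpt-4o'),
  messages,
});
await nexus.endTrace(trace.traceId, {
  status: result.finishReason === 'stop' ? 'success' : 'error',
  latencyMs: Date.now() - t0,
  metadata: {
    finishReason: result.finishReason,
    promptTokens: result.usage.promptTokens,
    completionTokens: result.usage.completionTokens,
    totalTokens: result.usage.totalTokens,
  },
});
```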
Tracing generateObject for structured outputs
generateObject is commonly used for extraction and classification tasks. Wrap each call in a trace to track schema validation success rates and per-call latency:
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const classificationSchema = z.object({
  category: z.enum(['billing', 'technical', 'feature-request', 'other']),
  urgency: z.enum(['low', 'medium', 'high']),
  summary: z.string(),
});

async function classifyTicket(content: string, userId: string) {
  const trace = await nexus.startTrace({
    agentId: 'ticket-classifier',
    name: `classify: ${content.slice(0, 60)}`,
    status: 'running',
    startedAt: new Date().toISOString(),
    metadata: { userId, inputLength: content.length },
  });

  const t0 = Date.now();
  try {
    const { object, usage } = await generateObject({
      model: openai('gpt-4o-mini'),
      schema: classificationSchema,
      prompt: `Classify this support ticket: ${content}`,
    });

    await nexus.endTrace(trace.traceId, {
      status: 'success',
      latencyMs: Date.now() - t0,
      metadata: {
        category: object.category,
        urgency: object.urgency,
        promptTokens: usage.promptTokens,
        completionTokens: usage.completionTokens,
      },
    });

    return object;
  } catch (error) {
    await nexus.endTrace(trace.traceId, {
      status: 'error',
      latencyMs: Date.now() - t0,
      error: error instanceof Error ? error.message : String(error),
    });
    throw error;
  }
}
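A useful refinement, assuming AI SDK v4 or later (which exports NoObjectGeneratedError), is to tag schema-validation failures separately from transport errors in the catch block, so trace metadata distinguishes "model returned output that failed the schema" from "request failed":

```typescript
import { NoObjectGeneratedError } from 'ai';

// Replacement body for the catch block in classifyTicket above.
const isSchemaFailure = NoObjectGeneratedError.isInstance(error);
await nexus.endTrace(trace.traceId, {
  status: 'error',
  latencyMs: Date.now() - t0,
  error: error instanceof Error ? error.message : String(error),
  metadata: {
    errorKind: isSchemaFailure ? 'schema-validation' : 'request',
  },
});
throw error;
```

Filtering traces on errorKind then gives you a schema-validation failure rate per model, which is the main signal for deciding whether a schema is too strict for the model you've chosen.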
Tracing multi-step tool calls
When using maxSteps, each tool call triggers another LLM call. Emit a span for each step to see how many iterations ran and where latency accumulated:
// trace and t0 are created with nexus.startTrace(...) before this call,
// exactly as in the route handler above; traceId is trace.traceId.
const result = streamText({
  model: openai('gpt-4o'),
  messages,
  tools: {
    searchWeb: {
      description: 'Search the web',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        const t0 = Date.now(); // per-tool timer, shadows the trace timer
        try {
          const result = await performSearch(query);
          await nexus.addSpan(traceId, {
            name: 'tool:searchWeb',
            status: 'success',
            latencyMs: Date.now() - t0,
            metadata: { query: query.slice(0, 80), resultCount: result.length },
          });
          return result;
        } catch (error) {
          await nexus.addSpan(traceId, {
            name: 'tool:searchWeb',
            status: 'error',
            latencyMs: Date.now() - t0,
            error: error instanceof Error ? error.message : String(error),
          });
          throw error;
        }
      },
    },
  },
  maxSteps: 5,
  onFinish: async ({ steps, usage, finishReason }) => {
    await nexus.endTrace(traceId, {
      status: finishReason === 'stop' ? 'success' : 'error',
      latencyMs: Date.now() - t0,
      metadata: {
        totalSteps: steps.length,
        toolCallCount: steps.filter(s => s.toolCalls?.length).length,
        totalTokens: usage.totalTokens,
        finishReason,
      },
    });
  },
});
Cost visibility per user and feature
The most actionable metadata for Vercel AI SDK apps is cost attribution — knowing which users and which features are consuming the most tokens. Add this to your trace metadata:
onFinish: async ({ usage }) => {
  // Estimate cost based on model pricing
  const COST_PER_1K_INPUT = 0.0025; // gpt-4o input
  const COST_PER_1K_OUTPUT = 0.01; // gpt-4o output
  const estimatedCostUsd =
    (usage.promptTokens / 1000) * COST_PER_1K_INPUT +
    (usage.completionTokens / 1000) * COST_PER_1K_OUTPUT;

  await nexus.endTrace(trace.traceId, {
    status: 'success',
    latencyMs: Date.now() - t0,
    metadata: {
      userId,
      feature: 'chat', // 'chat' | 'summarize' | 'classify'
      model: 'gpt-4o',
      promptTokens: usage.promptTokens,
      completionTokens: usage.completionTokens,
      estimatedCostUsd: Math.round(estimatedCostUsd * 10000) / 10000,
    },
  });
}
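Hard-coding one model's rates breaks as soon as you route between models. A small lookup keeps the estimate in one place; the rates below are illustrative, so verify them against current provider pricing before relying on the numbers:

```typescript
// Per-1K-token USD rates. Illustrative values; check current pricing.
const PRICING: Record<string, { input: number; output: number }> = {
  'gpt-4o': { input: 0.0025, output: 0.01 },
  'gpt-4o-mini': { input: 0.00015, output: 0.0006 },
};

function estimateCostUsd(
  model: string,
  promptTokens: number,
  completionTokens: number,
): number | undefined {
  const rates = PRICING[model];
  if (!rates) return undefined; // unknown model: omit the field, don't guess
  const cost =
    (promptTokens / 1000) * rates.input +
    (completionTokens / 1000) * rates.output;
  return Math.round(cost * 10000) / 10000; // round to 4 decimal places
}
```

Returning undefined for unknown models means the metadata field is simply absent rather than silently wrong, which is easier to catch in a dashboard than a zero.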
Next steps
With Vercel AI SDK traces in Nexus, you can see token costs per user, latency by feature, and error rates per route. The Nexus Pro plan adds webhook alerts for error rate spikes and integrates with Slack — useful for catching streaming errors that never reach your Next.js error boundary.
Sign up for a free Nexus account or read our Vercel AI SDK integration guide.