Comparison
Nexus vs TruLens for AI Agent Observability
TruLens is an open-source LLM evaluation framework from TruEra that scores your RAG pipeline's groundedness, context relevance, and answer relevance using feedback functions. Nexus is a real-time agent observability platform with live trace timelines, LLM cost attribution, and per-agent health dashboards. Evaluation and observability solve different problems. Here's when each is the right call.
TL;DR
Choose Nexus if…
- You need real-time visibility into live agent runs — spans as they happen
- You want LLM cost tracking, token usage, and latency per trace
- You need to debug agent failures with a full span waterfall
- You want per-agent health dashboards and error rate trends over time
- Your budget tops out at a free tier or a flat $9/mo
Choose TruLens if…
- You need to score RAG pipelines on groundedness, context relevance, and answer relevance
- You want feedback functions that use an LLM judge to evaluate output quality
- You're iterating on retrieval strategies and need offline eval metrics
- You want to compare pipeline configurations head-to-head in a leaderboard
- Offline batch evaluation is more important than live runtime tracing
Feature comparison
| Feature | Nexus | TruLens |
|---|---|---|
| Primary use case | Real-time AI agent observability | Offline RAG evaluation using feedback functions |
| Execution model | ✓ Real-time — spans ingest as they happen | Batch — scores computed after pipeline runs |
| RAG quality metrics | ✗ Not applicable | ✓ Groundedness, context relevance, answer relevance |
| LLM cost tracking | ✓ Per-trace and per-agent cost visibility | ✗ Not a core feature |
| Trace timeline view | ✓ Live span waterfall with timing | Session-level view — not a span waterfall |
| Agent health dashboard | ✓ Per-agent error rates, 7d trends | ✗ No agent-level health concept |
| Feedback functions / LLM judge | ✗ Not applicable | ✓ Built-in LLM-as-judge evaluation pipeline |
| Eval leaderboard | ✗ Not applicable | ✓ Compare pipeline configs head-to-head |
| Infrastructure overhead | None — fully managed SaaS | Self-hosted OSS — runs locally or on your infra |
| Webhook / email alerts | ✓ Included on Pro plan | ✗ Not a core feature |
| TypeScript SDK | ✓ First-class TypeScript support | Python only |
| Setup time | 5 min: one API call to start tracing | pip install, wrap your app, then configure feedback functions |
| Pricing | Free tier + $9/mo Pro (flat rate) | Free (OSS) — TruEra Cloud available separately |
The honest take
TruLens is a genuinely strong open-source framework for RAG pipeline evaluation. If your primary question is “is my retrieval returning relevant chunks, and is my LLM grounding its answers in them?” — TruLens's feedback functions give you exactly that: automated LLM-as-judge scoring for groundedness, context relevance, and answer relevance, plus a leaderboard to compare retrieval strategies side by side.
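To make the feedback-function idea concrete, here is a minimal sketch of the pattern in plain Python. It does not use the TruLens API: `Record`, `overlap_groundedness`, and `leaderboard` are hypothetical names, and the token-overlap scorer is a toy stand-in for the LLM judge a real feedback function would call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Record:
    """One pipeline run: the question, the retrieved context, the generated answer."""
    question: str
    context: str
    answer: str


def _tokens(text: str) -> set[str]:
    # Lowercase words with surrounding punctuation stripped.
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


def overlap_groundedness(record: Record) -> float:
    """Toy judge (assumption): share of answer tokens that also appear in the context."""
    answer = _tokens(record.answer)
    if not answer:
        return 0.0
    return len(answer & _tokens(record.context)) / len(answer)


def leaderboard(
    runs: dict[str, list[Record]],
    feedback: Callable[[Record], float],
) -> list[tuple[str, float]]:
    """Average the feedback score per pipeline config, best config first."""
    scores = {
        name: sum(feedback(r) for r in records) / len(records)
        for name, records in runs.items()
        if records  # skip configs with no recorded runs
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Swapping `overlap_groundedness` for a function that prompts a judge model keeps the rest of the loop unchanged, which is roughly why feedback functions compose well with a leaderboard.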
Nexus is built for a different question: “what is my agent doing right now, and why did it fail?” Where TruLens scores pipeline runs offline after they complete, Nexus ingests structured spans in real time — every LLM call, tool invocation, and retrieval step captured in a nested trace waterfall as it happens. That means you can debug a production failure minutes after it occurs, not hours after a batch eval job finishes.
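As a rough illustration of what capturing nested spans looks like, the sketch below records spans in memory with a context manager. It is not the Nexus SDK: `Tracer`, `Span`, and `render` are hypothetical names, and a real client would send each span to a backend as it closes rather than keep it in a list.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Span:
    name: str
    kind: str  # e.g. "agent", "llm", "tool", "retrieval"
    start: float = 0.0
    end: float = 0.0
    error: Optional[str] = None
    children: list["Span"] = field(default_factory=list)


class Tracer:
    """Hypothetical in-memory tracer that nests spans by call order."""

    def __init__(self) -> None:
        self.roots: list[Span] = []
        self._stack: list[Span] = []

    @contextmanager
    def span(self, name: str, kind: str) -> Iterator[Span]:
        current = Span(name=name, kind=kind, start=time.perf_counter())
        # Attach to the currently open span, or start a new trace.
        parent = self._stack[-1] if self._stack else None
        (parent.children if parent else self.roots).append(current)
        self._stack.append(current)
        try:
            yield current
        except Exception as exc:
            current.error = repr(exc)  # record the failure, then let it propagate
            raise
        finally:
            current.end = time.perf_counter()
            self._stack.pop()


def render(span: Span, depth: int = 0) -> list[str]:
    """Flatten a span tree into indented waterfall rows."""
    status = "ERROR" if span.error else "ok"
    rows = [f"{'  ' * depth}{span.name} [{span.kind}] {status}"]
    for child in span.children:
        rows.extend(render(child, depth + 1))
    return rows


tracer = Tracer()
with tracer.span("answer_question", "agent"):
    with tracer.span("vector_search", "retrieval"):
        pass  # retrieval step would run here
    with tracer.span("generate", "llm"):
        pass  # LLM call would run here
print("\n".join(render(tracer.roots[0])))
```

Because each span is attached to its parent when it opens, a viewer can draw the waterfall while the run is still in progress instead of waiting for the whole trace to finish.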
The key distinction is eval-first vs trace-first. TruLens is designed for the development loop: iterate on your RAG pipeline, run eval, compare scores, repeat. It assumes you have outputs you can score, with or without labels, and that you care about quality metrics over time. Nexus is designed for the operations loop: your agent is running in production, something breaks, and you need span-level visibility to understand exactly what happened, including cost, latency, errors, and which step in the chain failed.
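The operations-loop questions (what did this run cost, how long did it take, which step failed) reduce to simple walks over a span tree. The sketch below assumes a hypothetical `Span` shape and made-up per-token prices; it is not the Nexus data model or any provider's real pricing.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional

# Hypothetical prices in USD per 1,000 tokens (assumption, not real pricing).
PRICE_PER_1K_INPUT = 0.001
PRICE_PER_1K_OUTPUT = 0.002


@dataclass
class Span:
    name: str
    duration_ms: float
    input_tokens: int = 0
    output_tokens: int = 0
    error: Optional[str] = None
    children: list["Span"] = field(default_factory=list)


def walk(span: Span) -> Iterator[Span]:
    """Yield the span and all of its descendants, depth-first."""
    yield span
    for child in span.children:
        yield from walk(child)


def trace_cost(root: Span) -> float:
    """Sum token cost across every span in the trace."""
    return sum(
        s.input_tokens / 1000 * PRICE_PER_1K_INPUT
        + s.output_tokens / 1000 * PRICE_PER_1K_OUTPUT
        for s in walk(root)
    )


def slowest_step(root: Span) -> Span:
    """Return the leaf span with the longest duration."""
    leaves = [s for s in walk(root) if not s.children]
    return max(leaves, key=lambda s: s.duration_ms)


def first_failing_step(span: Span) -> Optional[Span]:
    """Return the deepest errored span, checking children before their parent."""
    for child in span.children:
        found = first_failing_step(child)
        if found is not None:
            return found
    return span if span.error is not None else None
```

Checking children before the parent matters: a failed tool call usually marks its enclosing agent span as failed too, and the deepest errored span is the one worth opening first.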
Teams building production RAG pipelines sometimes use both: TruLens during development to tune retrieval quality, Nexus in production to monitor runtime health. The two tools have minimal overlap — TruLens answers “how good is my pipeline?” and Nexus answers “what is my pipeline doing right now?”
Trace your RAG pipeline in real time
Real-time AI agent observability. Free tier, no credit card required. Start tracing your RAG pipeline in 5 minutes — full span waterfall, LLM cost tracking, and agent health dashboards included.