Tracing Flowise Chatflows: Observability for No-Code AI Agent Workflows
Flowise lets you build AI chatflows visually by connecting LangChain nodes in a drag-and-drop UI — but when a chatflow returns a wrong answer, a custom tool node throws silently, or a production chatflow starts hallucinating, Flowise's built-in logs don't tell you which node failed or why. Here's how to add full trace observability to Flowise chatflows using Nexus.
What Flowise is
Flowise is an open-source drag-and-drop UI for building LangChain-powered AI chatbots and agent workflows. Instead of writing Python or TypeScript directly, you connect nodes in a visual canvas — LLM nodes, memory nodes, retriever nodes, tool nodes — and Flowise handles the LangChain wiring underneath. The result is a chatflow: a reusable AI workflow you can call via a simple REST API.
A typical Flowise chatflow for a RAG-backed customer support bot looks like this:
- Chat Model node — configured with your OpenAI or Anthropic API key and model
- Retriever node — pulls relevant documents from a connected vector store
- Memory node — maintains conversation history across turns
- Tool nodes — custom JavaScript or Python functions the LLM can call
- Conversational Retrieval QA Chain — wires it all together into a chat interface
Flowise is popular for internal tools, customer support bots, and rapid prototyping because you can stand up a working chatflow in minutes without touching LangChain’s API directly. That speed comes with a tradeoff: Flowise’s built-in logging is minimal, and production failures are hard to diagnose.
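Before adding any instrumentation, it helps to see what a bare chatflow call looks like. Here is a minimal sketch, assuming a Flowise instance at a placeholder URL and a placeholder chatflow ID from the admin UI; the endpoint path and the sessionId override follow Flowise's prediction API:

```python
FLOWISE_URL = "http://localhost:3000"  # placeholder: your Flowise instance
CHATFLOW_ID = "your-chatflow-id"       # placeholder: from the Flowise admin UI

def build_prediction_payload(question: str, session_id: str) -> dict:
    """Build the JSON body the prediction endpoint expects; sessionId keeps
    memory nodes scoped to a single conversation."""
    return {"question": question, "overrideConfig": {"sessionId": session_id}}

def ask(question: str, session_id: str) -> str:
    import requests  # deferred import so the payload helper stays dependency-free
    resp = requests.post(
        f"{FLOWISE_URL}/api/v1/prediction/{CHATFLOW_ID}",
        json=build_prediction_payload(question, session_id),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("text", "")  # Flowise returns the answer in "text"
```

Everything in the rest of this guide wraps this one POST with trace instrumentation.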
Observability blind spots in Flowise chatflows
Flowise shows you a request log in its admin UI, but three failure modes are invisible without external instrumentation:
- Silent tool failures: Custom tool nodes that throw a JavaScript exception return an empty string to the LLM instead of surfacing the error. The LLM then invents an answer rather than admitting the tool failed. Without a span recording tool.status: error, these failures are invisible.
- Retrieval quality drift: When your vector store embeddings age or your document corpus changes, retrieval quality drops silently. The chatflow keeps running and returning answers — they're just wrong. You won't catch this from Flowise's logs alone.
- Latency attribution: Flowise returns total request latency but doesn't tell you whether the bottleneck is your retriever, your LLM call, or a slow custom tool. Without per-step timing, you can't optimize the right thing.
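The latency-attribution gap is straightforward to close once per-step spans exist. As a sketch of the analysis side (the span shapes here are illustrative, mirroring the latency_ms metadata used throughout this guide):

```python
def slowest_step(spans: list[dict]) -> dict:
    """Return the span with the highest latency_ms, i.e. the step worth optimizing first."""
    return max(spans, key=lambda s: s.get("latency_ms", 0))

# Illustrative spans for one chatflow call
spans = [
    {"name": "step:retrieval", "latency_ms": 420},
    {"name": "step:llm_call", "latency_ms": 2310},
    {"name": "tool:fetch_order_status", "latency_ms": 180},
]
```

Running slowest_step over these spans would point at the LLM call, not the retriever, as the thing to optimize.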
Tracing Flowise chatflow API calls
The cleanest Flowise instrumentation pattern is a client-side wrapper: instead of calling the Flowise /api/v1/prediction/{chatflowId} endpoint directly from your application, you wrap the call with a Nexus trace. This gives you latency, error rate, and metadata for every chatflow invocation without modifying the Flowise server.
Install the dependencies:
pip install requests nexus-sdk
Here is a complete Python wrapper:
import os
import time

import requests
from nexus_sdk import NexusClient

nexus = NexusClient(api_key=os.environ["NEXUS_API_KEY"])

FLOWISE_URL = os.environ["FLOWISE_URL"]  # e.g. http://localhost:3000
CHATFLOW_ID = os.environ["CHATFLOW_ID"]  # from Flowise admin UI


def ask_chatflow(question: str, session_id: str) -> str:
    """Call a Flowise chatflow with full Nexus trace instrumentation."""
    trace = nexus.start_trace({
        "agent_id": f"flowise-{CHATFLOW_ID}",
        "name": f"chatflow: {question[:60]}",
        "status": "running",
        "started_at": nexus.now(),
        "metadata": {
            "session_id": session_id,
            "question": question[:300],
            "chatflow_id": CHATFLOW_ID,
        },
    })
    trace_id = trace["trace_id"]
    t0 = time.time()
    try:
        response = requests.post(
            f"{FLOWISE_URL}/api/v1/prediction/{CHATFLOW_ID}",
            json={"question": question, "overrideConfig": {"sessionId": session_id}},
            timeout=30,
        )
        response.raise_for_status()
        elapsed_ms = int((time.time() - t0) * 1000)
        result = response.json()
        answer = result.get("text", "")
        nexus.end_trace(trace_id, {
            "status": "success",
            "latency_ms": elapsed_ms,
            "metadata": {
                "answer_length": len(answer),
                "source_documents": len(result.get("sourceDocuments", [])),
            },
        })
        return answer
    except requests.HTTPError as e:
        nexus.end_trace(trace_id, {
            "status": "error",
            "latency_ms": int((time.time() - t0) * 1000),
            "error": f"HTTP {e.response.status_code}: {e.response.text[:200]}",
        })
        raise
    except Exception as e:
        nexus.end_trace(trace_id, {
            "status": "error",
            "latency_ms": int((time.time() - t0) * 1000),
            "error": str(e),
        })
        raise
Every chatflow call now produces a Nexus trace with the question, latency, answer length, and number of source documents retrieved. You can filter by source_documents: 0 to find queries where retrieval returned nothing — a leading indicator of hallucination.
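If you export finished traces for analysis (for example from a Nexus list-traces endpoint; the exact response shape here is an assumption), the zero-retrieval filter is a one-liner over the metadata written above:

```python
def zero_retrieval_traces(traces: list[dict]) -> list[dict]:
    """Keep only traces whose retrieval step returned no source documents."""
    return [
        t for t in traces
        if t.get("metadata", {}).get("source_documents") == 0
    ]

# Illustrative trace records shaped like the metadata written by ask_chatflow
traces = [
    {"trace_id": "t1", "metadata": {"source_documents": 3}},
    {"trace_id": "t2", "metadata": {"source_documents": 0}},
]
```

Running this over a day's traces gives you the zero-retrieval set to spot-check for hallucinated answers.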
Adding spans for individual steps
If your application calls multiple Flowise chatflows in sequence — for example, a router chatflow that classifies intent followed by a specialist chatflow — you can model each step as a span within a parent trace:
# Chatflow IDs from the Flowise admin UI (env var names are examples — use your own)
CLASSIFIER_FLOW_ID = os.environ["CLASSIFIER_FLOW_ID"]
DEFAULT_FLOW_ID = os.environ["DEFAULT_FLOW_ID"]
SPECIALIST_FLOWS = {
    "billing": os.environ["BILLING_FLOW_ID"],
    "technical": os.environ["TECHNICAL_FLOW_ID"],
}


def handle_user_message(message: str, session_id: str) -> str:
    """Route to the right chatflow and record the full pipeline as one trace."""
    trace = nexus.start_trace({
        "agent_id": "flowise-router",
        "name": f"pipeline: {message[:60]}",
        "status": "running",
        "started_at": nexus.now(),
        "metadata": {"session_id": session_id},
    })
    trace_id = trace["trace_id"]
    t0 = time.time()
    try:
        # Step 1: classify intent
        t_classify = time.time()
        classification = call_chatflow(CLASSIFIER_FLOW_ID, message)
        intent = classification.get("text", "general").strip().lower()
        nexus.add_span(trace_id, {
            "name": "step:intent_classification",
            "started_at": nexus.now(),
            "status": "success",
            "latency_ms": int((time.time() - t_classify) * 1000),
            "metadata": {"intent": intent, "chatflow_id": CLASSIFIER_FLOW_ID},
        })

        # Step 2: route to specialist chatflow
        t_specialist = time.time()
        specialist_id = SPECIALIST_FLOWS.get(intent, DEFAULT_FLOW_ID)
        result = call_chatflow(specialist_id, message, session_id)
        answer = result.get("text", "")
        nexus.add_span(trace_id, {
            "name": "step:specialist_response",
            "started_at": nexus.now(),
            "status": "success",
            "latency_ms": int((time.time() - t_specialist) * 1000),
            "metadata": {
                "intent": intent,
                "chatflow_id": specialist_id,
                "answer_length": len(answer),
                "source_documents": len(result.get("sourceDocuments", [])),
            },
        })

        nexus.end_trace(trace_id, {
            "status": "success",
            "latency_ms": int((time.time() - t0) * 1000),
            "metadata": {"intent": intent, "answer_length": len(answer)},
        })
        return answer
    except Exception as e:
        nexus.end_trace(trace_id, {
            "status": "error",
            "latency_ms": int((time.time() - t0) * 1000),
            "error": str(e),
        })
        raise


def call_chatflow(chatflow_id: str, question: str, session_id: str = "") -> dict:
    response = requests.post(
        f"{FLOWISE_URL}/api/v1/prediction/{chatflow_id}",
        json={"question": question, "overrideConfig": {"sessionId": session_id}},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
Tracing custom tool nodes
Flowise lets you define custom tools as JavaScript functions that the LLM can call. These run inside the Flowise server process — not your application — so you can’t wrap them with the Python SDK. Instead, use the Nexus REST API directly from the tool function to add a span to the active trace.
The pattern requires passing the Nexus traceId into Flowise via the overrideConfig field, then reading it inside the tool:
// Flowise Custom Tool: fetch_order_status
// Add this JavaScript in the Flowise "Custom Tool" node
const NEXUS_API_KEY = $env.NEXUS_API_KEY;
const NEXUS_BASE_URL = "https://nexus.keylightdigital.dev";

async function fetchOrderStatus(orderId) {
  const traceId = $vars.nexusTraceId; // passed via overrideConfig.vars
  const t0 = Date.now();
  try {
    // Your actual tool logic
    const response = await fetch(`https://api.yourstore.com/orders/${orderId}`, {
      headers: { Authorization: `Bearer ${$env.STORE_API_KEY}` },
    });
    if (!response.ok) {
      throw new Error(`Order API returned ${response.status}`);
    }
    const order = await response.json();
    const latencyMs = Date.now() - t0;

    // Record the tool call as a Nexus span
    if (traceId) {
      await fetch(`${NEXUS_BASE_URL}/v1/traces/${traceId}/spans`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Authorization": `Bearer ${NEXUS_API_KEY}`,
        },
        body: JSON.stringify({
          name: "tool:fetch_order_status",
          started_at: new Date(t0).toISOString(),
          status: "success",
          latency_ms: latencyMs,
          metadata: {
            order_id: orderId,
            order_status: order.status,
            tool: "fetch_order_status",
          },
        }),
      });
    }

    return JSON.stringify({ status: order.status, updated_at: order.updatedAt });
  } catch (err) {
    const latencyMs = Date.now() - t0;
    if (traceId) {
      await fetch(`${NEXUS_BASE_URL}/v1/traces/${traceId}/spans`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Authorization": `Bearer ${NEXUS_API_KEY}`,
        },
        body: JSON.stringify({
          name: "tool:fetch_order_status",
          started_at: new Date(t0).toISOString(),
          status: "error",
          latency_ms: latencyMs,
          error: err.message,
          metadata: { order_id: orderId, tool: "fetch_order_status" },
        }),
      });
    }
    return JSON.stringify({ error: "Could not retrieve order status." });
  }
}

return await fetchOrderStatus($input.orderId);
Pass the trace ID from your application when starting the chatflow call:
def ask_chatflow_with_tool_tracing(question: str, session_id: str) -> str:
    trace = nexus.start_trace({
        "agent_id": f"flowise-{CHATFLOW_ID}",
        "name": f"chatflow: {question[:60]}",
        "status": "running",
        "started_at": nexus.now(),
        "metadata": {"session_id": session_id},
    })
    trace_id = trace["trace_id"]
    t0 = time.time()
    try:
        response = requests.post(
            f"{FLOWISE_URL}/api/v1/prediction/{CHATFLOW_ID}",
            json={
                "question": question,
                "overrideConfig": {
                    "sessionId": session_id,
                    "vars": {"nexusTraceId": trace_id},  # passed to tool nodes
                },
            },
            timeout=30,
        )
        response.raise_for_status()
        result = response.json()
        answer = result.get("text", "")
        nexus.end_trace(trace_id, {
            "status": "success",
            "latency_ms": int((time.time() - t0) * 1000),
            "metadata": {"answer_length": len(answer)},
        })
        return answer
    except Exception as e:
        nexus.end_trace(trace_id, {
            "status": "error",
            "latency_ms": int((time.time() - t0) * 1000),
            "error": str(e),
        })
        raise
With this pattern, a Nexus trace for a chatflow call shows both the top-level latency and individual spans for every tool the LLM invoked — including the tool status (success or error) and tool-specific metadata like order IDs or search queries.
TypeScript equivalent
If your application is TypeScript-based (Next.js, Express, Hono), the same wrapper pattern applies using the Nexus TypeScript SDK:
import { NexusClient } from 'nexus-sdk'

const nexus = new NexusClient({ apiKey: process.env.NEXUS_API_KEY! })
const FLOWISE_URL = process.env.FLOWISE_URL!
const CHATFLOW_ID = process.env.CHATFLOW_ID!

export async function askChatflow(question: string, sessionId: string): Promise<string> {
  const trace = await nexus.startTrace({
    agentId: `flowise-${CHATFLOW_ID}`,
    name: `chatflow: ${question.slice(0, 60)}`,
    status: 'running',
    startedAt: nexus.now(),
    metadata: { sessionId, question: question.slice(0, 300), chatflowId: CHATFLOW_ID },
  })
  const traceId = trace.traceId
  const t0 = Date.now()
  try {
    const res = await fetch(`${FLOWISE_URL}/api/v1/prediction/${CHATFLOW_ID}`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        question,
        overrideConfig: {
          sessionId,
          vars: { nexusTraceId: traceId },
        },
      }),
    })
    if (!res.ok) {
      const text = await res.text()
      throw new Error(`HTTP ${res.status}: ${text.slice(0, 200)}`)
    }
    const result = await res.json()
    const answer: string = result.text ?? ''
    const latencyMs = Date.now() - t0
    await nexus.endTrace(traceId, {
      status: 'success',
      latencyMs,
      metadata: {
        answerLength: answer.length,
        sourceDocuments: (result.sourceDocuments ?? []).length,
      },
    })
    return answer
  } catch (err) {
    await nexus.endTrace(traceId, {
      status: 'error',
      latencyMs: Date.now() - t0,
      error: err instanceof Error ? err.message : String(err),
    })
    throw err
  }
}
Debugging chatflow failures in production
Three failure patterns show up most often in production Flowise chatflows, and each has a distinct trace signature:
- Retrieval returning zero documents: The chatflow answers from the LLM's parametric knowledge rather than your vector store. Look for traces where source_documents: 0. If these correlate with low-quality answers, your retriever query isn't matching your corpus — check embedding model consistency and document chunking strategy.
- Tool timeouts: A custom tool that calls an external API can hang if that API is slow. Flowise doesn't surface tool-level timeouts in its UI. With Nexus spans from inside the tool, you can see tool:fetch_order_status latency spikes that explain why the overall chatflow response was slow.
- Rate limit cascades: When your chatflow hits an OpenAI rate limit, the entire request fails with a 429. The trace status will be error with the rate limit message in the error field. A burst of rate limit errors at the same timestamp usually indicates a traffic spike — add retry logic with exponential backoff on the Flowise call.
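For the rate-limit case, a retry wrapper with exponential backoff and jitter can sit around the Flowise call. A minimal sketch (the helper name is mine; the 429 detection assumes exceptions shaped like requests.HTTPError, i.e. carrying a response with a status_code):

```python
import random
import time

def call_with_backoff(do_request, max_retries: int = 4, base_delay: float = 1.0):
    """Retry do_request on HTTP 429, doubling the delay each attempt plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return do_request()
        except Exception as exc:
            status = getattr(getattr(exc, "response", None), "status_code", None)
            if status != 429 or attempt == max_retries:
                raise  # not a rate limit, or out of retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay / 2))
```

Wrap the requests.post call in a zero-argument lambda (or functools.partial) and pass it as do_request; non-429 errors still propagate immediately so the trace records them.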
# Detect zero-retrieval traces for quality monitoring
def ask_chatflow_with_quality_check(question: str, session_id: str) -> dict:
    trace = nexus.start_trace({
        "agent_id": f"flowise-{CHATFLOW_ID}",
        "name": f"chatflow: {question[:60]}",
        "status": "running",
        "started_at": nexus.now(),
        "metadata": {"session_id": session_id, "question": question[:300]},
    })
    trace_id = trace["trace_id"]
    t0 = time.time()
    try:
        response = requests.post(
            f"{FLOWISE_URL}/api/v1/prediction/{CHATFLOW_ID}",
            json={"question": question, "overrideConfig": {"sessionId": session_id}},
            timeout=30,
        )
        response.raise_for_status()
        result = response.json()
        answer = result.get("text", "")
        source_docs = result.get("sourceDocuments", [])
        grounded = len(source_docs) > 0
        quality_warning = not grounded or len(answer.strip()) < 20
        nexus.end_trace(trace_id, {
            "status": "warning" if quality_warning else "success",
            "latency_ms": int((time.time() - t0) * 1000),
            "metadata": {
                "answer_length": len(answer),
                "source_documents": len(source_docs),
                "grounded": grounded,
                "quality_warning": quality_warning,
                "zero_retrieval": not grounded,
            },
        })
        return {"answer": answer, "grounded": grounded}
    except requests.Timeout:
        nexus.end_trace(trace_id, {
            "status": "error",
            "latency_ms": int((time.time() - t0) * 1000),
            "error": "chatflow request timed out after 30s",
        })
        raise
    except Exception as e:
        nexus.end_trace(trace_id, {
            "status": "error",
            "latency_ms": int((time.time() - t0) * 1000),
            "error": str(e),
        })
        raise
What to monitor in production
Once traces are flowing from your Flowise integration, three metrics are most actionable:
- Zero-retrieval rate: The percentage of chatflow calls where source_documents: 0. A zero-retrieval rate above 10% usually means your embedding index is stale or your chunk size is wrong. Alert on a spike in this metric before users start complaining.
- Tool error rate: How often custom tool spans come back with status: error. Tool errors that the LLM hides (by returning a graceful fallback answer) are dangerous because they mask broken integrations. Set a webhook alert when tool error rate exceeds 5% over a 1-hour window.
- P95 chatflow latency: Flowise chatflows involve at least two round trips (retrieval + LLM call), so latency is typically 1–5s. If P95 creeps above 8s, users experience noticeable delays. Trace the bottleneck by looking at which step has the longest latency_ms in your spans.
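The P95 figure itself is easy to compute from exported latency_ms values if your dashboard doesn't provide it. A nearest-rank sketch:

```python
import math

def p95_latency(latencies_ms: list[int]) -> int:
    """Nearest-rank 95th percentile: the value 95% of requests come in at or below."""
    ordered = sorted(latencies_ms)
    rank = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return ordered[rank]
```

Feed it a window of latencies (say, the last hour of traces) and alert when the result crosses your 8s threshold.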
Next steps
Flowise’s visual UI is excellent for building chatflows quickly — but production observability requires instrumentation at the API boundary and inside custom tool nodes. The client-side wrapper gives you end-to-end latency and error rate with a few lines of code. Adding the Nexus REST API call inside tool nodes gives you the tool-level visibility you need to debug silent failures. Sign up for a free Nexus account to start capturing traces from your Flowise chatflows today.
Add observability to Flowise chatflows
Free tier, no credit card required. Full trace visibility in under 5 minutes.