If you've built production LangChain applications, you've probably lived this scenario. Your agent starts running, makes a few tool calls, then... nothing. It's stuck. The terminal shows no errors, the process is still alive, and your API bill is quietly climbing while the agent spins.
Where is it stuck? And why? Those two questions can eat an afternoon with traditional logging. With a recording of the run, they take about thirty seconds. Let's walk through a real debugging session.
Why Stuck Agents Are So Hard to Diagnose
LangChain agents get stuck for a handful of recurring reasons.
- Infinite loops, where the agent keeps calling the same tool with the same input
- Tool outputs that confuse the LLM into circular reasoning
- Rate limiting or silent API failures
- Prompts that don't tell the agent when to give up
The common thread is that the evidence lives inside the LLM exchanges. The agent's reasoning, the tool results it saw, and the decisions it made are all in the request and response payloads. Standard logs show you that calls happened. They rarely show you what the calls contained, and the payloads are exactly where the answer is.
The Setup, Once
Orchid records every LLM exchange your application makes by sitting between your app and the provider as a proxy. For Python, the integration is one import and one call at your entry point, before LangChain or your LLM clients load.
import orchid
# Patches httpx, requests, and aiohttp globally.
# All downstream LLM calls route through the proxy automatically.
orchid.init()
To group a run under a named session, wrap it in the session context manager.
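# Assumes `tools` and `llm` are already defined, and that
# `initialize_agent` was imported from langchain.agents after orchid.init().
with orchid.session("research-agent-001"):
    agent = initialize_agent(tools, llm, agent="zero-shot-react-description")
    result = agent.run("What are the emerging AI trends this year?")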
Your LangChain code doesn't change. The SDK works at the HTTP transport level, so every call LangChain makes through the OpenAI or Anthropic client is captured, including streaming responses.
The 30-Second Triage
Now suppose that research agent hangs. Open the visualizer at http://localhost:4321 and click into the session. You see the exchange timeline, and the problem announces itself before you click anything.
The session shows seven LLM exchanges. The last five have nearly identical durations and token counts, evenly spaced. That visual rhythm is the signature of a loop. Five calls that look the same usually are the same.
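To make that visual rhythm concrete, here is a minimal sketch of the same check in code. It assumes each exchange is available as a dict with `duration_ms` and `total_tokens` fields. Those field names, the sample numbers, and the 10% tolerance are illustrative assumptions, not Orchid's actual export schema.

```python
def trailing_repeat_count(exchanges, tolerance=0.10):
    """Count how many exchanges at the end of a session look alike.

    Two exchanges "look alike" when both duration and token count are
    within `tolerance` (relative) of the final exchange's values.
    """
    if not exchanges:
        return 0

    last = exchanges[-1]

    def close(a, b):
        # Relative difference, with a floor of 1 to avoid a zero threshold.
        return abs(a - b) <= tolerance * max(abs(a), abs(b), 1)

    count = 0
    for ex in reversed(exchanges):
        same_duration = close(ex["duration_ms"], last["duration_ms"])
        same_tokens = close(ex["total_tokens"], last["total_tokens"])
        if same_duration and same_tokens:
            count += 1
        else:
            break
    return count


# Hypothetical session: two distinct exchanges, then five near-identical ones.
session = [
    {"duration_ms": 2100, "total_tokens": 950},
    {"duration_ms": 3400, "total_tokens": 1800},
    {"duration_ms": 1210, "total_tokens": 412},
    {"duration_ms": 1195, "total_tokens": 415},
    {"duration_ms": 1230, "total_tokens": 418},
    {"duration_ms": 1205, "total_tokens": 421},
    {"duration_ms": 1220, "total_tokens": 424},
]

repeats = trailing_repeat_count(session)
```

A count of three or more at the tail of a session is a reasonable trigger for opening the payloads. The threshold is a judgment call, since some healthy agents also make several similar calls in a row.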
Click one of the repeating exchanges. The full request payload opens. Scroll to the last messages in the conversation the agent sent to the model. There's the tool result it's reasoning about.
{
  "role": "user",
  "content": "Observation: Found 0 results for 'AI trends 2025'.\nThought: What should I do next?"
}
Now look at the response. The model's completion shows the circular reasoning directly.
{
  "role": "assistant",
  "content": "Thought: The search returned no results. I should search for 'AI trends 2025' to find the information.\nAction: search_tool\nAction Input: AI trends 2025"
}
That's the root cause, in the model's own words. The search returns empty results, the prompt has no instructions for handling an empty result, so the agent's most probable next action is to repeat the exact search that just failed. Forever.
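The same diagnosis can be checked mechanically once you have the two payloads as text. The sketch below parses the ReAct-style `Action` and `Action Input` lines from the completion and compares the input to the query quoted in the observation. The helper names are hypothetical, and the regular expressions assume the exact text format shown in the payloads above.

```python
import re

ACTION_RE = re.compile(r"^Action:\s*(.+)$", re.MULTILINE)
INPUT_RE = re.compile(r"^Action Input:\s*(.+)$", re.MULTILINE)
QUOTED_QUERY_RE = re.compile(r"results for '([^']*)'")


def parse_action(completion):
    """Return (tool, tool_input) from a ReAct-style completion, or None."""
    action = ACTION_RE.search(completion)
    action_input = INPUT_RE.search(completion)
    if not action or not action_input:
        return None
    return action.group(1).strip(), action_input.group(1).strip()


def repeats_failed_query(observation, completion):
    """True when the model's next tool input equals the query that just failed."""
    failed = QUOTED_QUERY_RE.search(observation)
    parsed = parse_action(completion)
    if not failed or parsed is None:
        return False
    return parsed[1] == failed.group(1)


# The two message contents from the recorded exchange.
observation = (
    "Observation: Found 0 results for 'AI trends 2025'.\n"
    "Thought: What should I do next?"
)
completion = (
    "Thought: The search returned no results. I should search for "
    "'AI trends 2025' to find the information.\n"
    "Action: search_tool\n"
    "Action Input: AI trends 2025"
)

looping = repeats_failed_query(observation, completion)
```

Here `looping` comes out true, because the model's chosen input is character-for-character the query that just returned nothing.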
Total clicks, three. No grep, no print statements, no re-running the agent with verbose logging and hoping it gets stuck again.
The Fix
Once you can read the agent's reasoning, the fix is usually obvious. Here the prompt needs an escape hatch.
system_message = """
If a search returns no results, rephrase the query with different terms.
If two attempts return no results, respond with "Unable to find
information on this topic" and stop.
"""
Re-run the agent and open the new session. The timeline now shows a clean arc. The first search returns empty, the second exchange shows the agent rephrasing, the third shows results coming back, and the run completes. You can compare the two sessions to confirm the behavior change, and the per-session cost totals show exactly what the looping run wasted versus the fixed one.
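A prompt instruction is a request to the model, not a guarantee, so it can be worth pairing it with a hard limit in code. The following is a minimal sketch of that idea, not part of Orchid or LangChain: a hypothetical `EmptyResultBudget` wrapper around any search callable that returns a list of result strings. The class name, the `max_empty` parameter, and the wording of the returned observations are all assumptions made for illustration.

```python
GIVE_UP_MESSAGE = "Unable to find information on this topic"


class EmptyResultBudget:
    """Wrap a search callable and signal a stop after repeated empty results."""

    def __init__(self, search_fn, max_empty=2):
        self.search_fn = search_fn
        self.max_empty = max_empty
        self.empty_count = 0  # consecutive empty results seen so far

    def __call__(self, query):
        results = self.search_fn(query)
        if results:
            self.empty_count = 0  # a hit resets the budget
            return "\n".join(results)

        self.empty_count += 1
        if self.empty_count >= self.max_empty:
            # Budget spent: tell the model to stop instead of retrying.
            return f"{GIVE_UP_MESSAGE}. Stop searching and give the final answer."
        return f"Found 0 results for '{query}'. Rephrase the query with different terms."


def always_empty(query):
    """Stand-in search backend that never finds anything."""
    return []


guarded_search = EmptyResultBudget(always_empty, max_empty=2)
first = guarded_search("AI trends 2025")
second = guarded_search("emerging AI developments")
```

In LangChain, a callable like `guarded_search` could be supplied as the function behind the search tool, so the second empty result produces an observation that explicitly tells the model to stop.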
Or Skip the Clicking Entirely
Here's the part that wasn't possible a year ago. Orchid exposes the same recording through an MCP server, so your coding assistant can run this entire triage for you. Connect Cursor, VS Code, or Claude Code to the proxy and ask "why is my research agent stuck?" The assistant lists the sessions, spots the repeating exchanges, pulls the payloads, and quotes the circular reasoning back to you with a suggested prompt fix.
We walk through that workflow in Let Your AI Debug Your AI.
A Few Habits That Compound
Once your agent traffic is recorded, some practices pay off quickly.
- Name your sessions meaningfully. A session ID like research-agent-prod-tuesday beats default-session when you're hunting for last week's failure.
- Watch session costs. Stuck agents burn API credits fast. Per-session cost totals make a runaway loop visible early. More in Know What Every Agent Run Costs.
- Turn the failure into a test. Export the stuck session as a fixture and replay it in CI to prove your prompt fix handles empty search results. The workflow is in Zero-Cost AI Testing.
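For the cost habit above, the arithmetic behind a per-session total is simple enough to sketch. The example assumes each exchange carries `prompt_tokens` and `completion_tokens` counts, and it uses made-up placeholder prices. Both the field names and the prices are assumptions, so substitute your provider's real rates.

```python
def session_cost(exchanges, input_price_per_1k, output_price_per_1k):
    """Sum the cost of a session from per-exchange token counts."""
    total = 0.0
    for ex in exchanges:
        total += ex["prompt_tokens"] / 1000 * input_price_per_1k
        total += ex["completion_tokens"] / 1000 * output_price_per_1k
    return total


# Placeholder prices in dollars per 1K tokens. NOT real provider rates.
INPUT_PRICE = 0.001
OUTPUT_PRICE = 0.002

# Hypothetical token counts: the looping run was killed after seven
# exchanges, the fixed run finished in three.
exchange = {"prompt_tokens": 400, "completion_tokens": 50}
looping_run = [exchange] * 7
fixed_run = [exchange] * 3

wasted = (
    session_cost(looping_run, INPUT_PRICE, OUTPUT_PRICE)
    - session_cost(fixed_run, INPUT_PRICE, OUTPUT_PRICE)
)
```

The absolute number matters less than the trend. A loop that is not interrupted keeps adding the same per-exchange cost until something stops it.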
Try It Yourself
Want to click through a stuck agent without setting anything up? Our interactive demo has a pre-loaded session you can explore in the browser.
To record your own LangChain project, install the SDK and point it at a running proxy.
pip install orchid-sdk
import os
import orchid
os.environ["ORCHID_PROXY_URL"] = "http://localhost:4320/v1"
os.environ["ORCHID_API_KEY"] = "your-orchid-api-key"
orchid.init()
The next time an agent hangs, you'll have the answer in the recording instead of an afternoon of guesswork. Have questions about debugging your LangChain agents? Reach out and we'd love to help.