Skip to content

Issue: UnboundLocalError in call_tool on Client Disconnect — Due to Uninitialized Variable #314

@TBice123123

Description

@TBice123123

🔁 Reproduction Steps

  1. Set up an MCP server using StreamableHTTP transport.
  2. Trigger a tool call that takes noticeable time to execute (e.g., >3–5 seconds — for ease of reproduction).
  3. From the frontend, initiate the request via fetch with SSE (e.g., in a browser).
  4. While the tool is still executing, cancel the request — by closing the tab, clicking “Stop”, or programmatically aborting the fetch.
  5. Observe the backend logs — you’ll see:
UnboundLocalError: cannot access local variable 'call_tool_result' where it is not associated with a value

💡 Note: The tool doesn’t need to be slow — this can happen with any tool if the client disconnects before the result is assigned. Long-running tools just make it easier to trigger.


⚠️ Problem & Root Cause

The crash occurs in code like this:

async def call_tool(...):
    if session is None:
        ...
        call_tool_result = await session.call_tool(...)  # ← May be skipped if exception 
    return _convert_call_tool_result(call_tool_result)   # ← CRASH: variable not bound
  • call_tool_result is never initialized at the function scope — it’s only assigned inside the else block.
  • When the client disconnects:
    • The underlying read_stream closes.
    • session.call_tool(...) may raise an exception (e.g., anyio.EndOfStream, BrokenResourceError) — but due to async task group behavior, the exception may not propagate synchronously.
    • As a result, the assignment to call_tool_result is skipped entirely.
  • When the code reaches return _convert_call_tool_result(call_tool_result), Python throws UnboundLocalError — because the variable was never bound in this scope.

This is not a race condition — it’s a logic error in variable binding and error handling.


🚨 Impact

  • Causes unhandled 500 errors or process crashes in production.
  • Breaks user experience — instead of “Request cancelled”, users see generic failures or timeouts.
  • Makes the system fragile under real-world conditions: network drops, impatient users, mobile clients, etc.
  • Violates “fail gracefully” principle — should return an error message, not crash.

✅ Proposed Solution

To fix this with minimal, safe, and production-ready changes:

1. Initialize call_tool_result to None at the start of the function:

async def call_tool(...):
    call_tool_result = None  # ← Prevents UnboundLocalError

    if session is None:
        ...
    else:
        call_tool_result = await session.call_tool(...)

    return _convert_call_tool_result(call_tool_result)

→ Guarantees the variable is always bound — no more crashes.

2. Update _convert_call_tool_result to handle None:

def _convert_call_tool_result(result: CallToolResult | None) -> tuple[str | list[str], list[NonTextContent] | None]:
    if result is None:
        return "❌ Tool call failed: connection closed or no response received", None
    # ... existing logic

→ Now, LangChain Agents/Chains receive a meaningful error string instead of crashing. They can log it, show it to users, or retry.


Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions