@anubhav756 anubhav756 commented Jul 23, 2025

Overview

This PR represents a major architectural refactor of the Cymbal Air app, transforming it from a complex, fragmented system into a streamlined, maintainable, and powerful one. The core of this effort was to consolidate on a unified architecture centered around the MCP Toolbox SDK and LangGraph.

Architecture Diagram

[Architecture diagram]

Before

Previously, the application suffered from significant architectural complexity:

  • Three different orchestration patterns coexisted (LangChain, LangGraph, and native Vertex AI Function Calling), leading to inconsistent logic, duplicated effort, and a steep learning curve for new developers.

  • Tool logic was spread across multiple locations:

    • A dedicated langchain_tools/ directory for LangChain-specific tools.
    • Python functions defined directly within the orchestration/langgraph.py module.
    • A completely separate, custom-built retrieval_service/ directory that handled all RAG-related tools for policies and amenities.
  • The presence of the retrieval_service/ and multiple orchestration modules resulted in a deeply nested directory structure (llm_demo/), a high number of files, and tightly coupled components that were difficult to test and maintain.

After

This refactor has completely removed the legacy retrieval_service/ and langchain_tools/ directories, resulting in a significantly simpler and more robust system.

  • All tools are now defined declaratively in a single tools.yaml file. This central manifest is served by a dedicated Toolbox Service running on Cloud Run, which decouples the tool implementations from the main application logic.
  • We have consolidated on LangGraph as the single, standard orchestrator for the entire application. This works seamlessly with the toolbox-langchain SDK, which makes binding the tools from our Toolbox service to the LangGraph agent incredibly simple and efficient.
  • By replacing the entire retrieval_service/ with Toolbox, we have eliminated a massive amount of custom code. The application's directory structure has been flattened, the total number of files has been reduced, and the overall logic is far easier to understand and manage.

This transformation has not only cleaned up technical debt but has also laid the foundation for the powerful new features detailed below.

Key Changes & New Features

The app's core logic is now powered by the MCP Toolbox SDK, served from a dedicated Cloud Run instance.

Proactive Auth Flow

A multi-step validation flow has been introduced to prevent the agent from attempting to use authenticated tools without a logged-in user, creating a much smoother user experience.

[Proactive auth flow diagram]

Vector Search

The app's search capabilities have been significantly upgraded by integrating vector search, a feature made simpler by the new architecture.

  • The search_amenities and search_policies tools now leverage AlloyDB to convert user queries into vectors, enabling a more intelligent, semantic search.
  • The embedding generation scripts have been corrected to use the embed_documents method, so that all embeddings are generated consistently and are compatible with the gemini-embedding-001 model.

UX & State Management

Several fixes were implemented to create a more consistent and reliable user experience within the new LangGraph framework.

  • Refactored the booking and decline logic to eliminate hardcoded messages, ensuring the UI chat history is a direct reflection of the agent's internal state.
  • A confirmation message is now added to the agent's history after a successful booking, preventing the agent from incorrectly attempting to rebook the same flight.

Important

This integration depends on several new features in the MCP Toolbox:

This PR adds a tools file equivalent to all the tools in the Cymbal Air
app.

> [!NOTE]
> This tools file makes use of the new optional params feature
(`default: ""`) from
[#617](googleapis/genai-toolbox#617).

This PR removes LangChain tools and Vertex AI Function Calling
orchestration and consolidates on LangGraph.

It also flattens the directory structure and refactors to simplify some
parts of code for easier understanding.

> [!NOTE]
> The failure in the integration test is expected. This PR is part of a
series of changes, and the corresponding fix for this test is in a
subsequent PR.
> ### Reasoning
> We've intentionally split the work into smaller, focused PRs to make
the review process more manageable and efficient.
> ### Merge Plan
> All related PRs will be merged into the `toolbox-main` branch first.
We will ensure all tests are passing on `toolbox-main` before merging
the entire feature set into `main`.
* Remove client session management from the orchestration class
  * This is now managed by Toolbox SDK internally
* Simplify prompt creation
  * This is handled by the respective tools
* Tool descriptions, params, annotations, etc. are loaded from Toolbox
  * These are added through `bind_tools` from LangGraph
  * This enables removal of the custom response message creation (which
    was added as a `TODO`)
* Add the logged-in user's token to the `RunnableConfig`
  * This is necessary so that the tools that require authentication can
    read the user's token if available
* Simplify the tools helper file by removing tools and helpers, since those
  are now handled by the Toolbox SDK internally
* Add a Toolbox URL to connect to through integration tests
* Remove `ToolMessage` while inserting a ticket.
  * This was causing an issue with `langchain-google-vertexai`

  ```
google.api_core.exceptions.InvalidArgument: 400 Please ensure that the
number of function response parts should be equal to number of function
call parts of the function call turn.
  ```
* Remove unused human and AI messages post book ticket flow.

## Diagram

![image](https://github.com/user-attachments/assets/46af0a74-3395-45a8-8ff4-f5466b034f17)

> [!IMPORTANT]
> This PR depends on a couple of features from Toolbox SDK:
> * Support for optional parameters
([#290](googleapis/mcp-toolbox-sdk-python#290))
> * Self-authenticated tools via `RunnableConfig`
([#291](googleapis/mcp-toolbox-sdk-python#291))

## Summary
This PR updates the Cloud Build configuration to automatically deploy a
Toolbox service on Cloud Run and securely connect it to the main
application.

## Changes

* The `integration.cloudbuild.yaml` for our integration environment now
includes steps to deploy the Toolbox service.
* The deployed Toolbox service's URL is passed as an environment
variable to the main application.
* Communication between the main app and the Toolbox service is secured
using service account-based authentication.
* The `ToolboxClient` now fetches the current service account's ID token
and includes it in the `client_headers` for all outgoing requests to the
Toolbox service.
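
As a rough illustration of the last bullet, the sketch below builds the header mapping from an injected token fetcher. The names `build_client_headers` and `fake_fetch_id_token` and the example URL are assumptions made for this sketch, not code from the PR. The only fixed points are that the token travels in `client_headers` and that Cloud Run service-to-service calls carry the ID token as a `Bearer` value in the `Authorization` header.

```python
from typing import Callable, Dict


def build_client_headers(
    fetch_id_token: Callable[[str], str], toolbox_url: str
) -> Dict[str, str]:
    """Return headers carrying an ID token minted for the Toolbox service."""
    # The receiving service URL is used as the token audience.
    token = fetch_id_token(toolbox_url)
    return {"Authorization": f"Bearer {token}"}


def fake_fetch_id_token(audience: str) -> str:
    # Hypothetical stand-in; a real app would ask the Google auth library
    # for the current service account's ID token here.
    return f"token-for-{audience}"


headers = build_client_headers(fake_fetch_id_token, "https://toolbox.example.com")
```

Injecting the fetcher keeps the header logic testable without real credentials.
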
The current eval framework is incompatible with our recent migration to
LangGraph for orchestration. Since LangGraph is now our standard, this
legacy eval code is obsolete and has been removed to avoid confusion.

> [!NOTE]
> A new eval suite, specifically designed for LangGraph, will be
developed separately.

This PR also simplifies the GitHub workflow configs.

> [!NOTE]
> We will restore scripts like `run_generate_embeddings.py`,
`models/`, etc. in #546. In a future PR, we will also reimplement the
data initialization/export scripts using Toolbox.

This PR fixes an issue that caused different messages to show up on a
declined ticket before and after a page refresh.

We now return the actual response from the LLM instead of a hardcoded
response in case of LangGraph.

This PR also removes the unused `config` parameter from graph nodes and an
unused hardcoded booking-decline AI message.
…549)

This PR addresses an issue where the agent could attempt to rebook a
flight immediately after a successful booking.

## Problem
Previously, after the `insert_ticket` tool successfully executed, a
confirmation message was not added to the agent's conversation history
from our custom node. This lack of explicit confirmation left the agent
without the necessary context to know the booking was complete. As a
result, the agent could incorrectly infer that the booking task had
failed and attempt to perform the booking again.

## Solution
This PR modifies the custom node to add a success message to the agent's
conversation history immediately after the `insert_ticket` tool
completes successfully. This message explicitly confirms that the flight
has been booked, providing the agent with the proper context to conclude
the booking process and not attempt a rebooking.
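
A minimal sketch of the idea, assuming the history is a plain list of role/content dicts. The function name `record_booking_success` and the message wording are hypothetical, and the real node works on LangGraph message objects rather than dicts.

```python
from typing import Dict, List

Message = Dict[str, str]


def record_booking_success(history: List[Message], tool_succeeded: bool) -> List[Message]:
    """Return a new history with an explicit confirmation appended on success."""
    if not tool_succeeded:
        # Nothing to confirm; leave the history as it was.
        return list(history)
    confirmation = {"role": "ai", "content": "Your flight has been successfully booked."}
    return history + [confirmation]


before = [{"role": "human", "content": "Book the 9am flight to SFO."}]
after = record_booking_success(before, tool_succeeded=True)
```

With the confirmation present as the last message, the model has no reason to treat the booking as still pending on its next turn.
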

This PR also makes the booking success/decline messages consistent
between LangGraph and the UI chat history, so that the messages remain
consistent in the chat UI after a page refresh.
@anubhav756 anubhav756 self-assigned this Jul 23, 2025
…hat history (#550)

## Summary
This PR addresses inconsistencies between the UI display and the agent's
internal LangGraph state during the ticket booking process. Previously,
hardcoded messages spread across the UI, the backend request handlers,
and the agent nodes led to a disjointed UX where chat history, user
prompts, and agent responses could become out of sync, particularly
after a page refresh.

The core of this change is the introduction of a single, internal helper
function (`__booking_handler`) that centralizes the logic for both
confirming and declining a ticket booking. This refactoring ensures that
all UI components and chat history entries are a direct reflection of
the agent's state.

## Changes
* Created a common `__booking_handler` to manage both booking "accept"
and "decline" actions, eliminating code duplication and streamlining
maintenance.

* Replaced hardcoded UI text for confirmations and responses with
messages sourced directly from the agent's graph state. This guarantees
consistency between what the user sees and the agent's conversational
history.

* The user's choice (e.g., `"Looks good to me. Book it!"`) and the
agent's final response are now reliably added to the user session's chat
history for both success and decline scenarios.

* Added a comment explaining the rationale for injecting the
"decline" message into the LangGraph state *before* invoking the agent.

* Removed unnecessary `params` from the request handlers for better code
hygiene.
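
The changes above can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's `__booking_handler`: the dict-based messages, the `invoke_agent` callback, and the `DECLINE_TEXT` wording are hypothetical, and the sketch treats the accept and decline paths identically for simplicity.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

ACCEPT_TEXT = "Looks good to me. Book it!"  # quoted in the PR description
DECLINE_TEXT = "I changed my mind."  # hypothetical wording


def booking_handler(
    graph_state: List[Message],
    chat_history: List[Message],
    accepted: bool,
    invoke_agent: Callable[[List[Message]], str],
) -> str:
    """Handle both booking outcomes through one code path."""
    choice = ACCEPT_TEXT if accepted else DECLINE_TEXT
    # Put the user's choice into the graph state *before* invoking the agent,
    # so the reply is generated with that choice in context.
    graph_state.append({"role": "human", "content": choice})
    reply = invoke_agent(graph_state)
    graph_state.append({"role": "ai", "content": reply})
    # Mirror the same two messages into the UI history; no hardcoded reply text.
    chat_history.extend(graph_state[-2:])
    return reply


state: List[Message] = []
history: List[Message] = []
reply = booking_handler(state, history, False, lambda msgs: "Okay, I won't book it.")
```

Because the UI history is copied from the graph state, a page refresh shows the same messages the agent saw.
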
This PR updates the existing workflow to replace the prebuilt tool node
with a new custom tool node. This new node handles tool auth by reading
auth headers from the `RunnableConfig` provided by LangGraph.

The custom node inspects the auth requirements of the underlying core
tool within the `ToolboxTool`. If the tool requires authentication, the
node dynamically creates an authenticated copy of the tool by attaching
the necessary auth token getters using the `add_auth_token_getter` API.
This authenticated tool instance is then used for the call and
subsequently discarded. This same auth handling logic has also been
applied to the node responsible for ticket insertion.
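
To make the copy-then-discard pattern concrete, here is a small sketch. `StubTool` is a hypothetical stand-in written for this example and is not the real `ToolboxTool`; only the method name `add_auth_token_getter` comes from the description above. The `user_id_token` config key and the `call_tool` helper are likewise assumptions.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class StubTool:
    """Hypothetical stand-in for a Toolbox tool (not the real class)."""

    name: str
    required_auth: List[str] = field(default_factory=list)
    token_getters: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def add_auth_token_getter(
        self, auth_source: str, get_id_token: Callable[[], str]
    ) -> "StubTool":
        # Return a copy so the shared, unauthenticated tool is never mutated.
        getters = {**self.token_getters, auth_source: get_id_token}
        return replace(self, token_getters=getters)

    def invoke(self, args: Dict[str, Any]) -> str:
        missing = [s for s in self.required_auth if s not in self.token_getters]
        if missing:
            raise PermissionError(f"missing auth for: {missing}")
        return f"{self.name} ok"


def call_tool(tool: StubTool, args: Dict[str, Any], config: Dict[str, Any]) -> str:
    """Attach token getters for this call only, then invoke the tool."""
    token = config.get("configurable", {}).get("user_id_token")
    authed = tool
    if tool.required_auth and token:
        for source in tool.required_auth:
            authed = authed.add_auth_token_getter(source, lambda: token)
    # `authed` goes out of scope after the call; the original tool is untouched.
    return authed.invoke(args)


tool = StubTool("list_tickets", required_auth=["my_google_service"])
result = call_tool(tool, {}, {"configurable": {"user_id_token": "abc"}})
```
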

> [!NOTE]
> The functionality introduced in these custom nodes will be abstracted
into the `ToolboxTool` itself in an upcoming release of
`toolbox-langchain`
([#291](googleapis/mcp-toolbox-sdk-python#291)).
This will simplify the workflow in the future by handling authentication
directly within the tool.
@anubhav756 anubhav756 changed the title Toolbox main feat!: Integrate Toolbox and Streamline Cymbal Air Jul 23, 2025
@anubhav756
Contributor Author

/gcbrun

anubhav756 and others added 5 commits July 25, 2025 10:12
This PR addresses an issue in `run_generate_embeddings.py` where
document data was being embedded using the `embed_query` method instead
of the more appropriate `embed_documents` method.

### Description

Previously, the script for generating embeddings for our amenities and
policies datasets was using `embed_service.embed_query`. This method is
optimized for embedding user-generated search queries, not static
documents meant for retrieval.

This also created an inconsistency with our other script,
`run_generate_policy_dataset.py`, which correctly uses the
`embed_documents` method. As a result, the two scripts produced
different, non-interchangeable embeddings for the same source data.

This change updates `run_generate_embeddings.py` to use the
`embed_documents` method, ensuring two things:

1. We are now using the recommended method for creating document
embeddings.
1. Both scripts will now produce the exact same embeddings for the same
data, making our system more robust and predictable.
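
A short sketch of the corrected call pattern, assuming rows are plain dicts with a `content` field. `StubEmbeddings` is a hypothetical stand-in that only mirrors the two method signatures of LangChain's embeddings interface (`embed_documents` takes a list of texts, `embed_query` takes one string); its vectors are made up.

```python
from typing import Dict, List


class StubEmbeddings:
    """Hypothetical stand-in with the two LangChain-style entry points."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Batch call intended for stored documents.
        return [[float(len(t)), 1.0] for t in texts]

    def embed_query(self, text: str) -> List[float]:
        # Single-text call intended for user search queries.
        return [float(len(text)), 0.0]


def embed_rows(service: StubEmbeddings, rows: List[Dict], text_key: str = "content") -> List[Dict]:
    """Embed every row's text with the document method, in one batch."""
    vectors = service.embed_documents([row[text_key] for row in rows])
    return [{**row, "embedding": vec} for row, vec in zip(rows, vectors)]


rows = [{"id": 1, "content": "Pets"}, {"id": 2, "content": "Lounge"}]
embedded = embed_rows(StubEmbeddings(), rows)
```

The stub returns different vectors from the two methods on purpose, to mirror the inconsistency described above when scripts mix them.
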

This PR also removes an unnecessary `head()` call.
This PR updates the static data for the amenities and policies datasets with
new embeddings from the Gemini Embedding model, which was added in #555.
…orting to/from dataset (#557)

This PR fixes the issues that occurred while importing and exporting the
amenities dataset to and from the datasource due to incorrect parsing of
time values.
# Overview

This PR introduces vector search functionality for the
`search_amenities` and `search_policies` tools.

# Key Changes
* We now use AlloyDB for vector conversion, which allows the tools to
understand the semantic intent of user queries.
* The `search_amenities` and `search_policies` tools can now accept a
plain string query from the user and convert it into a vector for a more
intelligent search.
* The `google_ml_integration` plugin has been added during database
initialization. This is essential for ensuring compatibility with the
`gemini-embedding-001` model, which is used for vector embedding.
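
For readers new to vector search, the sketch below shows the general idea with toy two-dimensional vectors. It is conceptual only: in this PR the query is converted and matched inside AlloyDB, and the use of cosine similarity, the function names, and the vectors here are all assumptions for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], docs: Dict[str, Sequence[float]], k: int = 1) -> List[str]:
    """Return the k document names whose vectors are closest to the query."""
    ranked = sorted(
        docs, key=lambda name: cosine_similarity(query_vec, docs[name]), reverse=True
    )
    return ranked[:k]


# Toy embeddings; real ones come from the embedding model.
docs = {"coffee shop": [0.9, 0.1], "pet policy": [0.1, 0.9]}
best = top_k([1.0, 0.0], docs)
```
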

# Testing
* Use the `search_amenities` tool with a query like "Find a coffee shop
near gate B6".
* Use the `search_policies` tool with a query like "What is the policy
around pets?"
* Verify that the tools return semantically relevant results rather than
just keyword matches.

> [!NOTE]
> We anticipate further enhancements once we have support for Hybrid
Tools and Semantic Tools from MCP Toolbox.
# Overview
This PR introduces a multi-step validation flow into the agent's graph
to handle tools that require authentication. The primary goal is to
prevent the agent from attempting to use authenticated tools without
proper authentication.

# New Flow Diagram
![Untitled diagram _ Mermaid
Chart-2025-08-08-210230](https://github.com/user-attachments/assets/f6424653-34c4-47a8-ae43-5f0c0129bc8c)

# Problem
Previously, the agent would attempt to execute any tool the model chose,
without proactively checking for prerequisites. This caused tools that
require a logged-in user to fail late in the process, resulting in a poor
UX and wasted processing.

# Solution
This PR implements a new conditional logic flow within the graph to
address this issue.

## Proactive Auth Check
A new conditional edge has been added to check if a user is logged in
before any tool requiring authentication is called.
* An `__is_logged_in` helper function checks the `RunnableConfig` for
valid credentials.
* A `request_login_node` was added to inform the user when they need to
sign in.
* The graph's conditional edges from the `AGENT_NODE` now route through
this authentication check first.
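
A minimal sketch of such a check, assuming the token is stored under the config's `configurable` mapping (LangChain's slot for per-run values). The key name `user_id_token` is a hypothetical choice for this example, and the helper is written without the leading underscores used in the PR.

```python
from typing import Any, Mapping


def is_logged_in(config: Mapping[str, Any]) -> bool:
    """Return True when the run config carries a non-empty user token."""
    # Missing "configurable" or a missing/empty token both count as logged out.
    token = config.get("configurable", {}).get("user_id_token")
    return bool(token)


logged_in = is_logged_in({"configurable": {"user_id_token": "abc"}})
logged_out = is_logged_in({"configurable": {}})
```
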

## Prioritized and Sequential Routing Logic
The updated routing logic in `agent_should_continue` ensures a strict
order of operations for tools requiring both authentication and
confirmation, like `insert_ticket`.
* If no tool is called, the agent responds and ends.
* The graph always checks if any selected tool requires authentication.
If the user is not logged in, the flow is immediately directed to
`request_login_node` and stops.
* Only after authentication is confirmed does the graph check if the
tool requires user confirmation.
* If none of the above conditions is met, the graph proceeds to the
standard `tool_node`.
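
The ordering above can be sketched as a plain routing function. This is a simplified illustration, not the PR's `agent_should_continue`: the real function reads tool calls from the graph state, while this one takes tool names directly, and the `confirmation_node` label and the two tool sets are assumptions.

```python
from typing import List, Set

END = "end"
REQUEST_LOGIN_NODE = "request_login_node"
CONFIRMATION_NODE = "confirmation_node"  # hypothetical node name
TOOL_NODE = "tool_node"


def agent_should_continue(
    tool_calls: List[str],
    logged_in: bool,
    auth_tools: Set[str],
    confirm_tools: Set[str],
) -> str:
    """Pick the next node, checking auth before confirmation."""
    if not tool_calls:
        return END  # the agent answered directly
    if not logged_in and any(name in auth_tools for name in tool_calls):
        return REQUEST_LOGIN_NODE  # stop before any authenticated tool runs
    if any(name in confirm_tools for name in tool_calls):
        return CONFIRMATION_NODE  # reached only once auth has passed
    return TOOL_NODE


AUTH = {"insert_ticket", "list_tickets"}
CONFIRM = {"insert_ticket"}
route = agent_should_continue(["insert_ticket"], False, AUTH, CONFIRM)
```

Here `route` is the login node even though `insert_ticket` also needs confirmation, which is the strict ordering the bullets describe.
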

---------

Co-authored-by: Yuan Teoh <[email protected]>
@anubhav756 anubhav756 marked this pull request as ready for review September 15, 2025 13:26
@anubhav756 anubhav756 requested a review from a team as a code owner September 15, 2025 13:26
@anubhav756
Contributor Author

/gcbrun

@anubhav756 anubhav756 merged commit 851c9c3 into main Sep 24, 2025
6 checks passed
@anubhav756 anubhav756 deleted the toolbox-main branch September 24, 2025 15:51
This was referenced Sep 24, 2025
anubhav756 pushed a commit that referenced this pull request Sep 26, 2025
🤖 I have created a release *beep* *boop*
---


## [0.5.0](v0.4.0...v0.5.0) (2025-09-26)


### ⚠ BREAKING CHANGES

* Integrate Toolbox and Streamline Cymbal Air
([#554](#554))
* **cloudsql-mysql:** Update app to use GA MySQL syntax
([#530](#530))
* updated flights dataset to 2025
([#524](#524))

### Features

* Integrate Toolbox and Streamline Cymbal Air
([#554](#554))
([851c9c3](851c9c3))
* updated flights dataset to 2025
([#524](#524))
([9d00186](9d00186))


### Miscellaneous Chores

* **cloudsql-mysql:** Update app to use GA MySQL syntax
([#530](#530))
([cf8af80](cf8af80))
* release 0.5.0
([#575](#575))
([f2b74f3](f2b74f3))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>