From a2c647e914fd4ba0caa50b80ac9f755bdeff2c16 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 02:20:57 -0800 Subject: [PATCH 01/21] add wasmcp blog post Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 330 ++++++++++++++++++++++++++++++++ 1 file changed, 330 insertions(+) create mode 100644 content/blog/mcp-with-wasmcp.md diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md new file mode 100644 index 00000000..c56a4b06 --- /dev/null +++ b/content/blog/mcp-with-wasmcp.md @@ -0,0 +1,330 @@ +title = "Build MCP Servers with Wasmcp and Spin" +date = "2025-11-20T10:15:47Z" +template = "blog_post" +description = "Introducing a new approach to building MCP servers on the WebAssembly component model." +tags = ["agents", "ai", "llm", "mcp", "model"] + +[extra] +type = "post" +author = "Ian McDonald" + +--- + +Large language models (LLMs) are trained on vast heaps of data that they use to generate natural language responses to input queries. But that knowledge is static once training is over. They are unable to answer simple questions that require current data, like “What time is it?” or “What's the weather tomorrow in Atlanta?”. This highlights the gap between a simple model and an intelligent system that can actually *do* things and acquire new information, or context, dynamically. This is generally where the term *agent* starts to enter the conversation. + +All LLMs are dependent on functions, also called tools, to interact with the outside world beyond the prompt and to perform deterministic actions. Just like you might use a calculator to accurately crunch numbers, or a web browser to explore the internet, an LLM might use its own calculator and fetch tools in the same way. Even basic capabilities like reading a file from disk are implemented via tools. + +Without tools a language model is like someone sitting in an empty, windowless box with only their memories from an array of random encyclopedias, books, etc. (training data) to pull from. Our interactions with them are something along the lines of: A human slips a question written on a piece of paper under the door for the model to read, and the model slips back a response using only their prior knowledge and imagination. + +That's a far cry from the promise of autonomous systems that understand and act on the world in realtime, let alone transform it. + +Our first thought might be to implement a simple HTTP fetch tool for our target model. Now that model can search the internet in a loop against user queries and, voilà, we have an *agent*. Fresh data and the means of interacting with the current state of the world become available. + +That windowless box gets a desktop with a browser. + +Problem solved, right? Not quite… + +## Communication Hurdles + +**Problem 1**: The internal representation of tools is not standard across models. In other words, their hands look different. How do we build a hammer that each of them can grip? + +We’d need to implement a new version of our tool for OpenAI’s GPT models, and another for the Claude family, another for Gemini, etc. So M models x N tools = T total tool implementations. Consider that fetch is only one example, and we might want many different kinds of tools available for various tasks. + +Agent SDKs can solve this problem directly at the library level by implementing support for multiple models and exposing a common interface for tools over them. 
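
To make that concrete, here's a rough sketch of what such a common interface can look like. It is not any particular SDK's API: the tool is described once with a name, a description, and a JSON Schema for its arguments, and thin per-provider adapters reshape that single definition into whatever format each model expects (the adapter shapes below only approximate the real OpenAI and Anthropic formats).

```python
import json
import urllib.request

def fetch(url: str) -> str:
    """The actual tool logic: fetch a URL and return its body as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")

# One provider-agnostic description of the tool.
FETCH_TOOL = {
    "name": "fetch",
    "description": "Fetch a URL and return the response body as text.",
    "input_schema": {
        "type": "object",
        "properties": {"url": {"type": "string", "description": "URL to fetch"}},
        "required": ["url"],
    },
    "handler": fetch,
}

def to_openai_style(tool: dict) -> dict:
    # Approximation of an OpenAI-style function definition.
    return {"type": "function", "function": {
        "name": tool["name"],
        "description": tool["description"],
        "parameters": tool["input_schema"],
    }}

def to_anthropic_style(tool: dict) -> dict:
    # Approximation of an Anthropic-style tool definition.
    return {"name": tool["name"],
            "description": tool["description"],
            "input_schema": tool["input_schema"]}

def dispatch(tool_name: str, raw_arguments: str) -> str:
    """Run a tool call that any of the adapted models asked for."""
    args = json.loads(raw_arguments)
    assert tool_name == FETCH_TOOL["name"]
    return FETCH_TOOL["handler"](args["url"])
```

The tool logic and its schema are written once; only the thin adapters differ per provider, which is roughly the work these SDKs automate for you.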
+ +**Problem 2**: Even if tool implementations are not coupled to specific models, they become coupled to the specific agent SDKs used to implement them. Because models themselves have no built-in way of discovering and connecting to new tools over the wire, the tools must run alongside the same code that calls inference to implement the agent's loop. + +We want tools to be discoverable and accessible portably, potentially across the air, at scale. We need a layer of indirection between models and their tools. + +## The Model Context Protocol + +In November 2024, Anthropic suggested an open-source standard for connecting AI applications to external systems: The [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro). + +MCP defines a set of context [primitives](https://modelcontextprotocol.io/specification/draft/server) that are implemented as server features. + +| Primitive | Control | Description | Example | +| --------- | ---------------------- | -------------------------------------------------- | ------------------------------- | +| Prompts | User-controlled | Interactive templates invoked by user choice | Slash commands, menu options | +| Resources | Application-controlled | Contextual data attached and managed by the client | File contents, git history | +| Tools | Model-controlled | Functions exposed to the LLM to take actions | API POST requests, file writing | + +Beyond server features, MCP defines client-hosted features that servers can call directly. For example [elicitations](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation) can be implemented by a client to allow a server to directly prompt for user input during the course of a tool call, bypassing the model as an intermediary. + +These bidirectional features are possible because MCP is designed as an inherently bidirectional protocol based on [JSON-RPC](https://www.jsonrpc.org/specification). + +MCP is architected as two [layers](https://modelcontextprotocol.io/docs/learn/architecture#layers): Multiple interchangeable transports ([stdio](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#stdio), [Streamable HTTP](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#streamable-http), additional custom transports) can be used to serve the same underlying features. + +Since its release, MCP has become the only tool calling protocol with near-consensus adoption and broad client support. It continues to attract interest from both individual developers and organizations. For example, AWS [joined](https://aws.amazon.com/blogs/opensource/open-protocols-for-agent-interoperability-part-1-inter-agent-communication-on-mcp/) the MCP steering committee earlier this year in May, and OpenAI's new [apps](https://openai.com/index/introducing-apps-in-chatgpt/) are MCP-based. + +Many popular agents, including ChatGPT/Codex, Claude Code/Desktop, and Gemini CLI, already support MCP out-of-the-box. In addition, agent SDKs like the OpenAI Agents SDK, Google’s Agent Development Kit, and many others have adopted MCP either as their core tool calling mechanism or else as a first-class option. Inference APIs like the OpenAI [Responses API](https://platform.openai.com/docs/api-reference/responses/create#responses_create-tools) have also rolled out initial support for directly using remote MCP servers during inference itself. + +So why don't we see every org implementing MCP servers to integrate their applications and data with agents? 
+ +## The Current State of MCP + +Local MCP servers are relatively simple to implement over the stdio transport. Official SDKs and advanced third party frameworks are available in nearly every programming language. But distributing an MCP server, either as a local installation or as a service over the network, presents a number of new challenges. + +1. Local MCP servers can be an attack vector for exploiting host resources, unless they run in a sandboxed environment. +2. Much of the value proposition of MCP, particularly its advanced bidirectional features and tool discovery mechanisms, is locked behind the need for stateful server sessions. This means that we either need servers to run as long-lived processes, keeping their session state directly onboard, or else we need to manage the infrastructure for external session state plus the server code that interacts with it, which may incur additional network latency and complexity. +3. Scaling and and response latency matter. We may not initially think of the response time of remote tool calls as being important, given inference itself (especially with thinking enabled) is generally slow anyway. But consider that in answering a single query, an agent may need to make many consecutive tool calls to one or more remote MCP servers. The latency of even a few hundred milliseconds for each tool call can quickly snowball to seconds of lag. In realtime use cases like a voice or stock trading agent, even small response delays for tool calls can translate to the success or failure of the overall interaction or goal. +4. Authorization is painful. While the MCP spec does define OAuth flows, authorization is not yet straightforward to implement in practice. Currently, it requires an authorizer that supports [Dynamic Client Registration](https://datatracker.ietf.org/doc/html/rfc7591). Support for a simplified flow via [OAuth Client ID Metadata Documents](https://datatracker.ietf.org/doc/draft-ietf-oauth-client-id-metadata-document/) is confirmed for the November 2025 spec release. + +Many projects and platforms, from new startups to established enterprises, are taking a crack at solutions to each of these problems. These implementations are often piecemeal and built on proprietary stacks. But there is a unique intersection of emerging technologies that stands to provide a more holistic, portable, and open foundation. + +## WebAssembly Components and Wasmcp + +While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building portable, secure, and efficient applications more generally. Wasm components are composable, self-contained binaries that can be compiled from various programming languages and run portably across a range of host devices while remaining sandboxed from host resources. + +The architectural goals of Wasm's [component model](https://component-model.bytecodealliance.org/) align clearly with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles). MCP servers are intended to be progressively composed of features, which we can directly map to individual Wasm component binaries. We could author a few components covering various MCP tools, and some anothers for MCP resources, then compose them together as binaries into a complete MCP server component. 
Wasm components are inherently sandboxed from host resources with least-privilege access by default, resulting in a light and secure way for agents to run untrusted code on a given machine. Moreover, individual components within that sandboxed process can only interoperate through explicit interfaces. A visual analogy for this idea might be a bento box.

We can push, pull, and compose component binaries from OCI registries just like container images. But unlike full container images or micro VMs, which bundle layers of the operating system and its dependencies, components encapsulate only their own functionality and can be mere kilobytes in size. Full servers can weigh in under 1MB.

This means that dynamic composition of published artifacts can happen essentially on the fly compared with other virtualization options. In only a few seconds we can pull a set of OCI-hosted component binaries that implement individual MCP [server features](https://modelcontextprotocol.io/docs/learn/server-concepts#core-server-features), compose them into a sandboxed MCP server, and start it up. We can also distribute fully composed server components on OCI registries in the same way.

Existing MCP SDKs are fragmented across language ecosystems and generally require long-lived compute or external session backends to implement advanced bidirectional features over the network, if they are supported at all. By contrast, the component model opens the door to safely composing MCP features as component binaries against standard [WASI](https://wasi.dev/) interfaces that runtimes and platforms implement under the hood. This architecture allows for naturally progressive and dynamically configurable servers that expose advanced session-enabled features with portability across component runtimes and platforms.

But first we need a way to author MCP server features as WebAssembly components, and we need the tooling to compose these components into functional, spec-compliant servers that run portably across WebAssembly runtimes.

Simply using existing MCP server SDKs on WebAssembly is increasingly possible, but this approach treats runtime compatibility as an obstacle to overcome, with basic parity as the final goal. Instead we want to leverage the strengths of the component model itself as a paradigm to enable the patterns we just explored.

This is where [wasmcp](https://github.com/wasmcp/wasmcp) comes in.

Wasmcp isn’t a runtime, and it’s not exactly an SDK. It’s a collection of WebAssembly components and tooling that work together to function as a polyglot framework for authoring MCP features as WebAssembly components. The result is a single MCP server as a WebAssembly component binary.

Many MCP patterns that would otherwise require external gateways become possible in memory within a single binary via composition.

These servers can run on any component-enabled runtime, like Spin.

With wasmcp we can implement a polyglot MCP server composed of Python tools that use [Pandas](https://pandas.pydata.org/), TypeScript tools that use [Zod](https://zod.dev/), and performance-critical tools or [Regorus](https://github.com/microsoft/regorus)-enabled authorization middleware in Rust.

We can also interchangeably compose different transports and middlewares into the server binary.

Because the [Spin](https://github.com/spinframework/spin) runtime implements [wasi:keyvalue](https://github.com/WebAssembly/wasi-keyvalue), we get a pluggable backend for MCP session state.
This means the full range of spec features, including server-sent requests and notifications, work out of the box with compatible MCP clients over the Streamable HTTP transport. + +## Quickstart + +Install wasmcp via script to get the latest release binary. +``` +$ curl -fsSL https://raw.githubusercontent.com/wasmcp/wasmcp/main/install.sh | bash +``` +Or build it from source. +``` +$ cargo install --git https://github.com/wasmcp/wasmcp +``` + +Source / open a new terminal and then scaffold out a tool component with `wasmcp new`. Only basic dependencies and build tooling from Bytecode Alliance are included. TypesScript uses [jco](https://github.com/bytecodealliance/jco), Rust uses [wit-bindgen](https://github.com/bytecodealliance/wit-bindgen), and Python uses [componentize-py](https://github.com/bytecodealliance/componentize-py). + +wasmcp does not maintain any language-specific SDKs. The [WIT](https://component-model.bytecodealliance.org/design/wit.html) language describes the framework boundary. + +We'll target Rust for our first one. + +``` +$ wasmcp new my-first-tools --language rust +📦 Fetching WIT dependencies... +``` + +If you open up `my-first-tools/src/lib.rs`, you’ll see some boilerplate that you can fill in with your tool implementations. A single tool component can define multiple MCP tools. As we’ll see, multiple tool components can then be chained together and their tools aggregated. This pattern also applies to the other MCP primitives: [resources](https://github.com/wasmcp/wasmcp/blob/main/cli/templates/rust-resources/src/lib.rs) and [prompts](https://github.com/wasmcp/wasmcp/blob/main/cli/templates/rust-prompts/src/lib.rs) + +```rust +impl Guest for Calculator { + fn list_tools( + _ctx: RequestCtx, + _request: ListToolsRequest, + ) -> Result { + Ok(ListToolsResult { + tools: vec![ + Tool { + name: "add".to_string(), + input_schema: r#"{ + "type": "object", + "properties": { + "a": {"type": "number", "description": "First number"}, + "b": {"type": "number", "description": "Second number"} + }, + "required": ["a", "b"] + }"# + .to_string(), + options: Some(ToolOptions { + meta: None, + annotations: None, + description: Some("Add two numbers together".to_string()), + output_schema: None, + title: Some("Add".to_string()), + }), + }, + Tool { + name: "subtract".to_string(), + input_schema: r#"{ + "type": "object", + "properties": { + "a": {"type": "number", "description": "Number to subtract from"}, + "b": {"type": "number", "description": "Number to subtract"} + }, + "required": ["a", "b"] + }"# + .to_string(), + options: None, + }, + ], + next_cursor: None, + meta: None, + }) + } + + fn call_tool( + _ctx: RequestCtx, + request: CallToolRequest, + ) -> Result, ErrorCode> { + match request.name.as_str() { + "add" => Ok(Some(execute_operation(&request.arguments, |a, b| a + b))), + "subtract" => Ok(Some(execute_operation(&request.arguments, |a, b| a - b))), + _ => Ok(None), // We don't handle this tool + } + } +} +``` + +Now let’s build this component and compose it into a full MCP server. The `wasmcp compose server` command +1. Pulls the default [wasmcp framework components](https://github.com/wasmcp/wasmcp/tree/main/crates), like the MCP transport and related plumbing, from [GitHub Container Registry](https://github.com/orgs/wasmcp/packages?repo_name=wasmcp). +2. Plugs your component binary into the wasmcp framework components, producing a complete `server.wasm` component. 
+ +This is accomplished with Bytecode Alliance’s [wac](https://github.com/bytecodealliance/wac) tooling, which you can also use directly for composition. + +Note that any of the framework-level components can also be interchanged with your own custom implementations, like a custom transport component. See `wasmcp compose server –help` for details. + +``` +$ make +$ wasmcp compose server target/wasm32-wasip2/release/my-first-tools.wasm -o server.wasm +``` + +Now that we have a complete `server.wasm` component, we can run it directly with `spin up`. + +``` +$ spin up -f server.wasm + +Serving http://127.0.0.1:3000 +Available Routes: + http-trigger1-component: http://127.0.0.1:3000 (wildcard) +``` + +Note that runtime configuration can be managed with a [spin.toml](https://spinframework.dev/v3/writing-apps) file. + +And _just like that_, we have a functional MCP server over the Streamable HTTP transport! Authorization for providers that implement Dynamic Client Registration is configurable via [environment variables](https://github.com/wasmcp/wasmcp/tree/main/docs). The `stdio` transport can also be used via [Wasmtime](https://github.com/bytecodealliance/wasmtime) directly. + +``` +$ wasmtime run server.wasm +``` + +You can now configure your favorite agent to use the MCP server. + +* [Antigravity](https://antigravity.google/docs/mcp) +* [ChatGPT (developer mode)](https://platform.openai.com/docs/guides/developer-mode) +* [Claude Code](https://code.claude.com/docs/en/mcp) +* [Claude Desktop](https://support.claude.com/en/articles/10949351-getting-started-with-local-mcp-servers-on-claude-desktop) +* [Codex](https://developers.openai.com/codex/mcp/) +* [Cursor](https://cursor.com/docs/context/mcp) +* [Gemini CLI](https://google-gemini.github.io/gemini-cli/docs/tools/mcp-server.html) +* [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses/create#responses_create-tools) +* [Visual Studio Code](https://code.visualstudio.com/docs/copilot/customization/mcp-servers) +* [Zed](https://zed.dev/docs/ai/mcp) + +## Unlimited Feature Composition + +The real power of the component model becomes apparent when adding another tool component to our server. We'll use Python this time. 
+ +``` +$ wasmcp new python-tools –language python +$ cd python-tools # Develop +$ make # Builds to python-tools/python-tools.wasm +``` + +```python +class StringsTools(exports.Tools): + def list_tools( + self, + ctx: server_handler.RequestCtx, + request: mcp.ListToolsRequest, + ) -> mcp.ListToolsResult: + return mcp.ListToolsResult( + tools=[ + mcp.Tool( + name="reverse", + input_schema=json.dumps({ + "type": "object", + "properties": { + "text": {"type": "string", "description": "Text to reverse"} + }, + "required": ["text"] + }), + options=None, + ), + mcp.Tool( + name="uppercase", + input_schema=json.dumps({ + "type": "object", + "properties": { + "text": {"type": "string", "description": "Text to convert to uppercase"} + }, + "required": ["text"] + }), + options=mcp.ToolOptions( + meta=None, + annotations=None, + description="Convert text to uppercase", + output_schema=None, + title="Uppercase", + ), + ), + ], + meta=None, + next_cursor=None, + ) + + def call_tool( + self, + ctx: server_handler.RequestCtx, + request: mcp.CallToolRequest, + ) -> Optional[mcp.CallToolResult]: + if not request.arguments: + return error_result("Missing tool arguments") + + try: + args = json.loads(request.arguments) + except json.JSONDecodeError as e: + return error_result(f"Invalid JSON arguments: {e}") + + if request.name == "reverse": + return reverse_string(args.get("text")) + elif request.name == "uppercase": + return uppercase_string(args.get("text")) + else: + return None # We don't handle this tool +``` + +We compose our first and second tool components together by adding the paths to both tool component binaries in the `wasmcp compose server` arguments. Note that these local paths can be substituted for OCI registry artifacts. See `wasmcp compose server –help` for details. + +``` +$ wasmcp compose server ./my-first-tools/target/wasm32-wasip2/release/my-first-tools.wasm ./python-tools/python-tools.wasm -o polyglot.wasm + +``` + +Run `polyglot.wasm` with `spin up`. +``` +$ spin up -f polyglot.wasm + +Serving http://127.0.0.1:3000 +``` + +Now our server has four tools: `add`, `subtract`, `reverse`, and `uppercase`! Two are implemented in Python, and two in Rust. + +This example only scratched the surface of what we can potentially do with `wasmcp`. To see some of the more advanced patterns like custom middleware components and session-enabled features, check out the [examples](https://github.com/wasmcp/wasmcp/tree/main/examples). + +## An Open Foundation for AI Agents + +By building on two complementary open standards, MCP and the WebAssembly component model, we can expose new context to AI applications and agents in a portable and composable way. + +To distribute an MCP server over the network, we can target Spin-compatible cloud platforms like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across the global network edge with access to application-scoped key-value storage. [SpinKube](https://www.spinkube.dev/), which you can host on your own infrastructure, unlocks another level of flexibility. Any platform or runtime that directly supports the Wasm component model becomes a valid deployment target for the same component binary. A hypothetical MCP-specific hosting platform could even leverage this architecture to safely run user-submitted MCP features more directly. + +This story will only get better as Wasm components improve alongside active advances in language models. 
From 9f8791b33d509fa990b7cd8c18c051abb86d591f Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 10:00:57 -0800 Subject: [PATCH 02/21] Add some more details and clarity about the relationship between wasmcp and spin up front. Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 44 ++++++++++++++++++++++----------- 1 file changed, 30 insertions(+), 14 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index c56a4b06..eeff5cdc 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -10,6 +10,14 @@ author = "Ian McDonald" --- +[Spin](https://github.com/spinframework/spin) is an open source framework for building and running fast, secure, and composable cloud microservices with WebAssembly. + +[Wasmcp](https://github.com/wasmcp/wasmcp) is a [WebAssembly Component](https://component-model.bytecodealliance.org/) development kit for the [Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro). + +Together they form a polyglot toolchain for extending the capabilities of language models in a composable, portable, and secure way. + +## What are tools? + Large language models (LLMs) are trained on vast heaps of data that they use to generate natural language responses to input queries. But that knowledge is static once training is over. They are unable to answer simple questions that require current data, like “What time is it?” or “What's the weather tomorrow in Atlanta?”. This highlights the gap between a simple model and an intelligent system that can actually *do* things and acquire new information, or context, dynamically. This is generally where the term *agent* starts to enter the conversation. All LLMs are dependent on functions, also called tools, to interact with the outside world beyond the prompt and to perform deterministic actions. Just like you might use a calculator to accurately crunch numbers, or a web browser to explore the internet, an LLM might use its own calculator and fetch tools in the same way. Even basic capabilities like reading a file from disk are implemented via tools. @@ -146,13 +154,7 @@ impl Guest for Calculator { "required": ["a", "b"] }"# .to_string(), - options: Some(ToolOptions { - meta: None, - annotations: None, - description: Some("Add two numbers together".to_string()), - output_schema: None, - title: Some("Add".to_string()), - }), + options: None, }, Tool { name: "subtract".to_string(), @@ -269,13 +271,7 @@ class StringsTools(exports.Tools): }, "required": ["text"] }), - options=mcp.ToolOptions( - meta=None, - annotations=None, - description="Convert text to uppercase", - output_schema=None, - title="Uppercase", - ), + options=None, ), ], meta=None, @@ -321,6 +317,26 @@ Now our server has four tools: `add`, `subtract`, `reverse`, and `uppercase`! Tw This example only scratched the surface of what we can potentially do with `wasmcp`. To see some of the more advanced patterns like custom middleware components and session-enabled features, check out the [examples](https://github.com/wasmcp/wasmcp/tree/main/examples). 
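
Before publishing a composed server anywhere, it can also be handy to poke it by hand over the Streamable HTTP transport. The sketch below is a bare-bones smoke test rather than a real MCP client: the URL and route are assumptions to adjust for your setup, a strict server may additionally expect a `notifications/initialized` notification after `initialize`, and responses may come back as plain JSON or as SSE `data:` frames.

```python
import json
import urllib.request

MCP_URL = "http://127.0.0.1:3000/"  # Adjust to your server's configured route, e.g. /mcp

def post(payload, session_id=None):
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP clients are expected to accept both JSON and SSE.
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers["Mcp-Session-Id"] = session_id
    req = urllib.request.Request(MCP_URL, data=json.dumps(payload).encode(), headers=headers)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode(), resp.headers.get("Mcp-Session-Id")

# Initialize, then list the tools the composed server exposes.
body, session = post({
    "jsonrpc": "2.0", "id": 1, "method": "initialize",
    "params": {"protocolVersion": "2025-06-18", "capabilities": {},
               "clientInfo": {"name": "smoke-test", "version": "0.0.1"}},
})
print(body)

body, _ = post({"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}}, session)
print(body)
```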
+## Publishing to OCI Registries + +We can use [wkg](https://github.com/bytecodealliance/wasm-pkg-tools) to publish our server to an [OCI](https://opencontainers.org/) registry, like [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry), [Docker Hub](https://docs.docker.com/docker-hub/repos/manage/hub-images/oci-artifacts/), or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/). + +``` +$ wkg publish polyglot.wasm --package mygithub:basic-utils@0.1.0 +``` + +Anyone with read access to this artifact can then download and run the server using `wkg`. + +``` +$ wkg get mygithub:basic-utils@0.1.0 + +$ spin up -f mygithub:basic-utils@0.1.0.wasm + +Serving http://127.0.0.1:3000 +``` + +We can publish any individual component, or any sequence of composed MCP feature components and middleware, as a standalone artifact in the same way. This enables dynamic and flexible composition of reusable components across servers in a kind of recursive drag-and-drop way, supporting composition and distribution of pre-built patterns which are themselves further composable. See `wasmcp compose --help` for more details on creating standalone feature compositions. + ## An Open Foundation for AI Agents By building on two complementary open standards, MCP and the WebAssembly component model, we can expose new context to AI applications and agents in a portable and composable way. From f15e1e21ea48a20f64e70d2fe2962e5e60ffa78b Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 10:18:48 -0800 Subject: [PATCH 03/21] nit portable -> efficient Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index eeff5cdc..648e3f3d 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -81,7 +81,7 @@ Many projects and platforms, from new startups to established enterprises, are t ## WebAssembly Components and Wasmcp -While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building portable, secure, and efficient applications more generally. Wasm components are composable, self-contained binaries that can be compiled from various programming languages and run portably across a range of host devices while remaining sandboxed from host resources. +While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building portable, secure, and efficient applications more generally. Wasm components are composable, self-contained binaries that can be compiled from various programming languages and run efficiently across a range of host devices while remaining sandboxed from host resources. The architectural goals of Wasm's [component model](https://component-model.bytecodealliance.org/) align clearly with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles). MCP servers are intended to be progressively composed of features, which we can directly map to individual Wasm component binaries. We could author a few components covering various MCP tools, and some anothers for MCP resources, then compose them together as binaries into a complete MCP server component. 
From 53aa7fc054a45aa66cac0812d717a205d0abe60c Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 10:21:36 -0800 Subject: [PATCH 04/21] nit dedup Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index 648e3f3d..36edd89e 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -81,7 +81,7 @@ Many projects and platforms, from new startups to established enterprises, are t ## WebAssembly Components and Wasmcp -While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building portable, secure, and efficient applications more generally. Wasm components are composable, self-contained binaries that can be compiled from various programming languages and run efficiently across a range of host devices while remaining sandboxed from host resources. +While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building applications more generally. Wasm components are composable, self-contained binaries that can be compiled from various programming languages and run portably and efficiently across a range of host devices while remaining sandboxed from host resources. The architectural goals of Wasm's [component model](https://component-model.bytecodealliance.org/) align clearly with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles). MCP servers are intended to be progressively composed of features, which we can directly map to individual Wasm component binaries. We could author a few components covering various MCP tools, and some anothers for MCP resources, then compose them together as binaries into a complete MCP server component. From ee78ddf8a3ff63208b344c6e017f513f3743e994 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 10:49:06 -0800 Subject: [PATCH 05/21] Add a relevant link to the Berkeley Function-Calling Leaderboard, which is a great resource for learning about function/tool calling more deeply. Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index 36edd89e..02dbd01e 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -20,7 +20,7 @@ Together they form a polyglot toolchain for extending the capabilities of langua Large language models (LLMs) are trained on vast heaps of data that they use to generate natural language responses to input queries. But that knowledge is static once training is over. They are unable to answer simple questions that require current data, like “What time is it?” or “What's the weather tomorrow in Atlanta?”. This highlights the gap between a simple model and an intelligent system that can actually *do* things and acquire new information, or context, dynamically. This is generally where the term *agent* starts to enter the conversation. -All LLMs are dependent on functions, also called tools, to interact with the outside world beyond the prompt and to perform deterministic actions. Just like you might use a calculator to accurately crunch numbers, or a web browser to explore the internet, an LLM might use its own calculator and fetch tools in the same way. 
Even basic capabilities like reading a file from disk are implemented via tools. +All LLMs are dependent on calling external [functions](https://gorilla.cs.berkeley.edu/leaderboard.html), also called tools, to interact with the outside world beyond the prompt and to perform deterministic actions. Just like you might use a calculator to accurately crunch numbers, or a web browser to explore the internet, an LLM might use its own calculator and fetch tools in the same way. Even basic capabilities like reading a file from disk are implemented via tools. Without tools a language model is like someone sitting in an empty, windowless box with only their memories from an array of random encyclopedias, books, etc. (training data) to pull from. Our interactions with them are something along the lines of: A human slips a question written on a piece of paper under the door for the model to read, and the model slips back a response using only their prior knowledge and imagination. From f29bc6d114ba02f139b528beb5bbc85fc2a04d36 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 10:51:52 -0800 Subject: [PATCH 06/21] nit 'are dependent on' -> 'depend on' Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index 02dbd01e..189a422b 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -20,7 +20,7 @@ Together they form a polyglot toolchain for extending the capabilities of langua Large language models (LLMs) are trained on vast heaps of data that they use to generate natural language responses to input queries. But that knowledge is static once training is over. They are unable to answer simple questions that require current data, like “What time is it?” or “What's the weather tomorrow in Atlanta?”. This highlights the gap between a simple model and an intelligent system that can actually *do* things and acquire new information, or context, dynamically. This is generally where the term *agent* starts to enter the conversation. -All LLMs are dependent on calling external [functions](https://gorilla.cs.berkeley.edu/leaderboard.html), also called tools, to interact with the outside world beyond the prompt and to perform deterministic actions. Just like you might use a calculator to accurately crunch numbers, or a web browser to explore the internet, an LLM might use its own calculator and fetch tools in the same way. Even basic capabilities like reading a file from disk are implemented via tools. +All LLMs depend on calling external [functions](https://gorilla.cs.berkeley.edu/leaderboard.html), also called tools, to interact with the outside world beyond the prompt and to perform deterministic actions. Just like you might use a calculator to accurately crunch numbers, or a web browser to explore the internet, an LLM might use its own calculator and fetch tools in the same way. Even basic capabilities like reading a file from disk are implemented via tools. Without tools a language model is like someone sitting in an empty, windowless box with only their memories from an array of random encyclopedias, books, etc. (training data) to pull from. Our interactions with them are something along the lines of: A human slips a question written on a piece of paper under the door for the model to read, and the model slips back a response using only their prior knowledge and imagination. 
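
To make the tool-calling mechanics concrete, here is a minimal, provider-agnostic sketch of a single tool-calling loop. The message and reply shapes are made up for illustration; real provider APIs differ, but the flow is the same: the model asks for a tool by name with JSON arguments, the application runs it, and the result is fed back in until the model can answer in plain text.

```python
import json
from datetime import datetime, timezone

def get_current_time(_args: dict) -> str:
    return datetime.now(timezone.utc).isoformat()

TOOLS = {"get_current_time": get_current_time}

def run(user_question: str, call_model) -> str:
    # call_model is assumed to take the running message list and return either
    # {"text": ...} or {"tool_call": {"name": ..., "arguments": "<json string>"}}.
    messages = [{"role": "user", "content": user_question}]
    while True:
        reply = call_model(messages)
        if "tool_call" in reply:
            call = reply["tool_call"]
            result = TOOLS[call["name"]](json.loads(call.get("arguments") or "{}"))
            messages.append({"role": "tool", "name": call["name"], "content": result})
            continue
        return reply["text"]
```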
From 6cc6632da722f66a83fb4adff519e752a009877f Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 10:56:16 -0800 Subject: [PATCH 07/21] nit 'some anothers' -> 'some others' Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index 189a422b..0c33dae6 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -83,7 +83,7 @@ Many projects and platforms, from new startups to established enterprises, are t While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building applications more generally. Wasm components are composable, self-contained binaries that can be compiled from various programming languages and run portably and efficiently across a range of host devices while remaining sandboxed from host resources. -The architectural goals of Wasm's [component model](https://component-model.bytecodealliance.org/) align clearly with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles). MCP servers are intended to be progressively composed of features, which we can directly map to individual Wasm component binaries. We could author a few components covering various MCP tools, and some anothers for MCP resources, then compose them together as binaries into a complete MCP server component. +The architectural goals of Wasm's [component model](https://component-model.bytecodealliance.org/) align clearly with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles). MCP servers are intended to be progressively composed of features, which we can directly map to individual Wasm component binaries. We could author a few components covering various MCP tools, and some others for MCP resources, then compose them together as binaries into a complete MCP server component. Wasm components are inherently sandboxed from host resources with least privilege access by default, resulting in a light and secure way for agents to run untrusted code on a given machine. Moreover, individual components within that sandboxed process can only interop through explicit interfaces. A visual analogy for this idea might look like a bento box. From 96b98e11515ea8d2f0a33494eabb8dc5ede6b626 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 11:16:07 -0800 Subject: [PATCH 08/21] nit clarify about 'agent SDKs' etc. and add a relevant link. Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index 0c33dae6..d435b508 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -38,11 +38,11 @@ Problem solved, right? Not quite… We’d need to implement a new version of our tool for OpenAI’s GPT models, and another for the Claude family, another for Gemini, etc. So M models x N tools = T total tool implementations. Consider that fetch is only one example, and we might want many different kinds of tools available for various tasks. -Agent SDKs can solve this problem directly at the library level by implementing support for multiple models and exposing a common interface for tools over them. 
+[AI SDKs](https://ai-sdk.dev/docs/ai-sdk-core/tools-and-tool-calling) can solve this problem directly for a given programming language by implementing support for multiple models and exposing a common interface for tools over them. -**Problem 2**: Even if tool implementations are not coupled to specific models, they become coupled to the specific agent SDKs used to implement them. Because models themselves have no built-in way of discovering and connecting to new tools over the wire, the tools must run alongside the same code that calls inference to implement the agent's loop. +**Problem 2**: Even if tool implementations are not coupled to specific models, they become coupled to the specific SDK used to implement them, and by extension to the runtime of that SDK. Because models themselves have no built-in way of discovering and connecting to new tools over the wire, the tools must run alongside the same code that calls inference to implement the agent's loop. -We want tools to be discoverable and accessible portably, potentially across the air, at scale. We need a layer of indirection between models and their tools. +We want tools to be discoverable and accessible to existing agent processes, potentially across the air, at scale. We need a layer of indirection between models and their tools. ## The Model Context Protocol From 033ba6a193c12c9b71e0fe288508ef6971352a72 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 11:23:03 -0800 Subject: [PATCH 09/21] nit headers Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index d435b508..dbcb5982 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -79,7 +79,7 @@ Local MCP servers are relatively simple to implement over the stdio transport. O Many projects and platforms, from new startups to established enterprises, are taking a crack at solutions to each of these problems. These implementations are often piecemeal and built on proprietary stacks. But there is a unique intersection of emerging technologies that stands to provide a more holistic, portable, and open foundation. -## WebAssembly Components and Wasmcp +## WebAssembly Components While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building applications more generally. Wasm components are composable, self-contained binaries that can be compiled from various programming languages and run portably and efficiently across a range of host devices while remaining sandboxed from host resources. @@ -99,6 +99,8 @@ Simply using existing MCP server SDKs on WebAssembly is increasingly possible, b This is where [wasmcp](https://github.com/wasmcp/wasmcp) comes in. +## [Wasmcp](https://github.com/wasmcp/wasmcp) + Wasmcp isn’t a runtime, and it’s not exactly an SDK. It’s a collection of WebAssembly components and tooling that work together to function as a polyglot framework for authoring MCP features as WebAssembly components. The result is a single MCP server as a WebAssembly component binary. Many MCP patterns that would otherwise require external gateways become possible in memory within a single binary via composition. From 06ca73a3a3d31e3943d5b42b6ae893b7e2956d38 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 11:41:14 -0800 Subject: [PATCH 10/21] Add more explanation about how wasmcp composes things. 
Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 37 ++++++++++++++++++++++++++++++--- 1 file changed, 34 insertions(+), 3 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index dbcb5982..905a00fd 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -234,9 +234,9 @@ You can now configure your favorite agent to use the MCP server. * [Visual Studio Code](https://code.visualstudio.com/docs/copilot/customization/mcp-servers) * [Zed](https://zed.dev/docs/ai/mcp) -## Unlimited Feature Composition +## Unlimited Composition -The real power of the component model becomes apparent when adding another tool component to our server. We'll use Python this time. +The real power of the component model and wasmcp's composition architecture becomes apparent when adding another tool component to our server. We'll use Python this time. ``` $ wasmcp new python-tools –language python @@ -317,11 +317,42 @@ Serving http://127.0.0.1:3000 Now our server has four tools: `add`, `subtract`, `reverse`, and `uppercase`! Two are implemented in Python, and two in Rust. +### Wasmcp's architecture + +Server features like tools, resources, prompts, and completions, are implemented by individual WebAssembly components that export narrow [WIT](https://component-model.bytecodealliance.org/design/wit.html) interfaces mapped from the MCP spec's [schema types](https://modelcontextprotocol.io/specification/draft/schema). + +`wasmcp compose` plugs these feature components into framework components and composes them together behind a transport component as a complete middleware [chain of responsibility](https://en.wikipedia.org/wiki/Chain-of-responsibility_pattern) that implements an MCP server. + +Any of the wasmcp framework components, like the transport, can be swapped out for custom implementations during composition, enabling flexible server configurations. + +``` +Transport + ↓ + Middleware₀ + ↓ + Middleware₁ + ↓ + Middleware₂ + ↓ + ... + ↓ + Middlewareₙ + ↓ + MethodNotFound +``` + +Each component: +- Handles requests it understands (e.g., `tools/call`) +- Delegates others downstream +- Merges results (e.g., combining tool lists) + +This enables dynamic composition without complex configuration, all within a single Wasm binary. Think Unix pipes for MCP using Wasm components. + This example only scratched the surface of what we can potentially do with `wasmcp`. To see some of the more advanced patterns like custom middleware components and session-enabled features, check out the [examples](https://github.com/wasmcp/wasmcp/tree/main/examples). ## Publishing to OCI Registries -We can use [wkg](https://github.com/bytecodealliance/wasm-pkg-tools) to publish our server to an [OCI](https://opencontainers.org/) registry, like [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry), [Docker Hub](https://docs.docker.com/docker-hub/repos/manage/hub-images/oci-artifacts/), or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/). 
+We can use [wkg](https://github.com/bytecodealliance/wasm-pkg-tools) to publish our server to an [OCI](https://opencontainers.org/) registry like [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry), [Docker Hub](https://docs.docker.com/docker-hub/repos/manage/hub-images/oci-artifacts/), or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/). ``` $ wkg publish polyglot.wasm --package mygithub:basic-utils@0.1.0 From 64d29fadebfc87c69d6d6a5c63767e3d8953eef4 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 11:51:29 -0800 Subject: [PATCH 11/21] nit conclusion Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index 905a00fd..a00b890d 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -372,8 +372,8 @@ We can publish any individual component, or any sequence of composed MCP feature ## An Open Foundation for AI Agents -By building on two complementary open standards, MCP and the WebAssembly component model, we can expose new context to AI applications and agents in a portable and composable way. +By building on two complementary open standards, MCP and the WebAssembly component model, we can expose new context to AI applications and agents in a useful way that solves some of the current challenges towards achieving that goal. -To distribute an MCP server over the network, we can target Spin-compatible cloud platforms like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across the global network edge with access to application-scoped key-value storage. [SpinKube](https://www.spinkube.dev/), which you can host on your own infrastructure, unlocks another level of flexibility. Any platform or runtime that directly supports the Wasm component model becomes a valid deployment target for the same component binary. A hypothetical MCP-specific hosting platform could even leverage this architecture to safely run user-submitted MCP features more directly. +To distribute an MCP server as a Wasm component over the network, we can target Spin-compatible cloud platforms like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across the global network edge with access to application-scoped key-value storage. [SpinKube](https://www.spinkube.dev/), which you can host on your own infrastructure, unlocks another level of flexibility. Any platform or runtime that directly supports the Wasm component model becomes a valid deployment target for the same component binary. A hypothetical MCP-specific hosting platform could even leverage this architecture to safely run user-submitted MCP features more directly. This story will only get better as Wasm components improve alongside active advances in language models. From a829420365a0569f3a87c93263b97f0c4ad85349 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 11:58:31 -0800 Subject: [PATCH 12/21] Add link to quickstart section at the top. 
Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index a00b890d..db6a87cc 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -16,6 +16,8 @@ author = "Ian McDonald" Together they form a polyglot toolchain for extending the capabilities of language models in a composable, portable, and secure way. +See the [quickstart](#quickstart) or read on for some context. + ## What are tools? Large language models (LLMs) are trained on vast heaps of data that they use to generate natural language responses to input queries. But that knowledge is static once training is over. They are unable to answer simple questions that require current data, like “What time is it?” or “What's the weather tomorrow in Atlanta?”. This highlights the gap between a simple model and an intelligent system that can actually *do* things and acquire new information, or context, dynamically. This is generally where the term *agent* starts to enter the conversation. From f1316215d79a470a92d9c792da16fe97b585521d Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 12:12:04 -0800 Subject: [PATCH 13/21] Add link to Wassette and explain the relationship to wasmcp. Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index db6a87cc..85221557 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -370,7 +370,13 @@ $ spin up -f mygithub:basic-utils@0.1.0.wasm Serving http://127.0.0.1:3000 ``` -We can publish any individual component, or any sequence of composed MCP feature components and middleware, as a standalone artifact in the same way. This enables dynamic and flexible composition of reusable components across servers in a kind of recursive drag-and-drop way, supporting composition and distribution of pre-built patterns which are themselves further composable. See `wasmcp compose --help` for more details on creating standalone feature compositions. +We can publish any individual component, or any sequence of composed MCP feature components and middleware, as a standalone artifact in the same way. This enables dynamic and flexible composition of reusable components across servers in a kind of recursive drag-and-drop way, supporting composition and distribution of pre-built patterns which are themselves further composable. See `wasmcp compose --help` for details. + +## Related projects + +Microsoft's [Wassette](https://github.com/microsoft/wassette) is a security-oriented runtime that runs WebAssembly Components via MCP. + +While Wassette is a custom MCP-specific runtime that can dynamically load and execute components as individual tools on demand with deeply integrated access control, Wasmcp is a toolchain for producing an MCP server as a component that is compatible across Wasm runtimes. 
## An Open Foundation for AI Agents From c6248ea34f56edace40b31d02d45b6eb9842d1ec Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 12:34:15 -0800 Subject: [PATCH 14/21] nit about current MCP problems Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index 85221557..de9d5882 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -75,7 +75,7 @@ So why don't we see every org implementing MCP servers to integrate their applic Local MCP servers are relatively simple to implement over the stdio transport. Official SDKs and advanced third party frameworks are available in nearly every programming language. But distributing an MCP server, either as a local installation or as a service over the network, presents a number of new challenges. 1. Local MCP servers can be an attack vector for exploiting host resources, unless they run in a sandboxed environment. -2. Much of the value proposition of MCP, particularly its advanced bidirectional features and tool discovery mechanisms, is locked behind the need for stateful server sessions. This means that we either need servers to run as long-lived processes, keeping their session state directly onboard, or else we need to manage the infrastructure for external session state plus the server code that interacts with it, which may incur additional network latency and complexity. +2. Many of MCP's advanced bidirectional features and tool discovery mechanisms are locked behind a dependency on server-managed sessions. This means that we either need servers to run as long-lived processes, keeping their session state directly onboard, or else we need to manage the infrastructure for external session state plus the server code that interacts with it, which may incur additional network latency and complexity. 3. Scaling and and response latency matter. We may not initially think of the response time of remote tool calls as being important, given inference itself (especially with thinking enabled) is generally slow anyway. But consider that in answering a single query, an agent may need to make many consecutive tool calls to one or more remote MCP servers. The latency of even a few hundred milliseconds for each tool call can quickly snowball to seconds of lag. In realtime use cases like a voice or stock trading agent, even small response delays for tool calls can translate to the success or failure of the overall interaction or goal. 4. Authorization is painful. While the MCP spec does define OAuth flows, authorization is not yet straightforward to implement in practice. Currently, it requires an authorizer that supports [Dynamic Client Registration](https://datatracker.ietf.org/doc/html/rfc7591). Support for a simplified flow via [OAuth Client ID Metadata Documents](https://datatracker.ietf.org/doc/draft-ietf-oauth-client-id-metadata-document/) is confirmed for the November 2025 spec release. 
From 9f218eb23f1c831a4c88eb6cd6184fccbc3bc461 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 23:35:59 -0800 Subject: [PATCH 15/21] Respond to feedback and some nits/reorganization Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 164 ++++++++++++++++++++------------ 1 file changed, 102 insertions(+), 62 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index de9d5882..f34c715e 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -118,12 +118,12 @@ Because the [Spin](https://github.com/spinframework/spin) runtime implements [wa ## Quickstart Install wasmcp via script to get the latest release binary. -``` -$ curl -fsSL https://raw.githubusercontent.com/wasmcp/wasmcp/main/install.sh | bash +```shell +curl -fsSL https://raw.githubusercontent.com/wasmcp/wasmcp/main/install.sh | bash ``` Or build it from source. -``` -$ cargo install --git https://github.com/wasmcp/wasmcp +```shell +cargo install --git https://github.com/wasmcp/wasmcp ``` Source / open a new terminal and then scaffold out a tool component with `wasmcp new`. Only basic dependencies and build tooling from Bytecode Alliance are included. TypesScript uses [jco](https://github.com/bytecodealliance/jco), Rust uses [wit-bindgen](https://github.com/bytecodealliance/wit-bindgen), and Python uses [componentize-py](https://github.com/bytecodealliance/componentize-py). @@ -132,14 +132,14 @@ wasmcp does not maintain any language-specific SDKs. The [WIT](https://component We'll target Rust for our first one. -``` -$ wasmcp new my-first-tools --language rust -📦 Fetching WIT dependencies... +```shell +wasmcp new my-first-tools --language rust ``` -If you open up `my-first-tools/src/lib.rs`, you’ll see some boilerplate that you can fill in with your tool implementations. A single tool component can define multiple MCP tools. As we’ll see, multiple tool components can then be chained together and their tools aggregated. This pattern also applies to the other MCP primitives: [resources](https://github.com/wasmcp/wasmcp/blob/main/cli/templates/rust-resources/src/lib.rs) and [prompts](https://github.com/wasmcp/wasmcp/blob/main/cli/templates/rust-prompts/src/lib.rs) +If you open up `my-first-tools/src/lib.rs`, you’ll see some boilerplate similar to the code block below that you can fill in with your tool implementations. A single tool component can define multiple MCP tools. As we’ll see, multiple tool components can then be chained together and their tools aggregated. This pattern also applies to the other MCP primitives: [resources](https://github.com/wasmcp/wasmcp/blob/main/cli/templates/rust-resources/src/lib.rs) and [prompts](https://github.com/wasmcp/wasmcp/blob/main/cli/templates/rust-prompts/src/lib.rs) ```rust +/// my-first-tools/src/lib.rs impl Guest for Calculator { fn list_tools( _ctx: RequestCtx, @@ -200,30 +200,46 @@ This is accomplished with Bytecode Alliance’s [wac](https://github.com/bytecod Note that any of the framework-level components can also be interchanged with your own custom implementations, like a custom transport component. See `wasmcp compose server –help` for details. -``` -$ make -$ wasmcp compose server target/wasm32-wasip2/release/my-first-tools.wasm -o server.wasm +```shell +cd my-first-tools/ +make +wasmcp compose server target/wasm32-wasip2/release/my-first-tools.wasm -o server.wasm ``` Now that we have a complete `server.wasm` component, we can run it directly with `spin up`. 
+```shell +spin up --from server.wasm ``` -$ spin up -f server.wasm -Serving http://127.0.0.1:3000 -Available Routes: - http-trigger1-component: http://127.0.0.1:3000 (wildcard) -``` +We can provide more detailed runtime configuration with a [spin.toml](https://spinframework.dev/v3/writing-apps) file. + +```toml +# my-first-tools/spin.toml +spin_manifest_version = 2 -Note that runtime configuration can be managed with a [spin.toml](https://spinframework.dev/v3/writing-apps) file. +[application] +name = "mcp" +version = "0.1.0" +authors = ["You "] +description = "My MCP server" -And _just like that_, we have a functional MCP server over the Streamable HTTP transport! Authorization for providers that implement Dynamic Client Registration is configurable via [environment variables](https://github.com/wasmcp/wasmcp/tree/main/docs). The `stdio` transport can also be used via [Wasmtime](https://github.com/bytecodealliance/wasmtime) directly. +[[trigger.http]] +route = "/mcp" +component = "mcp" +[component.mcp] +source = "server.wasm" +allowed_outbound_hosts = [] # Update for outbound HTTP ``` -$ wasmtime run server.wasm + +```shell +spin up --from spin.toml ``` -You can now configure your favorite agent to use the MCP server. +Just like _that_, we have a functional MCP server over the Streamable HTTP transport. + +These AI applications are just some of the many that can use this MCP server to extend their capabilities. * [Antigravity](https://antigravity.google/docs/mcp) * [ChatGPT (developer mode)](https://platform.openai.com/docs/guides/developer-mode) @@ -236,17 +252,68 @@ You can now configure your favorite agent to use the MCP server. * [Visual Studio Code](https://code.visualstudio.com/docs/copilot/customization/mcp-servers) * [Zed](https://zed.dev/docs/ai/mcp) -## Unlimited Composition +## Runtime Portability and Deployment Targets -The real power of the component model and wasmcp's composition architecture becomes apparent when adding another tool component to our server. We'll use Python this time. +The MCP server component we just created exports the standard [`wasi:http/incoming-handler`](https://github.com/WebAssembly/wasi-http) interface. This means any WebAssembly runtime that supports `wasi:http` can serve the component to MCP clients over the Streamable HTTP transport. + +For example, we can use [`wasmtime serve`](https://github.com/bytecodealliance/wasmtime): + +```shell +wasmtime serve -Scli server.wasm +``` + +Our server also exports [`wasi:cli/run`](https://github.com/WebAssembly/wasi-cli), which lets it support the stdio MCP transport. + +```shell +wasmtime run server.wasm +``` + +To deploy an MCP server as a Wasm component over the network, we can target a Spin-compatible cloud platform like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across [Akamai's](https://www.akamai.com/why-akamai/global-infrastructure)'s distributed network edge with application-scoped key-value storage. Projects like [SpinKube](https://www.spinkube.dev/) and [wasmCloud](https://github.com/wasmCloud/wasmCloud) allow MCP server components to be deployed on self-hosted Kubernetes clusters. A hypothetical MCP-specific hosting platform could potentially leverage this architecture to manage user-submitted MCP components. + +This story will expand as the ecosystems around both WebAssembly components and MCP continue to grow. 
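However we run it, we can sanity-check the endpoint by hand with a plain JSON-RPC request. The snippet below is a sketch rather than part of the official wasmcp or Spin docs: it assumes the server is listening at `http://127.0.0.1:3000/mcp` (adjust the host and route to whatever your runtime prints at startup) and sends the MCP `initialize` request over the Streamable HTTP transport.

```shell
# Hedged smoke test: the URL and route are assumptions; adjust them to your setup.
# Streamable HTTP requires an Accept header offering both JSON and SSE.
curl -sS http://127.0.0.1:3000/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"smoke-test","version":"0.0.1"}}}'
```

A spec-compliant server should answer with its capabilities and, if it manages sessions, an `Mcp-Session-Id` response header to include on follow-up requests such as `tools/list`.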
+ +## Publishing to OCI Registries + +With a `spin.toml` file like the one above, we can use the `spin registry` command to publish our server component to an [OCI](https://opencontainers.org/) registry like [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry), [Docker Hub](https://docs.docker.com/docker-hub/repos/manage/hub-images/oci-artifacts/), or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/). + +```shell +echo $GHCR_PAT | spin registry login --username mygithub --password-stdin ghcr.io +spin registry push ghcr.io/mygithub/basic-utils:0.1.0 +``` + +`spin up` can automatically resolve a component from the registry. + +```shell +spin up --from ghcr.io/mygithub/basic-utils:0.1.0 +``` + +We can also use [wkg](https://github.com/bytecodealliance/wasm-pkg-tools) directly to publish our server to an [OCI](https://opencontainers.org/) registry like [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry), [Docker Hub](https://docs.docker.com/docker-hub/repos/manage/hub-images/oci-artifacts/), or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/). +```shell +wkg oci push ghcr.io/mygithub/basic-utils:0.1.0 polyglot.wasm ``` -$ wasmcp new python-tools –language python -$ cd python-tools # Develop -$ make # Builds to python-tools/python-tools.wasm + +Anyone with read access to this artifact can then pull the component using `wkg` to run it with a particular runtime. + +```shell +wkg oci pull ghcr.io/mygithub/basic-utils:0.1.0 +wasmtime serve -Scli mygithub:basic-utils@0.1.0.wasm +``` + +We can publish any individual MCP feature component, or any sequence of composed components (which need not be servers) as a standalone artifact in the same way. This allows for composition and distribution of pre-built component patterns which are themselves not yet servers, and are further composable. See `wasmcp compose --help` for details. + +## The Architecture of Wasmcp + +The real power of the component model and wasmcp's composition architecture becomes apparent when adding another tool component to our server. We'll use Python this time. + +```shell +wasmcp new python-tools –-language python +cd python-tools +make ``` ```python +# python-tools/app.py class StringsTools(exports.Tools): def list_tools( self, @@ -305,21 +372,18 @@ class StringsTools(exports.Tools): We compose our first and second tool components together by adding the paths to both tool component binaries in the `wasmcp compose server` arguments. Note that these local paths can be substituted for OCI registry artifacts. See `wasmcp compose server –help` for details. -``` -$ wasmcp compose server ./my-first-tools/target/wasm32-wasip2/release/my-first-tools.wasm ./python-tools/python-tools.wasm -o polyglot.wasm - +```shell +wasmcp compose server ./my-first-tools/target/wasm32-wasip2/release/my-first-tools.wasm ./python-tools/python-tools.wasm -o polyglot.wasm ``` Run `polyglot.wasm` with `spin up`. -``` -$ spin up -f polyglot.wasm - -Serving http://127.0.0.1:3000 +```shell +spin up -f polyglot.wasm ``` Now our server has four tools: `add`, `subtract`, `reverse`, and `uppercase`! Two are implemented in Python, and two in Rust. -### Wasmcp's architecture +### How? 
Server features like tools, resources, prompts, and completions, are implemented by individual WebAssembly components that export narrow [WIT](https://component-model.bytecodealliance.org/design/wit.html) interfaces mapped from the MCP spec's [schema types](https://modelcontextprotocol.io/specification/draft/schema). @@ -352,36 +416,12 @@ This enables dynamic composition without complex configuration, all within a sin This example only scratched the surface of what we can potentially do with `wasmcp`. To see some of the more advanced patterns like custom middleware components and session-enabled features, check out the [examples](https://github.com/wasmcp/wasmcp/tree/main/examples). -## Publishing to OCI Registries - -We can use [wkg](https://github.com/bytecodealliance/wasm-pkg-tools) to publish our server to an [OCI](https://opencontainers.org/) registry like [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry), [Docker Hub](https://docs.docker.com/docker-hub/repos/manage/hub-images/oci-artifacts/), or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/). - -``` -$ wkg publish polyglot.wasm --package mygithub:basic-utils@0.1.0 -``` - -Anyone with read access to this artifact can then download and run the server using `wkg`. - -``` -$ wkg get mygithub:basic-utils@0.1.0 - -$ spin up -f mygithub:basic-utils@0.1.0.wasm - -Serving http://127.0.0.1:3000 -``` - -We can publish any individual component, or any sequence of composed MCP feature components and middleware, as a standalone artifact in the same way. This enables dynamic and flexible composition of reusable components across servers in a kind of recursive drag-and-drop way, supporting composition and distribution of pre-built patterns which are themselves further composable. See `wasmcp compose --help` for details. - -## Related projects - -Microsoft's [Wassette](https://github.com/microsoft/wassette) is a security-oriented runtime that runs WebAssembly Components via MCP. +## Related Projects -While Wassette is a custom MCP-specific runtime that can dynamically load and execute components as individual tools on demand with deeply integrated access control, Wasmcp is a toolchain for producing an MCP server as a component that is compatible across Wasm runtimes. +Microsoft's [Wassette](https://github.com/microsoft/wassette) is a security-oriented runtime that runs WebAssembly Components via MCP. It can dynamically load and execute components as individual tools on demand with deeply integrated access controls. Wassette itself is not a component. It is an MCP server than runs components. -## An Open Foundation for AI Agents +By contrast, Wasmcp is a toolchain for producing an MCP server as a component that exports the standard [WASI](https://wasi.dev/) interfaces for HTTP and CLI commands: [`wasi:http`](https://github.com/WebAssembly/wasi-http) and [`wasi:cli`](https://github.com/WebAssembly/wasi-cli). This component runs on any server runtime that supports WASI and the component model. -By building on two complementary open standards, MCP and the WebAssembly component model, we can expose new context to AI applications and agents in a useful way that solves some of the current challenges towards achieving that goal. 
+## Implications -To distribute an MCP server as a Wasm component over the network, we can target Spin-compatible cloud platforms like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across the global network edge with access to application-scoped key-value storage. [SpinKube](https://www.spinkube.dev/), which you can host on your own infrastructure, unlocks another level of flexibility. Any platform or runtime that directly supports the Wasm component model becomes a valid deployment target for the same component binary. A hypothetical MCP-specific hosting platform could even leverage this architecture to safely run user-submitted MCP features more directly. - -This story will only get better as Wasm components improve alongside active advances in language models. +The ecosystems around both WebAssembly components and MCP continue to grow rapidly. As more developers adopt these technologies, we can expect to see more innovative projects and applications emerge across a variety of use cases. From 5fdec12573d8575333001a8925fdd00d3c199305 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Thu, 20 Nov 2025 23:57:07 -0800 Subject: [PATCH 16/21] more nits Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 35 ++++++++++++++++----------------- 1 file changed, 17 insertions(+), 18 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index f34c715e..18286c2a 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -58,7 +58,7 @@ MCP defines a set of context [primitives](https://modelcontextprotocol.io/specif | Resources | Application-controlled | Contextual data attached and managed by the client | File contents, git history | | Tools | Model-controlled | Functions exposed to the LLM to take actions | API POST requests, file writing | -Beyond server features, MCP defines client-hosted features that servers can call directly. For example [elicitations](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation) can be implemented by a client to allow a server to directly prompt for user input during the course of a tool call, bypassing the model as an intermediary. +Beyond server features, MCP defines client-hosted features that servers can call directly. For example, [elicitations](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation) can be implemented by a client to allow a server to directly prompt for user input during the course of a tool call, bypassing the model as an intermediary. These bidirectional features are possible because MCP is designed as an inherently bidirectional protocol based on [JSON-RPC](https://www.jsonrpc.org/specification). @@ -79,7 +79,7 @@ Local MCP servers are relatively simple to implement over the stdio transport. O 3. Scaling and and response latency matter. We may not initially think of the response time of remote tool calls as being important, given inference itself (especially with thinking enabled) is generally slow anyway. But consider that in answering a single query, an agent may need to make many consecutive tool calls to one or more remote MCP servers. The latency of even a few hundred milliseconds for each tool call can quickly snowball to seconds of lag. In realtime use cases like a voice or stock trading agent, even small response delays for tool calls can translate to the success or failure of the overall interaction or goal. 4. Authorization is painful. 
While the MCP spec does define OAuth flows, authorization is not yet straightforward to implement in practice. Currently, it requires an authorizer that supports [Dynamic Client Registration](https://datatracker.ietf.org/doc/html/rfc7591). Support for a simplified flow via [OAuth Client ID Metadata Documents](https://datatracker.ietf.org/doc/draft-ietf-oauth-client-id-metadata-document/) is confirmed for the November 2025 spec release. -Many projects and platforms, from new startups to established enterprises, are taking a crack at solutions to each of these problems. These implementations are often piecemeal and built on proprietary stacks. But there is a unique intersection of emerging technologies that stands to provide a more holistic, portable, and open foundation. +There is a unique intersection of emerging technologies that could address some of this pain and more. ## WebAssembly Components @@ -93,9 +93,9 @@ We can push, pull, and compose component binaries from OCI registries just like This means that dynamic composition of published artifacts can happen truly on-the-fly relative to other virtualization options. In only a few seconds we can pull a set of OCI-hosted component binaries that implement individual MCP [server features](https://modelcontextprotocol.io/docs/learn/server-concepts#core-server-features), compose them into a sandboxed MCP server, and start it up. We can also distribute fully composed server components on OCI registries in the same way. -Existing MCP SDKs are fragmented across language ecosystems and generally require long-lived compute or external session backends to implement advanced bidirectional features over the network, if they are supported at all. By contrast, the component model opens the door to safely composing MCP features as component binaries against standard [WASI](https://wasi.dev/) interfaces that runtimes and platforms implement under the hood. This architecture allows for naturally progressive and dynamically configurable servers that expose advanced session-enabled features with portability across component runtimes and platforms. +Existing MCP SDKs are fragmented across language ecosystems and generally require long-lived compute or external session backends to implement advanced bidirectional features over the network, if they are supported at all. By contrast, the component model opens the door to safely composing MCP features as component binaries against standard [WASI](https://wasi.dev/) interfaces like [wasi:keyvalue](https://github.com/WebAssembly/wasi-keyvalue), which runtimes and platforms implement under the hood. This architecture allows for session-enabled features to be implemented portably without being tied to some particular implementation of the external session state bucket. -But first we need a way to author MCP server features as WebAssembly components, and we need the tooling to compose these components into functional, spec-compliant servers that run on portably across WebAssembly runtimes. +But first we need a way to author MCP server features as WebAssembly components, and we need the tooling to compose these components into functional, spec-compliant servers that run portably across WebAssembly runtimes. Simply using existing MCP server SDKs on WebAssembly is increasingly possible, but this approach treats runtime compatibility as an obstacle to overcome, with basic parity as the final goal. 
Instead we want to leverage the strengths of the component model itself as a paradigm to enable the patterns we just explored. @@ -126,7 +126,7 @@ Or build it from source. cargo install --git https://github.com/wasmcp/wasmcp ``` -Source / open a new terminal and then scaffold out a tool component with `wasmcp new`. Only basic dependencies and build tooling from Bytecode Alliance are included. TypesScript uses [jco](https://github.com/bytecodealliance/jco), Rust uses [wit-bindgen](https://github.com/bytecodealliance/wit-bindgen), and Python uses [componentize-py](https://github.com/bytecodealliance/componentize-py). +Open a new terminal and then scaffold out a tool component with `wasmcp new`. Only basic dependencies and build tooling from Bytecode Alliance are included. TypesScript uses [jco](https://github.com/bytecodealliance/jco), Rust uses [wit-bindgen](https://github.com/bytecodealliance/wit-bindgen), and Python uses [componentize-py](https://github.com/bytecodealliance/componentize-py). wasmcp does not maintain any language-specific SDKs. The [WIT](https://component-model.bytecodealliance.org/design/wit.html) language describes the framework boundary. @@ -239,8 +239,7 @@ spin up --from spin.toml Just like _that_, we have a functional MCP server over the Streamable HTTP transport. -These AI applications are just some of the many that can use this MCP server to extend their capabilities. - +These AI applications are just some of the many that can use this MCP server to extend their capabilities: * [Antigravity](https://antigravity.google/docs/mcp) * [ChatGPT (developer mode)](https://platform.openai.com/docs/guides/developer-mode) * [Claude Code](https://code.claude.com/docs/en/mcp) @@ -268,7 +267,7 @@ Our server also exports [`wasi:cli/run`](https://github.com/WebAssembly/wasi-cli wasmtime run server.wasm ``` -To deploy an MCP server as a Wasm component over the network, we can target a Spin-compatible cloud platform like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across [Akamai's](https://www.akamai.com/why-akamai/global-infrastructure)'s distributed network edge with application-scoped key-value storage. Projects like [SpinKube](https://www.spinkube.dev/) and [wasmCloud](https://github.com/wasmCloud/wasmCloud) allow MCP server components to be deployed on self-hosted Kubernetes clusters. A hypothetical MCP-specific hosting platform could potentially leverage this architecture to manage user-submitted MCP components. +To deploy an MCP server as a Wasm component over the network, we can target a Spin-compatible cloud platform like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across [Akamai's](https://www.akamai.com/why-akamai/global-infrastructure)'s distributed network edge with application-scoped key-value storage. Projects like [SpinKube](https://www.spinkube.dev/) and [wasmCloud](https://github.com/wasmCloud/wasmCloud) allow MCP server components to be deployed on self-hosted Kubernetes clusters. A hypothetical MCP-specific platform could potentially leverage this architecture to manage user-submitted MCP components. This story will expand as the ecosystems around both WebAssembly components and MCP continue to grow. 
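Before wiring any of these deployments into an agent, it can help to browse the server interactively. The invocation below is a sketch that assumes Node.js is installed and that the server is reachable at the URL your runtime or platform reported; it is not specific to wasmcp.

```shell
# Launches the official MCP Inspector UI locally (requires Node.js/npm).
# In the UI, choose the Streamable HTTP transport and enter the server URL
# reported by spin, wasmtime, or your hosting platform to list and call tools.
npx @modelcontextprotocol/inspector
```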
@@ -287,22 +286,22 @@ spin registry push ghcr.io/mygithub/basic-utils:0.1.0 spin up --from ghcr.io/mygithub/basic-utils:0.1.0 ``` -We can also use [wkg](https://github.com/bytecodealliance/wasm-pkg-tools) directly to publish our server to an [OCI](https://opencontainers.org/) registry like [GitHub Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry), [Docker Hub](https://docs.docker.com/docker-hub/repos/manage/hub-images/oci-artifacts/), or [Amazon Elastic Container Registry](https://aws.amazon.com/ecr/). +We can also use [wkg](https://github.com/bytecodealliance/wasm-pkg-tools) directly to publish our server. ```shell wkg oci push ghcr.io/mygithub/basic-utils:0.1.0 polyglot.wasm ``` -Anyone with read access to this artifact can then pull the component using `wkg` to run it with a particular runtime. +Anyone with read access to this artifact can then pull the component using `wkg` to run it with their runtime of choice. ```shell wkg oci pull ghcr.io/mygithub/basic-utils:0.1.0 wasmtime serve -Scli mygithub:basic-utils@0.1.0.wasm ``` -We can publish any individual MCP feature component, or any sequence of composed components (which need not be servers) as a standalone artifact in the same way. This allows for composition and distribution of pre-built component patterns which are themselves not yet servers, and are further composable. See `wasmcp compose --help` for details. +We can publish any individual MCP feature component, or any sequence of composed components (which need not be servers) as a standalone artifact in the same way. This allows for composition and distribution of pre-built middleware stacks which are further composable into servers. See `wasmcp compose --help` for details. -## The Architecture of Wasmcp +## Advanced Composition and Wasmcp Architecture The real power of the component model and wasmcp's composition architecture becomes apparent when adding another tool component to our server. We'll use Python this time. @@ -385,11 +384,11 @@ Now our server has four tools: `add`, `subtract`, `reverse`, and `uppercase`! Tw ### How? -Server features like tools, resources, prompts, and completions, are implemented by individual WebAssembly components that export narrow [WIT](https://component-model.bytecodealliance.org/design/wit.html) interfaces mapped from the MCP spec's [schema types](https://modelcontextprotocol.io/specification/draft/schema). +Server features like tools, resources, prompts, and completions are implemented by individual WebAssembly components that export narrow [WIT](https://component-model.bytecodealliance.org/design/wit.html) interfaces mapped from the MCP spec's [schema types](https://modelcontextprotocol.io/specification/draft/schema). -`wasmcp compose` plugs these feature components into framework components and composes them together behind a transport component as a complete middleware [chain of responsibility](https://en.wikipedia.org/wiki/Chain-of-responsibility_pattern) that implements an MCP server. +`wasmcp compose` plugs these feature components into framework middleware components and composes them together as a [chain of responsibility](https://en.wikipedia.org/wiki/Chain-of-responsibility_pattern) that implements an MCP server. -Any of the wasmcp framework components, like the transport, can be swapped out for custom implementations during composition, enabling flexible server configurations. 
+Any of the components in the chain, like the transport, can be swapped out during composition. ``` Transport @@ -412,9 +411,9 @@ Each component: - Delegates others downstream - Merges results (e.g., combining tool lists) -This enables dynamic composition without complex configuration, all within a single Wasm binary. Think Unix pipes for MCP using Wasm components. +This enables dynamic composition of component binaries into a single MCP server. Sequences of middleware components can be composed together to form reusable functionality that can be saved and plugged into multiple servers. -This example only scratched the surface of what we can potentially do with `wasmcp`. To see some of the more advanced patterns like custom middleware components and session-enabled features, check out the [examples](https://github.com/wasmcp/wasmcp/tree/main/examples). +Advanced patterns featuring custom middleware components, session-enabled features, and SSE streaming are available at [examples](https://github.com/wasmcp/wasmcp/tree/main/examples). ## Related Projects @@ -422,6 +421,6 @@ Microsoft's [Wassette](https://github.com/microsoft/wassette) is a security-orie By contrast, Wasmcp is a toolchain for producing an MCP server as a component that exports the standard [WASI](https://wasi.dev/) interfaces for HTTP and CLI commands: [`wasi:http`](https://github.com/WebAssembly/wasi-http) and [`wasi:cli`](https://github.com/WebAssembly/wasi-cli). This component runs on any server runtime that supports WASI and the component model. -## Implications +## Futures The ecosystems around both WebAssembly components and MCP continue to grow rapidly. As more developers adopt these technologies, we can expect to see more innovative projects and applications emerge across a variety of use cases. From cf50e396f2f50ef41bd48222ce9a431bb55576cb Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Fri, 21 Nov 2025 08:59:29 -0800 Subject: [PATCH 17/21] fix lingering typos and punctuation Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index 18286c2a..bbd9fb73 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -198,7 +198,7 @@ Now let’s build this component and compose it into a full MCP server. The `was This is accomplished with Bytecode Alliance’s [wac](https://github.com/bytecodealliance/wac) tooling, which you can also use directly for composition. -Note that any of the framework-level components can also be interchanged with your own custom implementations, like a custom transport component. See `wasmcp compose server –help` for details. +Note that any of the framework-level components can also be interchanged with your own custom implementations, like a custom transport component. See `wasmcp compose server --help` for details. ```shell cd my-first-tools/ @@ -267,7 +267,7 @@ Our server also exports [`wasi:cli/run`](https://github.com/WebAssembly/wasi-cli wasmtime run server.wasm ``` -To deploy an MCP server as a Wasm component over the network, we can target a Spin-compatible cloud platform like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across [Akamai's](https://www.akamai.com/why-akamai/global-infrastructure)'s distributed network edge with application-scoped key-value storage. 
Projects like [SpinKube](https://www.spinkube.dev/) and [wasmCloud](https://github.com/wasmCloud/wasmCloud) allow MCP server components to be deployed on self-hosted Kubernetes clusters. A hypothetical MCP-specific platform could potentially leverage this architecture to manage user-submitted MCP components. +To deploy an MCP server as a Wasm component over the network, we can target a Spin-compatible cloud platform like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across [Akamai](https://www.akamai.com/why-akamai/global-infrastructure)'s distributed network edge with application-scoped key-value storage. Projects like [SpinKube](https://www.spinkube.dev/) and [wasmCloud](https://github.com/wasmCloud/wasmCloud) allow MCP server components to be deployed on self-hosted Kubernetes clusters. A hypothetical MCP-specific platform could potentially leverage this architecture to manage user-submitted MCP components. This story will expand as the ecosystems around both WebAssembly components and MCP continue to grow. @@ -369,7 +369,7 @@ class StringsTools(exports.Tools): return None # We don't handle this tool ``` -We compose our first and second tool components together by adding the paths to both tool component binaries in the `wasmcp compose server` arguments. Note that these local paths can be substituted for OCI registry artifacts. See `wasmcp compose server –help` for details. +We compose our first and second tool components together by adding the paths to both tool component binaries in the `wasmcp compose server` arguments. Note that these local paths can be substituted for OCI registry artifacts. See `wasmcp compose server -–help` for details. ```shell wasmcp compose server ./my-first-tools/target/wasm32-wasip2/release/my-first-tools.wasm ./python-tools/python-tools.wasm -o polyglot.wasm @@ -423,4 +423,4 @@ By contrast, Wasmcp is a toolchain for producing an MCP server as a component th ## Futures -The ecosystems around both WebAssembly components and MCP continue to grow rapidly. As more developers adopt these technologies, we can expect to see more innovative projects and applications emerge across a variety of use cases. +The ecosystems around both WebAssembly components and MCP continue to grow rapidly. As developers continue to adopt these technologies, we can expect to see more innovative projects and applications emerge across a variety of use cases. From 20a1f8971be09ae5227a5eb1d1841761cdcddefc Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Sat, 22 Nov 2025 15:10:06 -0800 Subject: [PATCH 18/21] nit clarifications and some more links Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 126 ++++++++++++++++++-------------- 1 file changed, 73 insertions(+), 53 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index bbd9fb73..addb48c4 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -14,37 +14,43 @@ author = "Ian McDonald" [Wasmcp](https://github.com/wasmcp/wasmcp) is a [WebAssembly Component](https://component-model.bytecodealliance.org/) development kit for the [Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro). -Together they form a polyglot toolchain for extending the capabilities of language models in a composable, portable, and secure way. 
+Together they form a polyglot toolchain for extending the capabilities of language models in a new way that offers unique benefits and potential solutions to current pains. See the [quickstart](#quickstart) or read on for some context. -## What are tools? +## What are Tools? -Large language models (LLMs) are trained on vast heaps of data that they use to generate natural language responses to input queries. But that knowledge is static once training is over. They are unable to answer simple questions that require current data, like “What time is it?” or “What's the weather tomorrow in Atlanta?”. This highlights the gap between a simple model and an intelligent system that can actually *do* things and acquire new information, or context, dynamically. This is generally where the term *agent* starts to enter the conversation. +Large language models (LLMs) are trained on vast heaps of data that they use to generate responses to input queries. But that knowledge is static once training is over. They are unable to answer simple questions that require current data, like “What time is it?” or “What's the weather tomorrow in Atlanta?”. This highlights the gap between a simple model and an intelligent system that can actually *do* things and acquire new information, or context, dynamically. This is generally where the term *agent* starts to enter the conversation. -All LLMs depend on calling external [functions](https://gorilla.cs.berkeley.edu/leaderboard.html), also called tools, to interact with the outside world beyond the prompt and to perform deterministic actions. Just like you might use a calculator to accurately crunch numbers, or a web browser to explore the internet, an LLM might use its own calculator and fetch tools in the same way. Even basic capabilities like reading a file from disk are implemented via tools. +All LLMs depend on calling external [functions](https://gorilla.cs.berkeley.edu/leaderboard.html), also called tools, to interact with the outside world beyond the prompt and to perform deterministic actions. Just like you might use a calculator to accurately crunch numbers, or a web browser to explore the internet, an LLM might use its own calculator and HTTP fetch tools in the same way. Even basic capabilities like reading a file from disk are implemented via tools. -Without tools a language model is like someone sitting in an empty, windowless box with only their memories from an array of random encyclopedias, books, etc. (training data) to pull from. Our interactions with them are something along the lines of: A human slips a question written on a piece of paper under the door for the model to read, and the model slips back a response using only their prior knowledge and imagination. +Without tools a language model is like someone sitting in an empty, windowless box with only their memories from an array of random encyclopedias, books, and other training data to pull from. Our interactions with them are something along the lines of: A human slips a question written on a piece of paper under the door for the model to read, and the model slips back a response using only their prior knowledge and imagination. -That's a far cry from the promise of autonomous systems that understand and act on the world in realtime, let alone transform it. +That's a long way from the promise of autonomous systems that understand and act on the world in realtime, let alone transform it. Our first thought might be to implement a simple HTTP fetch tool for our target model. 
Now that model can search the internet in a loop against user queries and, voilà, we have an *agent*. Fresh data and the means of interacting with the current state of the world become available. That windowless box gets a desktop with a browser. -Problem solved, right? Not quite… +Problem solved, right? Not quite. ## Communication Hurdles **Problem 1**: The internal representation of tools is not standard across models. In other words, their hands look different. How do we build a hammer that each of them can grip? -We’d need to implement a new version of our tool for OpenAI’s GPT models, and another for the Claude family, another for Gemini, etc. So M models x N tools = T total tool implementations. Consider that fetch is only one example, and we might want many different kinds of tools available for various tasks. +We’d need to implement a new version of each tool for OpenAI’s GPT models, and another for the Claude family, another for Gemini, etc. So M models x N tools = T total tool implementations. -[AI SDKs](https://ai-sdk.dev/docs/ai-sdk-core/tools-and-tool-calling) can solve this problem directly for a given programming language by implementing support for multiple models and exposing a common interface for tools over them. +[AI SDKs](https://ai-sdk.dev/docs/ai-sdk-core/tools-and-tool-calling) can alleviate this problem for a given programming language by implementing tool calling support for multiple models and exposing a common interface for tools over them. -**Problem 2**: Even if tool implementations are not coupled to specific models, they become coupled to the specific SDK used to implement them, and by extension to the runtime of that SDK. Because models themselves have no built-in way of discovering and connecting to new tools over the wire, the tools must run alongside the same code that calls inference to implement the agent's loop. +**Problem 2**: With AI SDKs, tools become coupled to the specific SDK used to implement them. Because models themselves have no built-in way of discovering and connecting to new tools over the wire, the tools must run alongside the same code that calls inference to implement the agent's loop. We now have S SDKs x N tools = T total tool implementations. S < M, so this is an improvement, but we're still not at N = T. -We want tools to be discoverable and accessible to existing agent processes, potentially across the air, at scale. We need a layer of indirection between models and their tools. +We want to implement a given tool only once and make it discoverable and accessible dynamically for any AI application, potentially across the air, at scale. + +The [Fundamental Theorem of Software Engineering](https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering) states: + +> We can solve any problem by introducing an extra level of [indirection](https://en.wikipedia.org/wiki/Indirection). + +We need a layer of indirection between models and their tools. ## The Model Context Protocol @@ -62,7 +68,7 @@ Beyond server features, MCP defines client-hosted features that servers can call These bidirectional features are possible because MCP is designed as an inherently bidirectional protocol based on [JSON-RPC](https://www.jsonrpc.org/specification). 
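To make the wire format concrete, here is a hypothetical `tools/call` request sent over the Streamable HTTP transport. The URL and the `a`/`b` argument names are illustrative assumptions; `add` itself is one of the calculator tools built later in this post.

```shell
# Hypothetical example: the endpoint and argument names are assumptions.
# If the server assigned a session during initialization, include its ID
# in an Mcp-Session-Id header alongside the headers below.
curl -sS http://127.0.0.1:3000/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"add","arguments":{"a":1,"b":2}}}'
```

The response is a JSON-RPC result whose `content` array carries the tool output, returned either as a single JSON body or streamed as server-sent events.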
-MCP is architected as two [layers](https://modelcontextprotocol.io/docs/learn/architecture#layers): Multiple interchangeable transports ([stdio](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#stdio), [Streamable HTTP](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#streamable-http), additional custom transports) can be used to serve the same underlying features. +MCP is architected as two [layers](https://modelcontextprotocol.io/docs/learn/architecture#layers): the Transport layer and the Data layer. Multiple interchangeable transports ([stdio](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#stdio), [Streamable HTTP](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#streamable-http), additional custom transports) can be used to serve the same underlying features. Since its release, MCP has become the only tool calling protocol with near-consensus adoption and broad client support. It continues to attract interest from both individual developers and organizations. For example, AWS [joined](https://aws.amazon.com/blogs/opensource/open-protocols-for-agent-interoperability-part-1-inter-agent-communication-on-mcp/) the MCP steering committee earlier this year in May, and OpenAI's new [apps](https://openai.com/index/introducing-apps-in-chatgpt/) are MCP-based. @@ -72,48 +78,51 @@ So why don't we see every org implementing MCP servers to integrate their applic ## The Current State of MCP -Local MCP servers are relatively simple to implement over the stdio transport. Official SDKs and advanced third party frameworks are available in nearly every programming language. But distributing an MCP server, either as a local installation or as a service over the network, presents a number of new challenges. +Local MCP servers are relatively simple to implement. Official SDKs and advanced third party frameworks are available in nearly every programming language. But distributing an MCP server, either as a local installation or as a service over the network, presents a number of new challenges. -1. Local MCP servers can be an attack vector for exploiting host resources, unless they run in a sandboxed environment. +1. Local MCP servers can be an attack vector for exploiting host resources, unless they run in a sandboxed environment. Current solutions to MCP sandboxing involve running the server in a container or virtual machine. 2. Many of MCP's advanced bidirectional features and tool discovery mechanisms are locked behind a dependency on server-managed sessions. This means that we either need servers to run as long-lived processes, keeping their session state directly onboard, or else we need to manage the infrastructure for external session state plus the server code that interacts with it, which may incur additional network latency and complexity. 3. Scaling and response latency matter. We may not initially think of the response time of remote tool calls as being important, given inference itself (especially with thinking enabled) is generally slow anyway. But consider that in answering a single query, an agent may need to make many consecutive tool calls to one or more remote MCP servers. The latency of even a few hundred milliseconds for each tool call can quickly snowball to seconds of lag. In realtime use cases like a voice or stock trading agent, even small response delays for tool calls can translate to the success or failure of the overall interaction or goal. -4. Authorization is painful.
While the MCP spec does define OAuth flows, authorization is not yet straightforward to implement in practice. Currently, it requires an authorizer that supports [Dynamic Client Registration](https://datatracker.ietf.org/doc/html/rfc7591). Support for a simplified flow via [OAuth Client ID Metadata Documents](https://datatracker.ietf.org/doc/draft-ietf-oauth-client-id-metadata-document/) is confirmed for the November 2025 spec release. +4. Authorization is not straightforward to implement. The spec-compliant auth flow requires an authorizer that supports [Dynamic Client Registration](https://datatracker.ietf.org/doc/html/rfc7591). Support for a simplified flow via [OAuth Client ID Metadata Documents](https://datatracker.ietf.org/doc/draft-ietf-oauth-client-id-metadata-document/) is confirmed for the November 2025 spec release. There is a unique intersection of emerging technologies that could address some of this pain and more. ## WebAssembly Components -While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building applications more generally. Wasm components are composable, self-contained binaries that can be compiled from various programming languages and run portably and efficiently across a range of host devices while remaining sandboxed from host resources. +While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building applications more generally. Self-contained binaries can be compiled from various programming languages and run portably and efficiently across a range of host devices while remaining sandboxed from host resources. This sandboxing capability presents a lighter alternative to containers and virtual machines that works without having to bundle layers of the operating system and its dependencies. + +The Wasm [component model](https://component-model.bytecodealliance.org/) builds on these strengths to implement a broad-reaching architecture for building interoperable WebAssembly libraries, applications, and environments. Wasm components within a single sandboxed process are further isolated from each other, with interop only allowed through explicit interfaces. A visual analogy for this idea might look like a bento box. -The architectural goals of Wasm's [component model](https://component-model.bytecodealliance.org/) align clearly with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles). MCP servers are intended to be progressively composed of features, which we can directly map to individual Wasm component binaries. We could author a few components covering various MCP tools, and some others for MCP resources, then compose them together as binaries into a complete MCP server component. +The component model shares architectural similarities with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles): -Wasm components are inherently sandboxed from host resources with least privilege access by default, resulting in a light and secure way for agents to run untrusted code on a given machine. Moreover, individual components within that sandboxed process can only interop through explicit interfaces. A visual analogy for this idea might look like a bento box. +> 1. Servers should be extremely easy to build +> 2. Servers should be highly composable +> 3. 
Servers should not be able to read the whole conversation, nor “see into” other servers +> 4. Features can be added to servers and clients progressively -We can push, pull, and compose component binaries from OCI registries just like container images. But unlike full container images or micro VMs, which bundle layers of the operating system and its dependencies, components encapsulate only their own functionality and can be mere kilobytes in size. Full servers can weigh in under 1MB. +Imagine mapping individual MCP features to Wasm components, which can be composed together to form a complete MCP server component. -This means that dynamic composition of published artifacts can happen truly on-the-fly relative to other virtualization options. In only a few seconds we can pull a set of OCI-hosted component binaries that implement individual MCP [server features](https://modelcontextprotocol.io/docs/learn/server-concepts#core-server-features), compose them into a sandboxed MCP server, and start it up. We can also distribute fully composed server components on OCI registries in the same way. +We can push, pull, and compose component binaries from OCI registries just like container images. But unlike containers or virtual machines, components encapsulate only their own functionality and can be mere kilobytes in size. Full servers can weigh in under 1MB. -Existing MCP SDKs are fragmented across language ecosystems and generally require long-lived compute or external session backends to implement advanced bidirectional features over the network, if they are supported at all. By contrast, the component model opens the door to safely composing MCP features as component binaries against standard [WASI](https://wasi.dev/) interfaces like [wasi:keyvalue](https://github.com/WebAssembly/wasi-keyvalue), which runtimes and platforms implement under the hood. This architecture allows for session-enabled features to be implemented portably without being tied to some particular implementation of the external session state bucket. +Wasm components can be served from the network edge, or run directly on edge devices themselves. -But first we need a way to author MCP server features as WebAssembly components, and we need the tooling to compose these components into functional, spec-compliant servers that run portably across WebAssembly runtimes. +Session-enabled features over Streamable HTTP are an [open problem](https://github.com/modelcontextprotocol/python-sdk/issues/880) across existing MCP SDKs, with various solutions fragmented across language ecosystems or otherwise remaining unsolved. These implementations generally require either long-lived compute or specific external session backends with corresponding glue. [WASI](https://wasi.dev/) interfaces like [wasi:keyvalue](https://github.com/WebAssembly/wasi-keyvalue), which Wasm runtimes and platforms can implement, allow for session-enabled features to be implemented portably without being tied to some particular implementation of an external session state bucket. -Simply using existing MCP server SDKs on WebAssembly is increasingly possible, but this approach treats runtime compatibility as an obstacle to overcome, with basic parity as the final goal. Instead we want to leverage the strengths of the component model itself as a paradigm to enable the patterns we just explored. +But first we need the tooling to make this possible. 
While existing MCP server SDKs are increasingly compatible with WebAssembly runtimes, they do not take advantage of the strengths of the component model. This is where [wasmcp](https://github.com/wasmcp/wasmcp) comes in. ## [Wasmcp](https://github.com/wasmcp/wasmcp) Wasmcp isn’t a runtime, and it’s not exactly an SDK. It is a collection of WebAssembly components and tooling that work together to function as a polyglot framework for authoring and composing MCP features as WebAssembly components. The result of this composition is an MCP server as a single component binary. Many MCP patterns that would otherwise require external networked gateways become possible in memory within a single binary via composition. With wasmcp we can implement a polyglot MCP server composed of Python tools that use [Pandas](https://pandas.pydata.org/), TypeScript tools that use [Zod](https://zod.dev/), and performance-critical tools or [Regorus](https://github.com/microsoft/regorus)-enabled authorization middleware in Rust. We can also interchangeably compose different transports and middlewares into the server binary. Because [Spin](https://github.com/spinframework/spin) implements [wasi:keyvalue](https://github.com/WebAssembly/wasi-keyvalue) at the runtime level, we get a pluggable backend for MCP session state. This means the full range of spec features, including server-sent requests and notifications, work out of the box with compatible MCP clients over the Streamable HTTP transport. ## Quickstart @@ -128,18 +137,18 @@ cargo install --git https://github.com/wasmcp/wasmcp ``` -Open a new terminal and then scaffold out a tool component with `wasmcp new`. Only basic dependencies and build tooling from Bytecode Alliance are included. TypesScript uses [jco](https://github.com/bytecodealliance/jco), Rust uses [wit-bindgen](https://github.com/bytecodealliance/wit-bindgen), and Python uses [componentize-py](https://github.com/bytecodealliance/componentize-py). +Open a new terminal and then scaffold out a tool component with `wasmcp new`. Only basic dependencies and build tooling from Bytecode Alliance are included. TypeScript uses [jco](https://github.com/bytecodealliance/jco), Rust uses [wit-bindgen](https://github.com/bytecodealliance/wit-bindgen), and Python uses [componentize-py](https://github.com/bytecodealliance/componentize-py). -wasmcp does not maintain any language-specific SDKs. The [WIT](https://component-model.bytecodealliance.org/design/wit.html) language describes the framework boundary. +Wasmcp does not include any language-specific SDKs. The [WIT](https://component-model.bytecodealliance.org/design/wit.html) language describes the framework boundary. -We'll target Rust for our first one.
+We'll target Rust for our first component. ```shell -wasmcp new my-first-tools --language rust +wasmcp new rust-tools --language rust ``` -If you open up `my-first-tools/src/lib.rs`, you’ll see some boilerplate similar to the code block below that you can fill in with your tool implementations. A single tool component can define multiple MCP tools. As we’ll see, multiple tool components can then be chained together and their tools aggregated. This pattern also applies to the other MCP primitives: [resources](https://github.com/wasmcp/wasmcp/blob/main/cli/templates/rust-resources/src/lib.rs) and [prompts](https://github.com/wasmcp/wasmcp/blob/main/cli/templates/rust-prompts/src/lib.rs) +If you open up `rust-tools/src/lib.rs`, you’ll see some boilerplate similar to the code block below that you can fill in with your tool implementations. A single tool component can define multiple MCP tools. This pattern also applies to the other MCP primitives, [resources](https://github.com/wasmcp/wasmcp/blob/main/cli/templates/rust-resources/src/lib.rs) and [prompts](https://github.com/wasmcp/wasmcp/blob/main/cli/templates/rust-prompts/src/lib.rs), as well as server-side utility features like [completion](https://modelcontextprotocol.io/specification/2025-03-26/server/utilities/completion). ```rust -/// my-first-tools/src/lib.rs +/// rust-tools/src/lib.rs impl Guest for Calculator { fn list_tools( _ctx: RequestCtx, @@ -201,9 +210,9 @@ This is accomplished with Bytecode Alliance’s [wac](https://github.com/bytecod Note that any of the framework-level components can also be interchanged with your own custom implementations, like a custom transport component. See `wasmcp compose server --help` for details. ```shell -cd my-first-tools/ +cd rust-tools/ make -wasmcp compose server target/wasm32-wasip2/release/my-first-tools.wasm -o server.wasm +wasmcp compose server target/wasm32-wasip2/release/rust_tools.wasm -o server.wasm ``` Now that we have a complete `server.wasm` component, we can run it directly with `spin up`. @@ -212,10 +221,12 @@ Now that we have a complete `server.wasm` component, we can run it directly with spin up --from server.wasm ``` +Just like _that_, we have a functional MCP server over the Streamable HTTP transport. + We can provide more detailed runtime configuration with a [spin.toml](https://spinframework.dev/v3/writing-apps) file. ```toml -# my-first-tools/spin.toml +# rust-tools/spin.toml spin_manifest_version = 2 [application] @@ -237,8 +248,6 @@ allowed_outbound_hosts = [] # Update for outbound HTTP spin up --from spin.toml ``` -Just like _that_, we have a functional MCP server over the Streamable HTTP transport. - These AI applications are just some of the many that can use this MCP server to extend their capabilities: * [Antigravity](https://antigravity.google/docs/mcp) * [ChatGPT (developer mode)](https://platform.openai.com/docs/guides/developer-mode) @@ -267,7 +276,22 @@ Our server also exports [`wasi:cli/run`](https://github.com/WebAssembly/wasi-cli wasmtime run server.wasm ``` -To deploy an MCP server as a Wasm component over the network, we can target a Spin-compatible cloud platform like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across [Akamai](https://www.akamai.com/why-akamai/global-infrastructure)'s distributed network edge with application-scoped key-value storage. 
Projects like [SpinKube](https://www.spinkube.dev/) and [wasmCloud](https://github.com/wasmCloud/wasmCloud) allow MCP server components to be deployed on self-hosted Kubernetes clusters. A hypothetical MCP-specific platform could potentially leverage this architecture to manage user-submitted MCP components. +To deploy an MCP server as a Wasm component over the network, we can target a Spin-compatible cloud platform like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across [Akamai](https://www.akamai.com/why-akamai/global-infrastructure)'s distributed network edge with application-scoped key-value storage. + ``` $ spin aka deploy Name of new app: mcp-server Creating new app mcp-server in account my-fwf-user Note: If you would instead like to deploy to an existing app, cancel this deploy and link this workspace to the app with `spin aka app link` OK to continue? yes Workspace linked to app mcp-server Waiting for app to be ready... ready App Routes: - mcp: https://65d837d6-0862-4d76-acc0-xxxxxxxxxxxx.fwf.app/mcp ``` Projects like [SpinKube](https://www.spinkube.dev/) and [wasmCloud](https://github.com/wasmCloud/wasmCloud) allow MCP server components to be deployed on self-hosted Kubernetes clusters. A hypothetical MCP platform could leverage this architecture to manage user-submitted MCP components. This story will expand as the ecosystems around both WebAssembly components and MCP continue to grow. @@ -299,11 +323,11 @@ wkg oci pull ghcr.io/mygithub/basic-utils:0.1.0 wasmtime serve -Scli mygithub:basic-utils@0.1.0.wasm ``` -We can publish any individual MCP feature component, or any sequence of composed components (which need not be servers) as a standalone artifact in the same way. This allows for composition and distribution of pre-built middleware stacks which are further composable into servers. See `wasmcp compose --help` for details. +We can publish any individual MCP feature component, or any sequence of composed components, in the same way. ## Advanced Composition and Wasmcp Architecture -The real power of the component model and wasmcp's composition architecture becomes apparent when adding another tool component to our server. We'll use Python this time. +The unique advantages of the component model and wasmcp's composition architecture become apparent when adding another tool component to our server. We'll use Python this time. ```shell wasmcp new python-tools --language python cd python-tools make ``` @@ -372,7 +396,7 @@ class StringsTools(exports.Tools): We compose our first and second tool components together by adding the paths to both tool component binaries in the `wasmcp compose server` arguments. Note that these local paths can be substituted for OCI registry artifacts. See `wasmcp compose server --help` for details. ```shell -wasmcp compose server ./my-first-tools/target/wasm32-wasip2/release/my-first-tools.wasm ./python-tools/python-tools.wasm -o polyglot.wasm +wasmcp compose server ./python-tools/python-tools.wasm ./rust-tools/target/wasm32-wasip2/release/rust_tools.wasm -o polyglot.wasm ``` Run `polyglot.wasm` with `spin up`. ```shell spin up -f polyglot.wasm ``` -Now our server has four tools: `add`, `subtract`, `reverse`, and `uppercase`! Two are implemented in Python, and two in Rust. +Now our single server binary exposes four tools: `add`, `subtract`, `reverse`, and `uppercase`! Two are implemented in Python, and two in Rust. ### How?
-Server features like tools, resources, prompts, and completions are implemented by individual WebAssembly components that export narrow [WIT](https://component-model.bytecodealliance.org/design/wit.html) interfaces mapped from the MCP spec's [schema types](https://modelcontextprotocol.io/specification/draft/schema). +Server features like tools, resources, prompts, and completions are implemented by individual WebAssembly components that export narrow [WIT](https://component-model.bytecodealliance.org/design/wit.html) interfaces mapped from the MCP spec's [schema](https://modelcontextprotocol.io/specification/draft/schema). -`wasmcp compose` plugs these feature components into framework middleware components and composes them together as a [chain of responsibility](https://en.wikipedia.org/wiki/Chain-of-responsibility_pattern) that implements an MCP server. +`wasmcp compose server` plugs these feature components into framework middleware components and composes them together as a [chain of responsibility](https://en.wikipedia.org/wiki/Chain-of-responsibility_pattern) that implements an MCP server. Any of the components in the chain, like the transport, can be swapped out during composition. @@ -413,14 +437,10 @@ Each component: This enables dynamic composition of component binaries into a single MCP server. Sequences of middleware components can be composed together to form reusable functionality that can be saved and plugged into multiple servers. -Advanced patterns featuring custom middleware components, session-enabled features, and SSE streaming are available at [examples](https://github.com/wasmcp/wasmcp/tree/main/examples). +Check out some [examples](https://github.com/wasmcp/wasmcp/tree/main/examples) to see advanced patterns featuring custom middleware components, session-enabled features, and SSE streaming. ## Related Projects -Microsoft's [Wassette](https://github.com/microsoft/wassette) is a security-oriented runtime that runs WebAssembly Components via MCP. It can dynamically load and execute components as individual tools on demand with deeply integrated access controls. Wassette itself is not a component. It is an MCP server than runs components. - -By contrast, Wasmcp is a toolchain for producing an MCP server as a component that exports the standard [WASI](https://wasi.dev/) interfaces for HTTP and CLI commands: [`wasi:http`](https://github.com/WebAssembly/wasi-http) and [`wasi:cli`](https://github.com/WebAssembly/wasi-cli). This component runs on any server runtime that supports WASI and the component model. - -## Futures +[Wassette](https://github.com/microsoft/wassette) is a security-oriented runtime that runs WebAssembly Components via MCP. It can dynamically load and execute components as individual tools on-demand with deeply integrated access controls. Wassette itself is not a component. It is an MCP server than runs components. -The ecosystems around both WebAssembly components and MCP continue to grow rapidly. As developers continue to adopt these technologies, we can expect to see more innovative projects and applications emerge across a variety of use cases. +Wasmcp is not an MCP server. It is a toolchain for producing an MCP server as a component that exports the standard [WASI](https://wasi.dev/) interfaces for HTTP and CLI commands: [`wasi:http/incoming-handler`](https://github.com/WebAssembly/wasi-http) and [`wasi:cli/run`](https://github.com/WebAssembly/wasi-cli). 
This server component runs on any runtime or platform that supports WASI and the component model. From 9c3ad482d064bae279ad729d4896ccb9e3bdaacc Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Sat, 22 Nov 2025 22:46:52 -0800 Subject: [PATCH 19/21] fix typos and code blocks, and some other clarification and refactoring Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 158 ++++++++++++++++++-------------- 1 file changed, 90 insertions(+), 68 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index addb48c4..72664d50 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -10,14 +10,28 @@ author = "Ian McDonald" --- -[Spin](https://github.com/spinframework/spin) is an open source framework for building and running fast, secure, and composable cloud microservices with WebAssembly. +[Wasmcp](https://github.com/wasmcp/wasmcp) is a [WebAssembly Component](https://component-model.bytecodealliance.org/) Development Kit for the [Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro). -[Wasmcp](https://github.com/wasmcp/wasmcp) is a [WebAssembly Component](https://component-model.bytecodealliance.org/) development kit for the [Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro). +It works with [Spin](https://github.com/spinframework/spin) to let you: -Together they form a polyglot toolchain for extending the capabilities of language models in a new way that offers unique benefits and potential solutions to current pains. +* Build composable MCP servers as WebAssembly components. +* Mix tools and features written in Rust, Python, TypeScript, etc. in a single server binary. +* Plug in shared components for authorization, sessions, logging, and more across multiple MCP servers. +* Run the same sandboxed MCP server binary locally, on [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), on Kubernetes clusters (e.g. via [SpinKube](https://www.spinkube.dev/)), or on any runtime that speaks WASI + components. +* Expose both stdio and Streamable HTTP transports via standard [WASI](https://wasi.dev/) exports. See the [quickstart](#quickstart) or read on for some context. +* [What are Tools?](#what-are-tools) +* [The Model Context Protocol](#the-model-context-protocol) +* [Challenges](#challenges) +* [WebAssembly Components](#webassembly-components) +* [Wasmcp](#wasmcp) +* [Quickstart](#quickstart) +* [Compatible Runtimes and Deployments](#compatible-runtimes-and-deployments) +* [Related Projects](#related-projects) +* [Why?](#why) + ## What are Tools? Large language models (LLMs) are trained on vast heaps of data that they use to generate responses to input queries. But that knowledge is static once training is over. They are unable to answer simple questions that require current data, like “What time is it?” or “What's the weather tomorrow in Atlanta?”. This highlights the gap between a simple model and an intelligent system that can actually *do* things and acquire new information, or context, dynamically. This is generally where the term *agent* starts to enter the conversation. @@ -34,17 +48,17 @@ That windowless box gets a desktop with a browser. Problem solved, right? Not quite. -## Communication Hurdles +### Communication Hurdles **Problem 1**: The internal representation of tools is not standard across models. In other words, their hands look different. How do we build a hammer that each of them can grip? 
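To make the mismatch concrete, here is the same trivial `get_time` tool declared for two different provider APIs. This is only a rough sketch: the model names are placeholders, the schemas are minimal, and both requests are trimmed to the fields relevant to tool declaration.

```shell
# OpenAI-style declaration: the tool is nested under "function" with a "parameters" JSON Schema.
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What time is it?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_time",
        "description": "Return the current time",
        "parameters": {"type": "object", "properties": {}}
      }
    }]
  }'

# Anthropic-style declaration: a flat tool object with an "input_schema" field instead.
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "What time is it?"}],
    "tools": [{
      "name": "get_time",
      "description": "Return the current time",
      "input_schema": {"type": "object", "properties": {}}
    }]
  }'
```

Same capability, two different shapes, and that is before the tool actually does anything.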
-We’d need to implement a new version of each tool for OpenAI’s GPT models, and another for the Claude family, another for Gemini, etc. So M models x N tools = T total tool implementations. +We’d need to write a new implementation of each tool for OpenAI’s GPT models, another for the Claude family, another for Gemini, etc. So the number of total tool implementations is `M x N`, where `M` is the number of models and `N` is the number of unique tools. [AI SDKs](https://ai-sdk.dev/docs/ai-sdk-core/tools-and-tool-calling) can alleviate this problem for a given programming language by implementing tool calling support for multiple models and exposing a common interface for tools over them. -**Problem 2**: With AI SDKs, tools become coupled to the specific SDK used to implement them. Because models themselves have no built-in way of discovering and connecting to new tools over the wire, the tools must run alongside the same code that calls inference to implement the agent's loop. We now have S SDKs x N tools = T total tool implementations. S < M, so this is an improvement, but we're still not at N = T. +**Problem 2**: Tool calling implemented by an AI SDK couples tool instances to an application's runtime. Tools must run alongside the same code that calls inference to implement the agent's loop. We cannot take one application's tools and call them from an external process. -We want to implement a given tool only once and make it discoverable and accessible dynamically for any AI application, potentially across the air, at scale. +We want to implement a given tool only once and make it discoverable and accessible dynamically for any AI application, potentially across the network, at scale. The [Fundamental Theorem of Software Engineering](https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering) states: @@ -54,7 +68,7 @@ We need a layer of indirection between models and their tools. ## The Model Context Protocol -In November 2024, Anthropic suggested an open-source standard for connecting AI applications to external systems: The [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro). +In November 2024, Anthropic suggested an open-source standard for connecting AI applications to external systems: The [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro). It aims to be the USB-C for tool calling, and more. MCP defines a set of context [primitives](https://modelcontextprotocol.io/specification/draft/server) that are implemented as server features. @@ -68,30 +82,31 @@ Beyond server features, MCP defines client-hosted features that servers can call These bidirectional features are possible because MCP is designed as an inherently bidirectional protocol based on [JSON-RPC](https://www.jsonrpc.org/specification). -MCP is architected as two [layers](https://modelcontextprotocol.io/docs/learn/architecture#layers): the Transport layer and the Data layer. Multiple interchangeable transports ([stdio](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#stdio), [Streamable HTTP](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#streamable-http), additional custom transports) can be used to serve the same underlying features. +MCP is architected as two [layers](https://modelcontextprotocol.io/docs/learn/architecture#layers): the Transport layer and the Data layer. Multiple interchangeable transports can be used to serve the same underlying features. 
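For example, the same `tools/list` request body can ride over either of the standard transports described next. The snippet below is only a sketch: the server command and URL are placeholders, and a real session would begin with an `initialize` handshake before any other request.

```shell
# One JSON-RPC data-layer message, two interchangeable transports (placeholders throughout).
REQ='{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'

# stdio transport: write the newline-delimited message to a locally launched server process.
echo "$REQ" | ./some-mcp-server

# Streamable HTTP transport: POST the identical message to the server's MCP endpoint.
curl -s https://example.com/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d "$REQ"
```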
The [stdio](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#stdio) transport allows a client to launch a local MCP server as a subprocess. The [Streamable HTTP](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#streamable-http) transport allows multiple clients to connect to a potentially remote MCP server, with support for sessions and authorization. Additional [custom transports](https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#custom-transports) can be implemented to support unique needs. Since its release, MCP has become the only tool calling protocol with near-consensus adoption and broad client support. It continues to attract interest from both individual developers and organizations. For example, AWS [joined](https://aws.amazon.com/blogs/opensource/open-protocols-for-agent-interoperability-part-1-inter-agent-communication-on-mcp/) the MCP steering committee earlier this year in May, and OpenAI's new [apps](https://openai.com/index/introducing-apps-in-chatgpt/) are MCP-based. -Many popular agents, including ChatGPT/Codex, Claude Code/Desktop, and Gemini CLI, already support MCP out-of-the-box. In addition, agent SDKs like the OpenAI Agents SDK, Google’s Agent Development Kit, and many others have adopted MCP either as their core tool calling mechanism or else as a first-class option. Inference APIs like the OpenAI [Responses API](https://platform.openai.com/docs/api-reference/responses/create#responses_create-tools) have also rolled out initial support for directly using remote MCP servers during inference itself. +Many popular agents, including ChatGPT/Codex, Claude Code/Desktop, and Gemini CLI, already support MCP out of the box. In addition, agent SDKs like the OpenAI Agents SDK, Google’s Agent Development Kit, and many others have adopted MCP either as their core tool calling mechanism or else as a first-class option. + +Inference APIs like the OpenAI [Responses API](https://platform.openai.com/docs/api-reference/responses/create#responses_create-tools) have also rolled out initial support for directly calling remote MCP servers during inference itself. -So why don't we see every org implementing MCP servers to integrate their applications and data with agents? +## Challenges -## The Current State of MCP +Local MCP servers are relatively simple to implement. Official SDKs and advanced third-party frameworks are available in nearly every programming language. But local MCP servers can be an attack vector for exploiting host resources, unless they run in a sandboxed environment. Current solutions to MCP sandboxing involve running the server in a container or virtual machine. -Local MCP servers are relatively simple to implement. Official SDKs and advanced third party frameworks are available in nearly every programming language. But distributing an MCP server, either as a local installation or as a service over the network, presents a number of new challenges. +Running an MCP server as a service over the network unlocks new distribution potential but also presents new challenges: -1. Local MCP servers can be an attack vector for exploiting host resources, unless they run in a sandboxed environment. Current solutions to MCP sandboxing involve running the server in a container or virtual machine. -2. Many of MCP's advanced bidirectional features and tool discovery mechanisms are locked behind a dependency on server-managed sessions. 
This means that we either need servers to run as long-lived processes, keeping their session state directly onboard, or else we need to manage the infrastructure for external session state plus the server code that interacts with it, which may incur additional network latency and complexity. -3. Scaling and and response latency matter. We may not initially think of the response time of remote tool calls as being important, given inference itself (especially with thinking enabled) is generally slow anyway. But consider that in answering a single query, an agent may need to make many consecutive tool calls to one or more remote MCP servers. The latency of even a few hundred milliseconds for each tool call can quickly snowball to seconds of lag. In realtime use cases like a voice or stock trading agent, even small response delays for tool calls can translate to the success or failure of the overall interaction or goal. -4. Authorization is not straightforward to implement. The spec-compliant auth flow requires an authorizer that supports [Dynamic Client Registration](https://datatracker.ietf.org/doc/html/rfc7591). Support for a simplified flow via [OAuth Client ID Metadata Documents](https://datatracker.ietf.org/doc/draft-ietf-oauth-client-id-metadata-document/) is confirmed for the November 2025 spec release. +1. Session-enabled features over Streamable HTTP are an [open problem](https://github.com/modelcontextprotocol/python-sdk/issues/880) across MCP SDKs, with various solutions fragmented across language ecosystems or otherwise remaining unsolved. These implementations generally require either long-lived compute or specific external session backends with corresponding glue. This complicates the horizontal scalability of servers that use session-dependent bidirectional features and tool discovery mechanisms. +2. Scaling and performance matter. We may not initially think of the response time of remote tool calls as being important, given inference itself (especially with thinking / reasoning features enabled) is generally slow anyway. But consider that in answering a single query, an agent may need to make many consecutive tool calls to one or more remote MCP servers. The latency of even a few hundred milliseconds for each tool call can quickly snowball to seconds of lag. In realtime use cases like a voice or stock trading agent, even small response delays for tool calls can translate to the success or failure of the overall interaction or goal. +3. Authorization is not straightforward to implement. The spec-compliant auth flow requires an authorizer that supports [Dynamic Client Registration](https://datatracker.ietf.org/doc/html/rfc7591). Support for a simplified flow via [OAuth Client ID Metadata Documents](https://datatracker.ietf.org/doc/draft-ietf-oauth-client-id-metadata-document/) is confirmed for the November 2025 spec release. Sharing an authorizer across multiple servers is a common goal usually achieved using an HTTP gateway. -There is a unique intersection of emerging technologies that could address some of this pain and more. +To make full-featured MCP servers that are safe, fast, and composable, we'd like an efficient sandbox plus a way build servers within that sandbox from reusable building blocks. This is exactly what the WebAssembly component model gives us. ## WebAssembly Components While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building applications more generally. 
Self-contained binaries can be compiled from various programming languages and run portably and efficiently across a range of host devices while remaining sandboxed from host resources. This sandboxing capability presents a lighter alternative to containers and virtual machines that works without having to bundle layers of the operating system and its dependencies. -The Wasm [component model](https://component-model.bytecodealliance.org/) builds on these strengths to implement a broad-reaching architecture for building interoperable WebAssembly libraries, applications, and environments. Wasm components within a single sandboxed process are further isolated from each other, with interop only allowed through explicit interfaces. A visual analogy for this idea might look like a bento box. +The Wasm [component model](https://component-model.bytecodealliance.org/) builds on these strengths to implement a broad-reaching architecture for building interoperable WebAssembly libraries, applications, and environments. Wasm components within a single sandboxed process are further isolated from each other and interop only through explicit interfaces. A visual analogy for this idea might look like a bento box (independent compartments sharing a box but not mixing contents). The component model shares architectural similarities with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles): @@ -102,27 +117,29 @@ The component model shares architectural similarities with MCP’s [server desig Imagine mapping individual MCP features to Wasm components, which can be composed together to form a complete MCP server component. -We can push, pull, and compose component binaries from OCI registries just like container images. But unlike containers or virtual machines, components encapsulate only their own functionality and can be mere kilobytes in size. Full servers can weigh in under 1MB. - -Wasm components can be served from the network edge, or run directly on edge devices themselves. - -Session-enabled features over Streamable HTTP are an [open problem](https://github.com/modelcontextprotocol/python-sdk/issues/880) across existing MCP SDKs, with various solutions fragmented across language ecosystems or otherwise remaining unsolved. These implementations generally require either long-lived compute or specific external session backends with corresponding glue. [WASI](https://wasi.dev/) interfaces like [wasi:keyvalue](https://github.com/WebAssembly/wasi-keyvalue), which Wasm runtimes and platforms can implement, allow for session-enabled features to be implemented portably without being tied to some particular implementation of an external session state bucket. - But first we need the tooling to make this possible. While existing MCP server SDKs are increasingly compatible with WebAssembly runtimes, they do not take advantage of the strengths of the component model. This is where [wasmcp](https://github.com/wasmcp/wasmcp) comes in. ## [Wasmcp](https://github.com/wasmcp/wasmcp) -Wasmcp isn’t a runtime, and it’s not exactly an SDK. It is a collection of WebAssembly components and tooling that work together to function as a polyglot framework for authoring and composing MCP features as WebAssembly components. The result of this composition is an MCP server as a single component binary. +Wasmcp isn’t a runtime, and it’s not exactly an SDK. It is a polyglot framework for developing and composing MCP servers from a collection of WebAssembly components. 
+ +The result of composition is a standalone MCP server as a single WebAssembly component binary that can be deployed to any runtime that supports WebAssembly components. -Many MCP composition patterns that would otherwise require external networked gateways become possible in memory within a single binary via composition. +Many composition patterns that would normally require external gateways can instead happen in-memory by composing component binaries inside a single sandboxed process. That means less glue, fewer moving parts, and fewer network hops. With wasmcp we can implement a polyglot MCP server composed of Python tools that use [Pandas](https://pandas.pydata.org/), TypeScript tools that use [Zod](https://zod.dev/), and performance-critical tools or [Regorus](https://github.com/microsoft/regorus)-enabled authorization middleware in Rust. -We can also interchangeably compose different transports and middlewares into the server binary. +We can also interchangeably compose different transports, authorizers, and middleware into the server binary. + +We can push and pull component binaries from OCI registries just like container images. But unlike containers or virtual machines, components encapsulate only their own functionality and can be mere kilobytes in size. Full servers can weigh in under 1MB. -Because [Spin](https://github.com/spinframework/spin) implements [wasi:keyvalue](https://github.com/WebAssembly/wasi-keyvalue) at the runtime level, we get a pluggable backend for MCP session state. This means the full range of spec features, including server-sent requests and notifications, work out of the box with compatible MCP clients over the Streamable HTTP transport. +These components can be served from the network edge or run directly on edge devices themselves. + +Because runtimes like Spin implement [wasi:keyvalue](https://github.com/WebAssembly/wasi-keyvalue), wasmcp can support session-enabled features without baking in any particular external session store. We get portable sessions across runtimes rather than a hard dependency on a specific external ‘state bucket X’ service. + +This enables the full range of bidirectional and session-enabled features over both stdio and Streamable HTTP. ## Quickstart @@ -135,7 +152,7 @@ Or build it from source. cargo install --git https://github.com/wasmcp/wasmcp ``` -Open a new terminal and then scaffold out a tool component with `wasmcp new`. Only basic dependencies and build tooling from Bytecode Alliance are included. TypesScript uses [jco](https://github.com/bytecodealliance/jco), Rust uses [wit-bindgen](https://github.com/bytecodealliance/wit-bindgen), and Python uses [componentize-py](https://github.com/bytecodealliance/componentize-py). +Open a new terminal and then scaffold out a tool component with `wasmcp new`. Only basic dependencies and build tooling from Bytecode Alliance are included. TypeScript uses [jco](https://github.com/bytecodealliance/jco), Rust uses [wit-bindgen](https://github.com/bytecodealliance/wit-bindgen), and Python uses [componentize-py](https://github.com/bytecodealliance/componentize-py). Wasmcp does not include any language-specific SDKs. The [WIT](https://component-model.bytecodealliance.org/design/wit.html) language describes the framework boundary. 
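At a glance, the rest of the quickstart boils down to four steps: scaffold, build, compose, serve. Here is a condensed preview of the commands covered in the sections below, following this post's Rust example (substitute your own component names and paths):

```shell
wasmcp new rust-tools --language rust      # scaffold a tool component
cd rust-tools && make                      # build it to target/wasm32-wasip2/release/rust_tools.wasm
wasmcp compose server \
  target/wasm32-wasip2/release/rust_tools.wasm -o server.wasm   # compose a complete MCP server
spin up --from server.wasm                 # serve it over Streamable HTTP
```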
@@ -248,7 +265,7 @@ allowed_outbound_hosts = [] # Update for outbound HTTP spin up --from spin.toml ``` -These AI applications are just some of the many that can use this MCP server to extend their capabilities: +These AI applications are some of the many that can use our new MCP server to extend their capabilities: * [Antigravity](https://antigravity.google/docs/mcp) * [ChatGPT (developer mode)](https://platform.openai.com/docs/guides/developer-mode) * [Claude Code](https://code.claude.com/docs/en/mcp) @@ -260,31 +277,31 @@ These AI applications are just some of the many that can use this MCP server to * [Visual Studio Code](https://code.visualstudio.com/docs/copilot/customization/mcp-servers) * [Zed](https://zed.dev/docs/ai/mcp) -## Runtime Portability and Deployment Targets +## Compatible Runtimes and Deployments The MCP server component we just created exports the standard [`wasi:http/incoming-handler`](https://github.com/WebAssembly/wasi-http) interface. This means any WebAssembly runtime that supports `wasi:http` can serve the component to MCP clients over the Streamable HTTP transport. -For example, we can use [`wasmtime serve`](https://github.com/bytecodealliance/wasmtime): +For example, we can use [`wasmtime serve`](https://github.com/bytecodealliance/wasmtime) which calls `wasi:http/incoming-handler`: ```shell -wasmtime serve -Scli server.wasm +wasmtime serve -Scli -Shttp -Skeyvalue server.wasm ``` -Our server also exports [`wasi:cli/run`](https://github.com/WebAssembly/wasi-cli), which lets it support the stdio MCP transport. +Our server also exports [`wasi:cli/run`](https://github.com/WebAssembly/wasi-cli), which lets it operate over the stdio MCP transport using `wasmtime run`: ```shell wasmtime run server.wasm ``` -To deploy an MCP server as a Wasm component over the network, we can target a Spin-compatible cloud platform like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component efficiently across [Akamai](https://www.akamai.com/why-akamai/global-infrastructure)'s distributed network edge with application-scoped key-value storage. +To deploy an MCP server as a Wasm component over the network, we can target a Spin-compatible cloud platform like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component horizontally across [Akamai](https://www.akamai.com/why-akamai/global-infrastructure)'s distributed network edge with application-scoped key-value storage. ``` $ spin aka deploy -Name of new app: mcp-server -Creating new app mcp-server in account my-fwf-user +Name of new app: rust-tools +Creating new app rust-tools in account my-fwf-user Note: If you would instead like to deploy to an existing app, cancel this deploy and link this workspace to the app with `spin aka app link` OK to continue? yes -Workspace linked to app mcp-server +Workspace linked to app rust-tools Waiting for app to be ready... ready App Routes: @@ -301,43 +318,43 @@ With a `spin.toml` file like the one above, we can use the `spin registry` comma ```shell echo $GHCR_PAT | spin registry login --username mygithub --password-stdin ghcr.io -spin registry push ghcr.io/mygithub/basic-utils:0.1.0 +spin registry push ghcr.io/mygithub/rust-tools:0.1.0 ``` `spin up` can automatically resolve a component from the registry. 
```shell -spin up --from ghcr.io/mygithub/basic-utils:0.1.0 +spin up --from ghcr.io/mygithub/rust-tools:0.1.0 ``` We can also use [wkg](https://github.com/bytecodealliance/wasm-pkg-tools) directly to publish our server. ```shell -wkg oci push ghcr.io/mygithub/basic-utils:0.1.0 polyglot.wasm +wkg oci push ghcr.io/mygithub/rust-tools:0.1.0 server.wasm ``` Anyone with read access to this artifact can then pull the component using `wkg` to run it with their runtime of choice. ```shell -wkg oci pull ghcr.io/mygithub/basic-utils:0.1.0 -wasmtime serve -Scli mygithub:basic-utils@0.1.0.wasm +wkg oci pull ghcr.io/mygithub/rust-tools:0.1.0 +wasmtime serve -Scli mygithub:rust-tools@0.1.0.wasm ``` We can publish any individual MCP feature component, or any sequence of composed components, in the same way. -## Advanced Composition and Wasmcp Architecture +## Tool Composition -The unique advantages of the component model and wasmcp's composition architecture become apparent when adding another tool component to our server. We'll use Python this time. +The unique advantages of the component model and wasmcp's component architecture become apparent when adding another tool component to our server. We'll use Python this time. ```shell -wasmcp new python-tools –-language python +wasmcp new python-tools --language python cd python-tools make ``` ```python # python-tools/app.py -class StringsTools(exports.Tools): +class StringUtils(exports.Tools): def list_tools( self, ctx: server_handler.RequestCtx, @@ -377,23 +394,26 @@ class StringsTools(exports.Tools): ctx: server_handler.RequestCtx, request: mcp.CallToolRequest, ) -> Optional[mcp.CallToolResult]: - if not request.arguments: - return error_result("Missing tool arguments") - - try: - args = json.loads(request.arguments) - except json.JSONDecodeError as e: - return error_result(f"Invalid JSON arguments: {e}") + input_text = json.loads(request.arguments)["text"] + + def make_result(text: str) -> mcp.CallToolResult: + return mcp.CallToolResult( + content=[mcp.ContentBlock_Text(mcp.TextContent( + text=mcp.TextData_Text(text), + options=None, + ))], + is_error=None, + meta=None, + structured_content=None, + ) if request.name == "reverse": - return reverse_string(args.get("text")) - elif request.name == "uppercase": - return uppercase_string(args.get("text")) - else: - return None # We don't handle this tool + return make_result(input_text[::-1]) + if request.name == "uppercase": + return make_result(input_text.upper()) ``` -We compose our first and second tool components together by adding the paths to both tool component binaries in the `wasmcp compose server` arguments. Note that these local paths can be substituted for OCI registry artifacts. See `wasmcp compose server -–help` for details. +We compose our Python tool component together with our Rust tool component by adding the paths to both component binaries in the `wasmcp compose server` arguments. Note that these local paths can be substituted for OCI registry references. See `wasmcp compose server --help` for details. ```shell wasmcp compose server ./python-tools/python-tools.wasm ./rust-tools/target/wasm32-wasip2/release/rust_tools.wasm -o polyglot.wasm @@ -401,10 +421,10 @@ wasmcp compose server ./python-tools/python-tools.wasm ./rust-tools/target/wasm3 Run `polyglot.wasm` with `spin up`. ```shell -spin up -f polyglot.wasm +spin up --from polyglot.wasm ``` -Now our single server binary exposes four tools: `add`, `subtract`, `reverse`, and `uppercase`! Two are implemented in Python, and two in Rust. 
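To sanity-check the composition, we can invoke one of the Python-implemented tools over Streamable HTTP. The request below is a hedged example: it assumes `spin up` is listening on its default `127.0.0.1:3000` and that the manifest routes `/mcp`, as the deploy output above suggests, and a strictly spec-compliant client would send an `initialize` request first.

```shell
# Call the Python-implemented `reverse` tool on the composed server (endpoint assumed, see above).
curl -s http://127.0.0.1:3000/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"reverse","arguments":{"text":"wasmcp"}}}'
```

The same request shape reaches the Rust tools as well; only `name` and `arguments` change.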
+Now our single MCP server binary exposes four tools: `add`, `subtract`, `reverse`, and `uppercase`, implemented in two different languages and composed into a single component. ### How? @@ -412,8 +432,6 @@ Server features like tools, resources, prompts, and completions are implemented `wasmcp compose server` plugs these feature components into framework middleware components and composes them together as a [chain of responsibility](https://en.wikipedia.org/wiki/Chain-of-responsibility_pattern) that implements an MCP server. -Any of the components in the chain, like the transport, can be swapped out during composition. - ``` Transport ↓ @@ -435,12 +453,16 @@ Each component: - Delegates others downstream - Merges results (e.g., combining tool lists) -This enables dynamic composition of component binaries into a single MCP server. Sequences of middleware components can be composed together to form reusable functionality that can be saved and plugged into multiple servers. +Any of the components in the chain, like the transport, can be swapped out during composition. Sequences of middleware components can be composed together to form reusable functionality that can be saved and plugged into multiple servers. Check out some [examples](https://github.com/wasmcp/wasmcp/tree/main/examples) to see advanced patterns featuring custom middleware components, session-enabled features, and SSE streaming. ## Related Projects -[Wassette](https://github.com/microsoft/wassette) is a security-oriented runtime that runs WebAssembly Components via MCP. It can dynamically load and execute components as individual tools on-demand with deeply integrated access controls. Wassette itself is not a component. It is an MCP server than runs components. +[Wassette](https://github.com/microsoft/wassette) is a security-oriented runtime that runs WebAssembly Components via MCP. It can dynamically load and execute components as individual tools on-demand with deeply integrated access controls. Wassette itself is not a component. It is an MCP server that runs components. + +Wasmcp is not an MCP server. It is a toolchain for producing an MCP server as a component that exports the standard [WASI](https://wasi.dev/) interfaces for HTTP and CLI commands. This server component runs on any runtime or platform that supports WASI and the component model. + +## Why? -Wasmcp is not an MCP server. It is a toolchain for producing an MCP server as a component that exports the standard [WASI](https://wasi.dev/) interfaces for HTTP and CLI commands: [`wasi:http/incoming-handler`](https://github.com/WebAssembly/wasi-http) and [`wasi:cli/run`](https://github.com/WebAssembly/wasi-cli). This server component runs on any runtime or platform that supports WASI and the component model. +We built wasmcp because we want to run agent-facing applications at scale in a future where MCP is the foundation for distributed intelligent systems. That means enabling powerful new MCP servers that are first-class applications rather than just proxies for REST APIs. Wasmcp is a step toward a polyglot AI application architecture that works consistently across local, cloud, and self-hosted platforms. 
From 867de70491631dfb30a4b543942310822c22db59 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Sun, 23 Nov 2025 23:51:16 -0800 Subject: [PATCH 20/21] fix some command blocks Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index 72664d50..28426a6e 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -100,13 +100,13 @@ Running an MCP server as a service over the network unlocks new distribution pot 2. Scaling and performance matter. We may not initially think of the response time of remote tool calls as being important, given inference itself (especially with thinking / reasoning features enabled) is generally slow anyway. But consider that in answering a single query, an agent may need to make many consecutive tool calls to one or more remote MCP servers. The latency of even a few hundred milliseconds for each tool call can quickly snowball to seconds of lag. In realtime use cases like a voice or stock trading agent, even small response delays for tool calls can translate to the success or failure of the overall interaction or goal. 3. Authorization is not straightforward to implement. The spec-compliant auth flow requires an authorizer that supports [Dynamic Client Registration](https://datatracker.ietf.org/doc/html/rfc7591). Support for a simplified flow via [OAuth Client ID Metadata Documents](https://datatracker.ietf.org/doc/draft-ietf-oauth-client-id-metadata-document/) is confirmed for the November 2025 spec release. Sharing an authorizer across multiple servers is a common goal usually achieved using an HTTP gateway. -To make full-featured MCP servers that are safe, fast, and composable, we'd like an efficient sandbox plus a way build servers within that sandbox from reusable building blocks. This is exactly what the WebAssembly component model gives us. +Next, we introduce a new stack that addresses some of these problems and paves a path to full-featured MCP servers that are sandboxed, fast, composable, and deployable on scalable infrastructure. ## WebAssembly Components While WebAssembly (Wasm) is commonly thought of as a browser technology, it has evolved into a versatile platform for building applications more generally. Self-contained binaries can be compiled from various programming languages and run portably and efficiently across a range of host devices while remaining sandboxed from host resources. This sandboxing capability presents a lighter alternative to containers and virtual machines that works without having to bundle layers of the operating system and its dependencies. -The Wasm [component model](https://component-model.bytecodealliance.org/) builds on these strengths to implement a broad-reaching architecture for building interoperable WebAssembly libraries, applications, and environments. Wasm components within a single sandboxed process are further isolated from each other and interop only through explicit interfaces. A visual analogy for this idea might look like a bento box (independent compartments sharing a box but not mixing contents). +The Wasm [component model](https://component-model.bytecodealliance.org/) builds on these strengths to implement a broad-reaching architecture for building interoperable WebAssembly libraries, applications, and environments. 
Wasm components within a single sandboxed process are further isolated from each other and interop only through explicit interfaces. A visual analogy for this idea might look like a bento box (independent compartments sharing a box but not contents unless you decide to mix them). The component model shares architectural similarities with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles): @@ -284,13 +284,14 @@ The MCP server component we just created exports the standard [`wasi:http/incomi For example, we can use [`wasmtime serve`](https://github.com/bytecodealliance/wasmtime) which calls `wasi:http/incoming-handler`: ```shell -wasmtime serve -Scli -Shttp -Skeyvalue server.wasm +wasmcp compose server target/wasm32-wasip2/release/rust_tools.wasm -o wasmtime-server.wasm --runtime wasmtime +wasmtime serve -Scli -Shttp -Skeyvalue wasmtime-server.wasm ``` Our server also exports [`wasi:cli/run`](https://github.com/WebAssembly/wasi-cli), which lets it operate over the stdio MCP transport using `wasmtime run`: ```shell -wasmtime run server.wasm +wasmtime run -Scli -Skeyvalue -Shttp wasmtime-server.wasm ``` To deploy an MCP server as a Wasm component over the network, we can target a Spin-compatible cloud platform like [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), which will scale a server component horizontally across [Akamai](https://www.akamai.com/why-akamai/global-infrastructure)'s distributed network edge with application-scoped key-value storage. @@ -416,6 +417,7 @@ class StringUtils(exports.Tools): We compose our Python tool component together with our Rust tool component by adding the paths to both component binaries in the `wasmcp compose server` arguments. Note that these local paths can be substituted for OCI registry references. See `wasmcp compose server --help` for details. ```shell +cd .. wasmcp compose server ./python-tools/python-tools.wasm ./rust-tools/target/wasm32-wasip2/release/rust_tools.wasm -o polyglot.wasm ``` From 9ba302c7d9d173b4309dad819dcfd21c6a6e4e23 Mon Sep 17 00:00:00 2001 From: bowlofarugula Date: Mon, 24 Nov 2025 17:15:27 -0800 Subject: [PATCH 21/21] Apply suggestions from code review Signed-off-by: bowlofarugula --- content/blog/mcp-with-wasmcp.md | 24 +++++++++++------------- 1 file changed, 11 insertions(+), 13 deletions(-) diff --git a/content/blog/mcp-with-wasmcp.md b/content/blog/mcp-with-wasmcp.md index 28426a6e..61c15457 100644 --- a/content/blog/mcp-with-wasmcp.md +++ b/content/blog/mcp-with-wasmcp.md @@ -1,5 +1,5 @@ title = "Build MCP Servers with Wasmcp and Spin" -date = "2025-11-20T10:15:47Z" +date = "2025-11-25T10:15:47Z" template = "blog_post" description = "Introducing a new approach to building MCP servers on the WebAssembly component model." tags = ["agents", "ai", "llm", "mcp", "model"] @@ -10,14 +10,14 @@ author = "Ian McDonald" --- -[Wasmcp](https://github.com/wasmcp/wasmcp) is a [WebAssembly Component](https://component-model.bytecodealliance.org/) Development Kit for the [Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro). +[Wasmcp](https://github.com/wasmcp/wasmcp) is a [WebAssembly component](https://component-model.bytecodealliance.org/) development kit for the [Model Context Protocol](https://modelcontextprotocol.io/docs/getting-started/intro). It works with [Spin](https://github.com/spinframework/spin) to let you: * Build composable MCP servers as WebAssembly components. 
* Mix tools and features written in Rust, Python, TypeScript, etc. in a single server binary. * Plug in shared components for authorization, sessions, logging, and more across multiple MCP servers. -* Run the same sandboxed MCP server binary locally, on [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), on Kubernetes clusters (e.g. via [SpinKube](https://www.spinkube.dev/)), or on any runtime that speaks WASI + components. +* Run the same sandboxed MCP server binary locally, on the network edge via [Fermyon Wasm Functions](https://www.fermyon.com/wasm-functions), on Kubernetes clusters (e.g. via [SpinKube](https://www.spinkube.dev/)), or on any runtime that speaks WASI + components. * Expose both stdio and Streamable HTTP transports via standard [WASI](https://wasi.dev/) exports. See the [quickstart](#quickstart) or read on for some context. @@ -38,7 +38,7 @@ Large language models (LLMs) are trained on vast heaps of data that they use to All LLMs depend on calling external [functions](https://gorilla.cs.berkeley.edu/leaderboard.html), also called tools, to interact with the outside world beyond the prompt and to perform deterministic actions. Just like you might use a calculator to accurately crunch numbers, or a web browser to explore the internet, an LLM might use its own calculator and HTTP fetch tools in the same way. Even basic capabilities like reading a file from disk are implemented via tools. -Without tools a language model is like someone sitting in an empty, windowless box with only their memories from an array of random encyclopedias, books, and other training data to pull from. Our interactions with them are something along the lines of: A human slips a question written on a piece of paper under the door for the model to read, and the model slips back a response using only their prior knowledge and imagination. +Without tools, a language model is like someone sitting in an empty, windowless box with only their memories from an array of random encyclopedias, books, and other training data to pull from. Our interactions with them are something along the lines of: A human slips a question written on a piece of paper under the door for the model to read, and the model slips back a response using only their prior knowledge and imagination. That's a long way from the promise of autonomous systems that understand and act on the world in realtime, let alone transform it. @@ -60,9 +60,7 @@ We’d need to write a new implementation of each tool for OpenAI’s GPT models We want to implement a given tool only once and make it discoverable and accessible dynamically for any AI application, potentially across the network, at scale. -The [Fundamental Theorem of Software Engineering](https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering) states: - -> We can solve any problem by introducing an extra level of [indirection](https://en.wikipedia.org/wiki/Indirection). +The [Fundamental Theorem of Software Engineering](https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering) states: "We can solve any problem by introducing an extra level of [indirection](https://en.wikipedia.org/wiki/Indirection)." We need a layer of indirection between models and their tools. @@ -70,7 +68,7 @@ We need a layer of indirection between models and their tools. In November 2024, Anthropic suggested an open-source standard for connecting AI applications to external systems: The [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro). 
It aims to be the USB-C for tool calling, and more. -MCP defines a set of context [primitives](https://modelcontextprotocol.io/specification/draft/server) that are implemented as server features. +MCP defines a set of context [primitives](https://modelcontextprotocol.io/specification/draft/server) that are implemented as server features. The following table from [MCP's documentation](https://modelcontextprotocol.io/specification/draft/server) summarizes the scope of each primitive. | Primitive | Control | Description | Example | | --------- | ---------------------- | -------------------------------------------------- | ------------------------------- | @@ -108,12 +106,12 @@ While WebAssembly (Wasm) is commonly thought of as a browser technology, it has The Wasm [component model](https://component-model.bytecodealliance.org/) builds on these strengths to implement a broad-reaching architecture for building interoperable WebAssembly libraries, applications, and environments. Wasm components within a single sandboxed process are further isolated from each other and interop only through explicit interfaces. A visual analogy for this idea might look like a bento box (independent compartments sharing a box but not contents unless you decide to mix them). -The component model shares architectural similarities with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles): +The component model shares architectural similarities with MCP’s [server design principles](https://modelcontextprotocol.io/specification/2025-06-18/architecture#design-principles), which are quoted below. -> 1. Servers should be extremely easy to build -> 2. Servers should be highly composable -> 3. Servers should not be able to read the whole conversation, nor “see into” other servers -> 4. Features can be added to servers and clients progressively +1. Servers should be extremely easy to build +2. Servers should be highly composable +3. Servers should not be able to read the whole conversation, nor “see into” other servers +4. Features can be added to servers and clients progressively Imagine mapping individual MCP features to Wasm components, which can be composed together to form a complete MCP server component.