+++
title = "Spec‑Driven SDKs in the Age of Generative AI"
description = "Walk the middle path: use specs to bound AI, and AI to accelerate specs. An ACP‑based SDK with routing, golden tests, and auditability."
date = 2025-10-15
slug = "spec-driven-sdks"

[taxonomies]
tags = ["AI", "SDK", "Spec", "ACP", "Agent", "MCP"]
categories = ["engineering"]

[extra]
lang = "en"
+++

> AI writes fast, and tends to bloat.
> Specs are strict, and can ossify.
>
> I choose the middle path:
> let AI run inside boundaries, and let humans own outcomes.

## Why Not Pure AI, Nor Pure Spec

AI‑assisted coding is addictive, but as projects grow, context drifts and complexity balloons.
You get something that runs, but barely holds together.

On the other hand, pure spec‑driven frameworks, such as GitHub’s [Spec Kit](https://github.com/github/spec-kit), are often heavy for iterative work.
Precise, but hard to live with.

So I took the middle ground: spec‑first boundaries and AI‑assisted speed, built on the Agent Client Protocol (ACP).

## What Makes ACP a Good Base

ACP, started by Zed, defines how editors and agents talk.
It uses JSON‑RPC 2.0 and is described in JSON Schema, which gives it clear edges, machine‑readable types, and auditable behaviors.
That balance makes it ideal for SDKs that must live across languages and over time.

I applied this while maintaining the [Python SDK](https://github.com/psiace/agent-client-protocol-python) for the [Agent Client Protocol](https://agentclientprotocol.com).

## How I Built the Python SDK

### Stage 0: Minimal, but Runnable

I relied heavily on a coding agent (Codex) with human‑set boundaries and review. Three constraints:

1. Agent‑generated Pydantic models from the datamodel.
2. Minimal I/O (JSON‑RPC over stdio), with agent assistance.
3. Agent‑ported examples and basic tests, human‑verified for regression.

Small on purpose: ship first, surface problems early.

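To make "minimal I/O" concrete, here is a rough sketch of a JSON‑RPC 2.0 loop over text streams. It is not the SDK's transport: it assumes newline‑delimited messages and synchronous handlers, and `serve_once` and the `echo` method are hypothetical names used only for illustration. In‑memory streams stand in for stdin and stdout so the sketch can run anywhere.

```python
import io
import json
from typing import Any, Callable, TextIO

Handler = Callable[[dict[str, Any]], Any]


def serve_once(reader: TextIO, writer: TextIO, handlers: dict[str, Handler]) -> bool:
    """Handle one newline-delimited JSON-RPC 2.0 message; False at end of input."""
    line = reader.readline()
    if not line:
        return False
    message = json.loads(line)
    handler = handlers.get(message.get("method"))
    if "id" not in message:
        # Notification: run the handler if one exists, never reply.
        if handler is not None:
            handler(message.get("params", {}))
        return True
    response: dict[str, Any] = {"jsonrpc": "2.0", "id": message["id"]}
    if handler is None:
        # -32601 is the JSON-RPC 2.0 "Method not found" code.
        response["error"] = {"code": -32601, "message": "Method not found"}
    else:
        response["result"] = handler(message.get("params", {}))
    writer.write(json.dumps(response) + "\n")
    writer.flush()
    return True


# In-memory streams stand in for sys.stdin / sys.stdout.
requests = io.StringIO(
    '{"jsonrpc": "2.0", "id": 1, "method": "echo", "params": {"text": "hi"}}\n'
)
replies = io.StringIO()
while serve_once(requests, replies, {"echo": lambda params: params}):
    pass
print(replies.getvalue(), end="")
```
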
### Stage 1: Learning from Feedback

Early users found the naming awkward and the ergonomics rough.
I had the agent scan the docs to produce a rename map (more Pythonic names), then borrowed ideas from other SDKs:

- TypeScript had helpers.
- Go had golden tests.
- Rust had a clean modular structure.
- Protocol SDKs like MCP offered architecture hints.

With agent assistance I restructured code and tests; I set the standards and did the reviews.

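The post does not show the actual rename map, so the following is only a guess at its shape: a mechanical camelCase to snake_case pass that a reviewer could then correct by hand. The helper names `to_snake_case` and `build_rename_map` are hypothetical, and the regex splits acronyms letter by letter (for example `ID`), so a real map would need manual overrides.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase identifier to snake_case."""
    # Insert "_" before every uppercase letter except at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def build_rename_map(names: list[str]) -> dict[str, str]:
    """Map each name that would change to its snake_case form."""
    return {n: to_snake_case(n) for n in names if to_snake_case(n) != n}


rename_map = build_rename_map(["writeTextFile", "readTextFile", "initialize"])
print(rename_map)
```
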
### Stage 2: Routing over Conditionals

I didn’t want endless if‑trees like `if method == "..."`.
So I wrote a compact router that makes the method→handler mapping first‑class:

```python
# RouterBuilder, CLIENT_METHODS, and the request models come from the SDK;
# `client` is the object whose methods handle incoming requests.
builder = RouterBuilder()

# Each call maps a method name and its params model to an attribute on `client`.
builder.request_attr(CLIENT_METHODS["fs_write_text_file"], WriteTextFileRequest, client, "writeTextFile")
builder.request_attr(CLIENT_METHODS["fs_read_text_file"], ReadTextFileRequest, client, "readTextFile")
```

Registered handlers are pluggable, testable, and auditable.
By v0.4.9 the SDK wasn’t big, but it was stable and teachable.

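For readers who want to see the idea without the SDK, here is a small stand‑in for a method→handler table. It is not the SDK's `RouterBuilder`: `MiniRouter`, `WriteTextFile`, `FakeClient`, and the method string are hypothetical, a dataclass replaces the Pydantic model, and handlers are synchronous to keep the sketch short. Because the params model rejects unknown fields, the router also fails closed on payloads that stray from the declared shape.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class WriteTextFile:  # hypothetical params model, not the SDK's class
    path: str
    content: str


class MiniRouter:
    """Hypothetical method -> handler table; not the SDK's RouterBuilder."""

    def __init__(self) -> None:
        self._routes: dict[str, tuple[type, Callable[[Any], Any]]] = {}

    def request_attr(self, method: str, model: type, target: object, attr: str) -> None:
        # Resolve the bound handler once, at registration time.
        self._routes[method] = (model, getattr(target, attr))

    def dispatch(self, method: str, params: dict[str, Any]) -> Any:
        if method not in self._routes:
            raise KeyError(f"no handler registered for {method!r}")
        model, handler = self._routes[method]
        # Building the model raises TypeError on missing or unknown fields.
        return handler(model(**params))


class FakeClient:
    """Test double that records writes in memory."""

    def __init__(self) -> None:
        self.files: dict[str, str] = {}

    def write_text_file(self, req: WriteTextFile) -> None:
        self.files[req.path] = req.content


client = FakeClient()
router = MiniRouter()
router.request_attr("fs/write_text_file", WriteTextFile, client, "write_text_file")
router.dispatch("fs/write_text_file", {"path": "a.txt", "content": "hello"})
print(client.files)
```
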
### Context Curation: Feed the Agent Only What Matters

I treat prompt context as an engineering budget:

- Context manifest: list only schemas, public interfaces, and minimal examples; exclude generated/vendor code.
- Diff‑aware prompting: provide changed hunks plus a small window (5–15 lines), not whole files.
- Golden snippets index: let the agent retrieve tagged canonical examples instead of pasting large blobs.
- Guardrails first: “Do not invent fields beyond schema; fail closed; ask for missing inputs.”

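Diff‑aware prompting from the list above can be approximated with the standard library. This is one possible sketch, not the tooling used for the SDK: `changed_hunks` is a hypothetical helper that wraps `difflib.unified_diff`, whose `n` argument sets how many context lines surround each change.

```python
import difflib


def changed_hunks(old: str, new: str, window: int = 5) -> str:
    """Return a unified diff of old vs new with `window` lines of context."""
    diff = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="before",
        tofile="after",
        n=window,
        lineterm="",
    )
    return "\n".join(diff)


# A 40-line file with a single changed line in the middle.
old_text = "\n".join(f"line {i}" for i in range(1, 41))
new_text = old_text.replace("line 20", "line twenty")

# Only the changed line plus 5 lines on each side reach the prompt.
print(changed_hunks(old_text, new_text, window=5))
```
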
## Principles for Spec‑Driven, AI‑Assisted SDKs

| Principle | Description |
| :-- | :-- |
| Separate generated vs human‑governed code | Models are generated; business logic is human‑owned and reviewed. Never mix the two. |
| Ship the smallest working unit first | Real feedback drives ergonomics. |
| Golden tests for major RPCs | Capture and replay requests/responses as the regression and audit baseline. |
| Curate and budget context | Feed only what’s necessary: small, precise, reproducible. |

Specs give AI a sandbox, and AI makes specs come alive.

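The golden‑test principle can be sketched as a replay loop. The file layout (a JSON list of named request/response pairs), the `replay_golden` helper, and the `echo` handler are all assumptions made for illustration; the SDK's real golden tests may be organized differently.

```python
import json
from typing import Any, Callable

# Hypothetical golden data: recorded request/response pairs.
GOLDEN = """
[
  {
    "name": "echo",
    "request": {"jsonrpc": "2.0", "id": 1, "method": "echo", "params": {"text": "hi"}},
    "response": {"jsonrpc": "2.0", "id": 1, "result": {"text": "hi"}}
  }
]
"""

Handle = Callable[[dict[str, Any]], dict[str, Any]]


def replay_golden(golden_text: str, handle: Handle) -> list[str]:
    """Replay recorded requests and describe every response that differs."""
    failures = []
    for case in json.loads(golden_text):
        actual = handle(case["request"])
        if actual != case["response"]:
            failures.append(
                f"{case['name']}: expected {case['response']!r}, got {actual!r}"
            )
    return failures


def handle(request: dict[str, Any]) -> dict[str, Any]:
    """Toy handler under test: echoes params back as the result."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": request["params"]}


failures = replay_golden(GOLDEN, handle)
print("golden mismatches:", failures)
```
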
## Accountability: Why I Trust AI‑Written Code

AI can write, but I set the boundaries.
Typed models, reproducible tests, and replayable traces let me verify every step.
I trust not the model, but the engineering discipline around it.

## A Note in My PRs

I first saw this note in @Xuanwo’s PRs. I haven’t used it in mine yet, but I’m willing to adopt it, because I’m accountable for the outcome. In this SDK work I used AI heavily; I set the boundaries and own the result:

> This PR was primarily authored with GPT‑5‑Codex and hand‑reviewed by me.
> I’m responsible for every line that lands here.

This isn’t about showing off AI; it’s about owning the outcome.

## Closing: Fuel and Rails

AI is the fuel.
Spec is the rail.
Fuel gives speed; rails keep you alive.

I like this balance: AI writes, I define boundaries.
Every line explainable, every bug reproducible.
That’s how to build responsibly in the AI age.

**PsiACE** ([GitHub](https://github.com/PsiACE) · [Apache OpenDAL PMC Member](https://opendal.apache.org/))