
Conversation


@caspervonb caspervonb commented Sep 22, 2025

This adds a standalone nats.client package.

Main differences in nats.client compared to nats.aio (see the usage sketch below):

  • Asynchronous-first design
  • Substantially faster than nats.aio
  • Proper headers, with support for multiple values per key
  • JetStream-unaware by design
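A minimal usage sketch: connect, subscribe with callback=, publish, and close are taken from examples later in this thread; the headers argument shape is an assumption and may differ from the actual API.

import asyncio
from nats.client import connect

async def main():
    nc = await connect("nats://localhost:4222")

    # Callbacks are synchronous and dispatched from the client's read loop.
    def on_msg(msg):
        print("received:", msg)

    await nc.subscribe("greetings", callback=on_msg)

    # Headers with multiple values per key (assumed argument shape).
    await nc.publish("greetings", b"hello", headers={"X-Tag": ["a", "b"]})

    await asyncio.sleep(1)
    await nc.close()

if __name__ == "__main__":
    asyncio.run(main())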

@caspervonb caspervonb force-pushed the add-nats-client-package branch from 381af5b to 781a9bf Compare September 22, 2025 12:24
@caspervonb caspervonb requested a review from Copilot September 22, 2025 12:28
Copilot AI left a comment

Pull Request Overview

This PR adds a new NATS client package with significant performance improvements over the existing nats.aio client. The implementation provides core NATS messaging functionality including publish/subscribe, request/reply, queue groups, and message headers through an asyncio-based Python client.

  • Introduces a complete NATS client implementation with high-level API
  • Adds comprehensive test coverage for all client features and edge cases
  • Includes performance benchmarking tools demonstrating 35x improvement for small messages

Reviewed Changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 6 comments.

File summary:

  • nats-client/src/nats/client/__init__.py: Main client implementation with connection management, messaging, and reconnection logic
  • nats-client/src/nats/client/subscription.py: Subscription class supporting async iteration and context management
  • nats-client/src/nats/client/protocol/message.py: NATS protocol message parsing with support for MSG, HMSG, and control messages
  • nats-client/src/nats/client/protocol/command.py: Command encoding for NATS protocol operations (PUB, SUB, CONNECT, etc.)
  • nats-client/src/nats/client/message.py: Message data structures including Headers, Status, and Message classes
  • nats-client/tests/: Comprehensive test suite covering client functionality, subscriptions, and protocol handling
  • nats-client/tools/bench.py: Benchmarking tool for performance testing


@caspervonb caspervonb force-pushed the add-nats-server-package branch 2 times, most recently from 83d0f80 to 44fb982 Compare September 22, 2025 13:14
@caspervonb caspervonb force-pushed the add-nats-client-package branch from 781a9bf to 9a8cd71 Compare September 23, 2025 08:13

wallyqs commented Sep 23, 2025

Move this to be under nats.experimental.client, so that both can be used while the new client is still being evaluated.

@caspervonb caspervonb mentioned this pull request Sep 23, 2025
@philpennock philpennock (Member) left a comment

I don't see anything in the SSL context init that handles server identity for TLS verification when we learn a reconnect address as an IP from the INFO line.

If we connect with a hostname originally, we validate that hostname in the cert, and if we reconnect to a learnt IP address, we validate that same original hostname against the new server cert. Did I just miss the handling of that?
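For reference, this is how that is typically handled with asyncio and ssl: keep validating the originally configured hostname even when dialing a learnt IP, by passing it as server_hostname. A minimal sketch, not the PR's actual connection code:

import asyncio
import ssl

async def connect_tls(addr: str, port: int, original_hostname: str):
    # Default client context: verifies the certificate chain and checks
    # the name given via server_hostname below.
    ctx = ssl.create_default_context()

    # Even when addr is a bare IP learnt from the INFO line, pass the
    # hostname used for the original connection as server_hostname so the
    # new server certificate is still validated against that name.
    return await asyncio.open_connection(
        addr, port, ssl=ctx, server_hostname=original_hostname
    )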

@caspervonb caspervonb force-pushed the add-nats-server-package branch 3 times, most recently from accfd1c to 20963ae Compare October 3, 2025 10:28
@caspervonb caspervonb changed the base branch from add-nats-server-package to main October 3, 2025 11:41
@caspervonb caspervonb changed the base branch from main to migrate-project-to-uv-workspace October 3, 2025 18:06
@caspervonb caspervonb force-pushed the add-nats-client-package branch from 9a8cd71 to 0094dba Compare October 3, 2025 18:09
This was referenced Oct 5, 2025
@caspervonb caspervonb requested a review from Copilot October 5, 2025 09:08
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 21 out of 22 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

.github/workflows/test.yml:1

  • Git merge conflict markers are present in the CI configuration file. These need to be resolved before merging.
name: test


@caspervonb caspervonb requested a review from Copilot October 7, 2025 09:58
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 21 out of 22 changed files in this pull request and generated no new comments.



@caspervonb caspervonb changed the base branch from migrate-project-to-uv-workspace to main October 7, 2025 10:50
@caspervonb caspervonb force-pushed the add-nats-client-package branch 2 times, most recently from 16d8621 to 7acfd1f Compare October 7, 2025 10:52
wallyqs previously requested changes Oct 7, 2025

@wallyqs wallyqs left a comment

Change the import to nats.experimental.aio.client while iterating on the implementation, then can work using that namespace in the other branches with the JetStream changes.

@caspervonb
Collaborator Author

Change the import to nats.experimental.aio.client while iterating on the implementation, then can work using that namespace in the other branches with the JetStream changes.

So nats-experimental-aio-client as a package name? That doesn't exactly roll off the tongue, so I'm not sure about that.

It would also make all the currently pending PRs painful to merge, as well as the ones that aren't opened yet (e.g. client auth, client TLS, JetStream ordered consumer). I'm not sure what the benefit of such a rename is.

@caspervonb caspervonb force-pushed the add-nats-client-package branch 2 times, most recently from 2e13fe5 to 3cd0ce1 Compare October 8, 2025 20:12

for callback in subscription._callbacks:
    try:
        callback(msg)
Member

callback runs in place in the read loop?

Collaborator Author

Yes, only synchronous callbacks are allowed. If the user wants to perform asynchronous work, it's up to them to spawn a task.

Member

So if you have two subscriptions, one callback's processing will cause head-of-line blocking on the other subscription, no? Unless you create a task per message so that it doesn't?

Collaborator Author

We can't await here, as it introduces a lot of overhead and would basically grind everything to a halt. If the user chooses to do heavy processing in a callback, they are responsible for scheduling that work in a way that makes sense.

Ideally you will use async for to await messages; callbacks are for dispatching messages as they arrive.
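To make that concrete, a minimal sketch of both patterns, based on the API used elsewhere in this thread (connect, subscribe with callback=, sub.messages); the no-callback subscribe form and exact names are assumptions, not necessarily the final API:

import asyncio
from nats.client import connect

async def handle(msg):
    await asyncio.sleep(0.1)  # stand-in for real asynchronous processing

def on_msg(msg):
    # Callbacks run synchronously in the read loop, so hand heavy or
    # async work off to the event loop instead of doing it in place.
    asyncio.get_running_loop().create_task(handle(msg))

async def main():
    nc = await connect("nats://localhost:4222")

    # Pattern 1: synchronous callback that only dispatches.
    await nc.subscribe("foo", callback=on_msg)

    # Pattern 2: async iteration over the subscription's message stream.
    sub = await nc.subscribe("bar")

    async def consume():
        async for msg in sub.messages:
            await handle(msg)

    consumer = asyncio.create_task(consume())

    await nc.publish("foo", b"1")
    await nc.publish("bar", b"2")
    await asyncio.sleep(1)

    consumer.cancel()
    await nc.close()

if __name__ == "__main__":
    asyncio.run(main())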

Member

I don't think I understand how this is expected to be used. Can you try to implement this example?

import asyncio
import time
import nats

async def main():
    nc = await nats.connect("nats://localhost:4222")

    async def cb1(msg):
        print(time.time_ns(), "      [1] --->", msg)
        await asyncio.sleep(5)

    sub2 = await nc.subscribe("foo", cb=cb1)

    async def cb2(msg):
        print(time.time_ns(), "[2] --->", msg)
        await asyncio.sleep(1)

    sub2 = await nc.subscribe("bar", cb=cb2)

    async def task1():
        while True:
            await nc.publish("foo", b'1')
            await asyncio.sleep(1)

    async def task2():
        while True:
            await nc.publish("bar", b'2')
            await asyncio.sleep(0.5)

    t1 = asyncio.create_task(task1())
    t2 = asyncio.create_task(task2())

    await asyncio.sleep(120)
            
    await nc.close()

if __name__ == "__main__":
    asyncio.run(main())

I just see a lot of blocking happening with this model. A starting point, for example:

import asyncio
import time
from nats.client import connect

async def main():
    nc = await connect("nats://localhost:4222")

    def cb1(msg):
        print(time.time_ns(), "     [1] --->", msg)
        # can't async await or do i/o since that blocks the event loop
        time.sleep(5)
        
    sub2 = await nc.subscribe("foo", callback=cb1)

    def cb2(msg):
        print(time.time_ns(), "[2] --->", msg)
        # e.g. do some work
        for i in range(1, 50000000):
            pass

    sub2 = await nc.subscribe("foo", callback=cb2)

    async def task1():
        while True:
            await nc.publish("foo", b'1')
            await asyncio.sleep(1)

    async def task2():
        while True:
            await nc.publish("foo", b'2')
            await asyncio.sleep(0.5)

    t1 = asyncio.create_task(task1())
    t2 = asyncio.create_task(task2())

    await asyncio.sleep(120)
            
    await nc.close()

if __name__ == "__main__":
    asyncio.run(main())


@caspervonb caspervonb Oct 13, 2025


No, this is using sub.messages; it's the same bench script for both clients.
Using sub.next_msg is about 2x slower.

Member

Have you tried the nats/benchmark/sub_perf.py script in the repo?
I get this, for example:

# nats bench pub test --size 64
python3 nats/benchmark/sub_perf.py --servers nats://localhost:4222
Waiting for 100000 messages on [test]...
****************************************************************************************************
Test completed : 337487.6190604489 msgs/sec sent
Received 100000 messages (337487.6190604489 msgs/sec)

Member

Same changes, just addressing the async sub iterator: #752

# nats bench pub test --size 64
uv run nats/benchmark/sub_perf_messages.py --servers nats://localhost:4222
Waiting for 100000 messages on [test]...
****************************************************************************************************
Test completed : 316181.45606272004 msgs/sec sent
Received 100000 messages (316181.45606272004 msgs/sec)


@caspervonb caspervonb Oct 14, 2025


This is just sub, though? The benchmark above is pub/sub; see tools/bench.py, you can run both clients with it.

I tried running with the patch and got 8,000 messages per second (slow consumers triggered).


@wallyqs wallyqs Oct 14, 2025


What are you running exactly? I get this with that script:

uv run tools/bench.py --client aio --msgs 1000000 --size 128 --pub --sub
Starting pub/sub benchmark with nats.aio [msgs=1,000,000, size=128 B]

Publisher results: 
Test completed: 1,000,000 messages, 128,000,000 bytes, 4.44 seconds
  Throughput: 224,990 msgs/sec, 27.46 MB/sec
  Latency: (min/avg/max/std) = 0.00/0.00/60.50/0.42 ms

Subscriber results: 
Test completed: 1,000,000 messages, 128,000,000 bytes, 4.51 seconds
  Throughput: 221,720 msgs/sec, 27.07 MB/sec
  Latency: (min/avg/max/std) = 93.17/2306.20/4603.37/1284.93 ms

logger.exception("Error in subscription callback")

try:
    await subscription.queue.put(msg)
Member

Same as with the callback dispatch, isn't this blocking the read loop? I guess not, since the queue is unbounded?

Collaborator Author

With an unbounded queue it's not an issue. Once slow-consumer handling is wired up, it will eventually become put_nowait.

Member

Where is the client slow consumer case handled?

Collaborator Author

Currently it isn't.
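For context, a minimal sketch of the kind of slow-consumer wiring being described (a bounded queue plus put_nowait); the limit, class shape, and logging here are illustrative, not the PR's implementation:

import asyncio
import logging

logger = logging.getLogger(__name__)

PENDING_LIMIT = 65536  # illustrative bound, not the PR's value

class Subscription:
    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue(maxsize=PENDING_LIMIT)

def deliver(subscription: Subscription, msg) -> None:
    # Called from the read loop: never await, never block.
    try:
        subscription.queue.put_nowait(msg)
    except asyncio.QueueFull:
        # Slow consumer: drop the message and report it rather than
        # stalling the read loop behind a full queue.
        logger.warning("slow consumer, dropping message")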

else:
    logger.debug("->> SUB %s %s", subject, sid)

await self._connection.write(command)
Member

Subs are synchronous writes?


@caspervonb caspervonb Oct 9, 2025


Currently, PUB and HPUB are buffered.
PING, PONG, SUB are unbuffered.
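For reference, a rough sketch of what buffered vs. unbuffered writes can look like on top of an asyncio StreamWriter; the names and flush strategy here are illustrative, not necessarily how the PR implements it:

import asyncio

class Connection:
    def __init__(self, writer: asyncio.StreamWriter) -> None:
        self._writer = writer
        self._pending = bytearray()

    def write_buffered(self, data: bytes) -> None:
        # PUB/HPUB: collect into an in-memory buffer so many publishes
        # can be flushed to the socket in a single write.
        self._pending += data

    async def write(self, data: bytes) -> None:
        # PING/PONG/SUB: flush anything pending, then write immediately.
        if self._pending:
            self._writer.write(bytes(self._pending))
            self._pending.clear()
        self._writer.write(data)
        await self._writer.drain()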

@caspervonb caspervonb requested a review from Copilot October 10, 2025 10:52
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 21 out of 22 changed files in this pull request and generated 5 comments.




caspervonb commented Oct 10, 2025

Is nats-experimental-aio-client the folder? Can't the workspace be under experimental/aio/client and then have the import be import nats.experimental.aio.client?

It can be either, @wallyqs, but using versions is a better way to signal stability than changing package and namespace names, in my opinion.

And again, I still have a lot of work to push up here, including but not limited to auth and mTLS. Other pull requests, like everything on JetStream, are currently stuck in rebase hell.


wallyqs commented Oct 11, 2025

I still have a lot of work to push up here, including but not limited to auth and mTLS. Other pull requests that are here now are stuck in rebase hell.

@caspervonb there doesn't have to be. Honestly, I'm not finding the rewrite sound yet; there are good ideas, and I like the repo structure, but they could have been backported to what we have in nats-py and used to close some of the open issues instead, as I've mentioned a few times already.

@caspervonb caspervonb force-pushed the add-nats-client-package branch from bc5924c to 777a0e0 Compare October 13, 2025 21:03
@caspervonb
Collaborator Author

Removed the performance section from the README; let's generate this data from CI instead.

@caspervonb caspervonb requested a review from wallyqs October 15, 2025 09:57
@caspervonb caspervonb force-pushed the add-nats-client-package branch 4 times, most recently from 6eb24dc to bd8a5f0 Compare October 23, 2025 11:23
@caspervonb caspervonb requested a review from Copilot November 11, 2025 22:54
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 56 out of 57 changed files in this pull request and generated no new comments.




caspervonb commented Nov 14, 2025

Performance Benchmarks

I did some further optimization and comparison.
These results are from running the publisher and subscriber in the same process.

Test Environment

Hardware:

  • Model: MacBook Pro (Mac15,11)
  • Chip: Apple M3 Max
  • Cores: 14 (10 performance + 4 efficiency)
  • Memory: 36 GB
  • OS: macOS Darwin 24.0.0

Software:

  • CPython: 3.13.7
  • PyPy: 3.10.14 (PyPy 7.3.17)
  • NATS Server: localhost:4222

Test Parameters:

  • Messages per test: 1,000,000
  • Message sizes: 8 B, 128 B, 1 KB, 8 KB
  • Timeout: 60 seconds for slow consumers
  • Modes: Queue-based and Callback-based subscriptions

Complete Results Table (Ordered by Subscriber Performance)

| Runtime | Client | Mode | Size | Pub Rate (msg/s) | Pub MB/s | Sub Rate (msg/s) | Sub MB/s | Received | Dropped | Drop % |
|---|---|---|---|---|---|---|---|---|---|---|
| PyPy | nats-client | callback | 8 B | 4,691,949 | 35.80 | 4,615,995 🥇 | 35.22 | 1,000,000 | 0 | 0% |
| PyPy | nats-client | queue | 8 B | 4,565,524 | 34.83 | 2,336,040 | 17.82 | 1,000,000 | 0 | 0% |
| PyPy | nats-client | callback | 128 B | 2,459,309 | 300.21 | 2,243,722 | 273.89 | 1,000,000 | 0 | 0% |
| PyPy | nats-client | queue | 128 B | 1,674,313 | 204.38 | 1,402,979 | 171.26 | 1,000,000 | 0 | 0% |
| PyPy | nats.aio | callback | 8 B | 2,493,561 | 19.02 | 924,494 | 7.05 | 1,000,000 | 0 | 0% |
| CPython | nats-client | callback | 8 B | 1,579,744 | 12.05 | 880,430 | 6.72 | 1,000,000 | 0 | 0% |
| PyPy | nats.aio | callback | 128 B | 775,115 | 94.62 | 629,760 | 76.87 | 1,000,000 | 0 | 0% |
| PyPy | nats-client | callback | 1 KB | 609,447 | 595.16 | 615,474 | 601.05 | 1,000,000 | 0 | 0% |
| CPython | nats-client | callback | 128 B | 704,567 | 86.01 | 562,402 | 68.65 | 1,000,000 | 0 | 0% |
| CPython | nats-client | queue | 8 B | 1,564,495 | 11.94 | 553,636 | 4.22 | 1,000,000 | 0 | 0% |
| PyPy | nats-client | queue | 1 KB | 487,882 | 476.45 | 493,128 | 481.57 | 1,000,000 | 0 | 0% |
| CPython | nats-client | callback | 1 KB | 440,026 | 429.71 | 430,058 | 419.98 | 1,000,000 | 0 | 0% |
| CPython | nats-client | queue | 128 B | 544,772 | 66.50 | 418,524 | 51.09 | 1,000,000 | 0 | 0% |
| CPython | nats-client | queue | 1 KB | 342,981 | 334.94 | 338,365 | 330.43 | 1,000,000 | 0 | 0% |
| PyPy | nats.aio | callback | 1 KB | 288,049 | 281.30 | 281,563 | 274.96 | 1,000,000 | 0 | 0% |
| CPython | nats.aio | callback | 128 B | 250,276 | 30.55 | 250,934 | 30.63 | 1,000,000 | 0 | 0% |
| CPython | nats.aio | callback | 8 B | 246,990 | 1.88 | 236,046 | 1.80 | 1,000,000 | 0 | 0% |
| CPython | nats.aio | callback | 1 KB | 221,796 | 216.60 | 217,994 | 212.88 | 1,000,000 | 0 | 0% |
| CPython | nats-client | callback | 8 KB | 166,760 | 1302.81 | 168,008 | 1312.56 | 1,000,000 | 0 | 0% |
| CPython | nats-client | queue | 8 KB | 154,709 | 1208.66 | 157,786 | 1232.70 | 1,000,000 | 0 | 0% |
| CPython | nats.aio | callback | 8 KB | 121,886 | 952.23 | 121,416 | 948.56 | 1,000,000 | 0 | 0% |
| PyPy | nats-client | callback | 8 KB | 101,937 | 796.38 | 102,817 | 803.26 | 1,000,000 | 0 | 0% |
| PyPy | nats-client | queue | 8 KB | 92,085 | 719.42 | 92,806 | 725.05 | 1,000,000 | 0 | 0% |
| PyPy | nats.aio | callback | 8 KB | 63,666 | 497.39 | 63,503 | 496.12 | 1,000,000 | 0 | 0% |
| CPython | nats.aio | queue | 8 B | 118,989 | 0.91 | 8,769 | 0.07 | 524,621 | 475,379 | 47.5% ⚠️ |
| CPython | nats.aio | queue | 128 B | 104,272 | 12.73 | 8,758 | 1.07 | 524,680 | 475,320 | 47.5% ⚠️ |
| PyPy | nats.aio | queue | 8 B | 2,627,229 | 20.04 | 8,755 | 0.07 | 524,349 | 475,651 | 47.5% ⚠️ |
| PyPy | nats.aio | queue | 128 B | 406,140 | 49.58 | 8,756 | 1.07 | 524,660 | 475,340 | 47.5% ⚠️ |
| CPython | nats.aio | queue | 1 KB | 72,483 | 70.78 | 2,232 | 2.18 | 133,723 | 866,277 | 86.6% ⚠️ |
| PyPy | nats.aio | queue | 1 KB | 131,912 | 128.82 | 2,232 | 2.18 | 133,726 | 866,274 | 86.6% ⚠️ |
| CPython | nats.aio | queue | 8 KB | 49,323 | 385.34 | 622 | 4.86 | 37,265 | 962,735 | 96.2% ⚠️ |
| PyPy | nats.aio | queue | 8 KB | 54,782 | 427.99 | 622 | 4.86 | 37,264 | 962,736 | 96.2% ⚠️ |

Key Findings

1. Performance Rankings

By Subscriber Throughput:

  1. 🥇 PyPy + nats-client callback: 103K - 4.6M msg/s (BEST)
  2. PyPy + nats-client queue: 93K - 2.3M msg/s
  3. PyPy + nats.aio callback: 64K - 924K msg/s
  4. CPython + nats-client callback: 168K - 880K msg/s
  5. CPython + nats-client queue: 158K - 554K msg/s
  6. CPython + nats.aio callback: 121K - 251K msg/s
  7. ⚠️ nats.aio queue mode: 622 - 8.8K msg/s with 47-96% message loss

2. Message Loss Analysis

nats.aio Queue Mode - Critical Issue:

  • 8 B: 47.5% dropped (475K messages lost)
  • 128 B: 47.5% dropped (475K messages lost)
  • 1 KB: 86.6% dropped (866K messages lost)
  • 8 KB: 96.2% dropped (963K messages lost)

All other configurations: 0% message loss ✅

3. Runtime Comparison

PyPy vs CPython (nats-client callback):

  • 8 B: 5.2x faster with PyPy
  • 128 B: 4.0x faster with PyPy
  • 1 KB: 1.4x faster with PyPy
  • 8 KB: 0.6x (CPython faster due to PyPy memory overhead)

4. Implementation Comparison

nats-client vs nats.aio (callback mode):

  • CPython: 1.4x - 3.7x faster
  • PyPy: 1.6x - 5.0x faster

Callback vs Queue (nats-client):

  • CPython: 1.0x - 1.6x faster
  • PyPy: 1.1x - 2.0x faster

Performance Comparison

Callback Mode (Subscriber Throughput)

| Message Size | CPython nats-client | CPython nats.aio | Speedup | PyPy nats-client | PyPy nats.aio | Speedup |
|---|---|---|---|---|---|---|
| 8 B | 880,430 msg/s | 236,046 msg/s | 3.7x | 4,615,995 msg/s | 924,494 msg/s | 5.0x |
| 128 B | 562,402 msg/s | 250,934 msg/s | 2.2x | 2,243,722 msg/s | 629,760 msg/s | 3.6x |
| 1 KB | 430,058 msg/s | 217,994 msg/s | 2.0x | 615,474 msg/s | 281,563 msg/s | 2.2x |
| 8 KB | 168,008 msg/s | 121,416 msg/s | 1.4x | 102,817 msg/s | 63,503 msg/s | 1.6x |

Queue Mode (Subscriber Throughput)

| Message Size | CPython nats-client | CPython nats.aio | Speedup | PyPy nats-client | PyPy nats.aio | Speedup |
|---|---|---|---|---|---|---|
| 8 B | 553,636 msg/s | 8,769 msg/s* | 63x | 2,336,040 msg/s | 8,755 msg/s* | 267x |
| 128 B | 418,524 msg/s | 8,758 msg/s* | 48x | 1,402,979 msg/s | 8,756 msg/s* | 160x |
| 1 KB | 338,365 msg/s | 2,232 msg/s* | 152x | 493,128 msg/s | 2,232 msg/s* | 221x |
| 8 KB | 157,786 msg/s | 622 msg/s* | 254x | 92,806 msg/s | 622 msg/s* | 149x |

* nats.aio queue mode experiences severe message loss (47-96%)

Summary

  • Callback mode: nats-client is 1.4x - 5.0x faster depending on runtime and message size
  • Queue mode: nats-client is 48x - 267x faster and delivers 100% of messages vs 4-53% for nats.aio
  • Performance gap increases with smaller messages and PyPy runtime
  • nats.aio queue mode is fundamentally broken under high load with catastrophic message loss

Benchmark Commands

cd nats-client

# Individual tests
uv run python tools/bench.py --client client --messages 1000000 --size 128 [--callback]

uv run python tools/bench.py --client aio --messages 1000000 --size 128 [--callback]

@caspervonb caspervonb requested a review from Copilot November 14, 2025 03:40
Copilot finished reviewing on behalf of caspervonb November 14, 2025 03:41
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 56 out of 57 changed files in this pull request and generated 28 comments.



@caspervonb caspervonb dismissed wallyqs’s stale review November 18, 2025 12:08

Dismissing this review, which blocked on naming: the client is already in a separate nats-client package and will be published independently (likely as nats-core, since that's the package name I currently sit on, along with nats-client).

Since it's not part of the nats.py namespace, the experimental designation and nested import path are unnecessary.

This PR has been open for 60 days and we need to move forward.

@Jarema Jarema (Member) left a comment

Considering it is a whole new package, not touching or affecting the current client, we should merge it.

LGTM.

@caspervonb caspervonb force-pushed the add-nats-client-package branch 2 times, most recently from ee828b2 to 6d6c281 Compare November 24, 2025 12:28
Signed-off-by: Casper Beyer <[email protected]>
@caspervonb caspervonb force-pushed the add-nats-client-package branch from 6d6c281 to 22cbc2a Compare November 24, 2025 12:35
@caspervonb caspervonb merged commit b8fa6e0 into main Nov 24, 2025
42 of 56 checks passed
@caspervonb caspervonb deleted the add-nats-client-package branch November 24, 2025 12:40