Commit 2ff8a77

docs: update high-level-view.md with socket mode architecture (#69355)
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: [email protected] <[email protected]>
1 parent 916f1fd commit 2ff8a77

1 file changed: +37 -1 lines changed

docs/platform/understanding-airbyte/high-level-view.md

Lines changed: 37 additions & 1 deletion
@@ -10,7 +10,36 @@ The platform provides all the horizontal services required to configure and run
 
 Connectors are independent modules which push/pull data to/from sources and destinations. Connectors are built in accordance with the [Airbyte Specification](./airbyte-protocol.md), which describes the interface with which data can be moved between a source and a destination using Airbyte. Connectors are packaged as Docker images, which allows total flexibility over the technologies used to implement them.
 
-A more concrete diagram can be seen below:
+## Data Transfer Modes
+
+Airbyte supports two data transfer modes that are automatically selected based on connector capabilities:
+
+- **Socket Mode**: Records flow directly from source to destination via Unix domain sockets, enabling high-throughput parallel data transfer. A lightweight bookkeeper process handles control messages, state, and logs.
+- **Legacy Mode**: Records flow through an orchestrator middleware that sits between source and destination, using standard input/output streams.
+
+Socket mode is used when both source and destination connectors support it, providing significantly higher performance for data movement operations.
+
+### Data Flow Comparison
+
+```mermaid
+---
+title: Data Transfer Modes
+---
+flowchart LR
+subgraph Legacy["Legacy Mode"]
+SRC1[Source] --> ORCH[Orchestrator] --> DEST1[Destination]
+end
+
+subgraph Socket["Socket Mode"]
+SRC2[Source] -.->|control| BK[Bookkeeper]
+SRC2 ==>|records via sockets| DEST2[Destination]
+DEST2 -.->|state| BK
+end
+```
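The selection rule stated in the added section (socket mode only when both connectors support it, legacy mode otherwise) can be sketched in a few lines of Python. This is a minimal illustration, not Airbyte's implementation: the `Connector` class, its `supports_sockets` flag, and `select_transfer_mode` are hypothetical names, and how the platform actually detects connector capabilities is not described in this change. The middleware names are the ones listed in the "Data Transfer Middleware" section of this diff.

```python
from dataclasses import dataclass
from enum import Enum


class TransferMode(Enum):
    SOCKET = "socket"
    LEGACY = "legacy"


@dataclass(frozen=True)
class Connector:
    """Hypothetical stand-in for a connector definition."""
    name: str
    # Assumption: a single boolean capability flag. The real platform's
    # capability detection is not specified in the documentation change.
    supports_sockets: bool


def select_transfer_mode(source: Connector, destination: Connector) -> TransferMode:
    """Pick socket mode only when BOTH connectors support it."""
    if source.supports_sockets and destination.supports_sockets:
        return TransferMode.SOCKET
    return TransferMode.LEGACY


# Middleware container that accompanies each mode, per the docs in this diff.
MIDDLEWARE_BY_MODE = {
    TransferMode.SOCKET: "airbyte-bookkeeper",
    TransferMode.LEGACY: "airbyte-container-orchestrator",
}


if __name__ == "__main__":
    src = Connector("example-source", supports_sockets=True)
    dst = Connector("example-destination", supports_sockets=False)
    mode = select_transfer_mode(src, dst)
    print(mode.value, MIDDLEWARE_BY_MODE[mode])
```

Because the rule is a logical AND, a single connector without socket support is enough to fall back to legacy mode, which matches the "automatically selected" wording in the added text.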
+
+## Platform Architecture
+
+A more concrete diagram of the platform orchestration can be seen below:
 
 ```mermaid
 ---
@@ -45,6 +74,13 @@ flowchart LR
 - **Workload API** [`airbyte-workload-api-server`]: The HTTP interface for enqueuing workloads — the discrete pods that run the connector operations.
 - **Launcher** [`airbyte-workload-launcher`]: Consumes events from the workload API and interfaces with k8s to launch workloads.
 
+### Data Transfer Middleware
+
+Within connector operation pods, Airbyte runs middleware containers to process connector output:
+
+- **Bookkeeper** [`airbyte-bookkeeper`]: Used in socket mode. Processes control messages, state, and logs while records flow directly between connectors via sockets.
+- **Container Orchestrator** [`airbyte-container-orchestrator`]: Used in legacy mode. Sits between source and destination connectors, processing all data and control messages.
+
 The diagram shows the steady-state operation of Airbyte, there are components not described you'll see in your deployment:
 
 - **Cron** [`airbyte-cron`]: Clean the server and sync logs (when using local logs). Regularly updates connector definitions and sweeps old workloads ensuring eventual consensus.
