This technical design document bridges the gap between the protocol-level specification ([CIP-164](https://github.com/cardano-foundation/CIPs/pull/1078)) and its concrete implementation in [`cardano-node`](https://github.com/IntersectMBO/cardano-node). While CIP-164 defines *what* the Leios protocol is and *why* it benefits Cardano, this document addresses *how* to implement it reliably and serves as a practical guide for implementation teams.
16
16
17
-
This document builds on the [impact analysis](../ImpactAnalysis.md) and [early threat modelling](../threat-model.md) conducted. The document outlines the necessary architecture changes, highlights key risks and mitigation strategies, and proposes an implementation roadmap. As the implementation plan itself contains exploratory tasks, this document can be considered a living document and reflects our current understanding of the protocol, as well as design decisions taken during implementation.
17
+
This document builds on the conducted [impact analysis](../ImpactAnalysis.md) and [threat modeling](../threat-model.md). The document outlines the necessary architecture changes, highlights key risks and mitigation strategies, and proposes an implementation roadmap. As the implementation plan itself contains exploratory tasks, this document can be considered a living document and reflects our current understanding of the protocol, as well as design decisions taken during implementation.
18
18
19
19
Besides collecting node-specific details in this document, we intend to contribute implementation-independent specifications to the [cardano-blueprint](https://cardano-scaling.github.io/cardano-blueprint/) initiative and also update the CIP-164 specification through pull requests as needed.
20
20
21
21
**Document history**
22
22
23
23
This document is a living artifact and will be updated as implementation progresses, new risks are identified, and validation results become available.
| 0.6 | 2025-11-25 | Risks and mitigations with key threats |
28
+
| 0.5 | 2025-10-29 | Re-structure and start design chapter with impact analysis content |
29
+
| 0.4 | 2025-10-27 | Add overview chapter |
30
+
| 0.3 | 2025-10-25 | Add dependencies and interactions |
31
+
| 0.2 | 2025-10-24 | Add implementation plan |
32
+
| 0.1 | 2025-10-15 | Initial draft |
32
33
33
34
# Overview
34
35
@@ -264,52 +265,75 @@ In summary, Leios will not require fundamental changes to Mithril's architecture
264
265
265
266
# Risks and mitigations
266
267
267
-
> [!WARNING]
268
-
>
269
-
> TODO: Introduce chapter as being the bridge between implementation plan and concrete technical design; also, these are only selected aspects that inform the implementation (and not cover principal risks to the protocol or things that are avoided by design)
268
+
This chapter bridges the implementation plan with concrete technical design by examining selected threats that directly inform architectural decisions and validation priorities. The focus is on implementation-specific risks rather than general protocol threats, which are either mitigated by design or otherwise covered in the [threat model](../threat-model.md).
269
+
270
+
The threats examined here represent scenarios that could compromise the implementation's ability to deliver Leios' promised benefits while maintaining Praos' security guarantees. Each threat analysis motivates specific technical requirements, validation experiments, or design constraints that shape the implementation outlined in subsequent chapters.
270
271
271
272
## Key threats
272
273
273
-
> [!WARNING]
274
-
>
275
-
> TODO: Selection of key threats and attacks that further inform the design and/or implementation plan. Incorporate / reference the full [threat model](../threat-model.md)
274
+
The following threats have been selected for detailed analysis based on their potential to inform critical implementation decisions. These represent attack vectors that emerged prominently during research, have significant implications for system performance under adversarial conditions, or require empirical validation through prototyping and testing.
276
275
277
-
### Protocol bursts
276
+
### Data withholding
278
277
279
-
> [!WARNING]
280
-
>
281
-
> TODO: important because
282
-
>
283
-
> - was a prominent case in research
284
-
> - acknowledges the wealth of data to be processed
285
-
> - mitigation: freshest-first delivery / prioritization between praos and leios traffic
286
-
> - motivates experiments/features revolving around resource management
287
-
> - reference/include/move related RSK-.. items from impact analysis
278
+
In a data withholding attack (**ATK-LeiosDataWithholding**, see also [threat vectors #20, #21 and #22](../threat-model.md#data-withholding)), the adversary deliberately prevents the diffusion of endorser block transaction closures to disrupt certification and degrade network throughput.
279
+
This attack exploits the fundamental dependency between transaction availability and EB certification, targeting the gap between optimistic and worst-case diffusion scenarios that underlies Leios' [security argument](https://github.com/cardano-scaling/CIPs/blob/leios/CIP-0164/README.md#protocol-security).
280
+
281
+
The attack operates by manipulating timing and availability of transaction data required for EB validation.
282
+
When an EB is announced via an RB header, voting committee members must acquire and validate the complete transaction closure before casting votes.
283
+
The adversary can exploit this in several ways: withholding the EB body itself, selectively withholding individual transactions, or strategically timing data release to exceed the $L_\text{vote}$ deadline.
284
+
285
+
**Direct threshold impact.** The most direct form involves an adversarial block producer creating valid EBs but refusing to serve transaction closures when requested by voting nodes.
286
+
Since committee members cannot validate unavailable transactions, they cannot vote for certification, effectively nullifying the EB's throughput contribution.
287
+
More sophisticated variants involve network-level manipulation where the adversary controls network relays to selectively prevent transaction propagation to specific voting committee members.
288
+
289
+
Consider an adversary controlling 15% of stake attempting to prevent honest EBs from achieving the 75% certification threshold.
290
+
The adversary must withhold transaction data from enough voting committee members to reduce available honest stake below 75%.
291
+
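Since the adversary withholds the votes of its own 15% stake, at most 85% of stake can vote, so they only need to prevent honest voters holding slightly more than 10% of total stake (about 12% of honest stake) from voting.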
292
+
This demonstrates how modest adversarial stake combined with strategic network positioning could significantly impact honest EB certification.
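The arithmetic behind this example can be sketched as follows. This is a simplified illustration, assuming every stakeholder would otherwise vote and that the adversary simply abstains; the function name `honest_stake_to_silence` is hypothetical and does not appear in CIP-164 or `cardano-node`.

```python
from fractions import Fraction


def honest_stake_to_silence(adversary_stake, threshold):
    """Honest stake (as a fraction of total stake) that must be kept from voting.

    Assumes the adversary abstains and all honest stake would otherwise vote.
    Reaching exactly this value leaves voters at the threshold, so the
    adversary needs to silence slightly more than the returned amount.
    """
    honest = 1 - adversary_stake
    return max(Fraction(0), honest - threshold)


adversary = Fraction(15, 100)   # 15% adversarial stake, as in the text
threshold = Fraction(75, 100)   # 75% certification threshold, as in the text

gap = honest_stake_to_silence(adversary, threshold)
share_of_honest = gap / (1 - adversary)

print(f"{float(gap):.0%} of total stake")
print(f"{float(share_of_honest):.1%} of honest stake")
```

Expressed relative to honest stake only, the share to be silenced is somewhat larger than 10%, because honest parties hold 85% rather than 100% of the total.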
293
+
294
+
**Attack on safety.** While throughput degradation represents the obvious impact, the most dangerous variant targets blockchain safety itself.
295
+
The adversary can strategically delay transaction data release to create scenarios where EBs achieve certification but cannot be processed by honest nodes within the required timeframe.
296
+
Just before the voting deadline, the adversary releases the data to a subset of voting committee members, enough to achieve certification, but not to all network participants.
297
+
The resulting certificate gets included in a subsequent RB, but honest block producers cannot acquire the certified EB's transaction closure within $L_\text{diff}$.
298
+
299
+
By reducing the number of honest nodes that received the EB data in time for certification, the adversary also impairs subsequent diffusion.
300
+
With fewer nodes initially possessing the complete transaction closure, propagation becomes slower and less reliable, potentially extending diffusion times beyond protocol anticipation.
301
+
This would represent a violation of Praos' timing assumptions.
302
+
Occasionally missing the $\Delta$ deadline does not break safety, since short forks are normal in Ouroboros, but persistent violations can lead to longer forks and degraded chain quality.
303
+
In summary, the attack fundamentally challenges the security argument's assumption that the difference between optimistic and worst-case diffusion remains bounded by $L_\text{diff}$.
304
+
305
+
Mitigation relies primarily on the protocol design ensuring that diffusion timing remains bounded even under adversarial conditions.
306
+
The certification mechanism provides defense against stake-based withholding by requiring broad consensus before including EBs in the ledger.
307
+
Network-level attacks require sophisticated countermeasures including redundant peer connections, timeouts that punish non-responsive nodes, and strategic committee selection considering network topology.
308
+
309
+
The implementation must validate empirically that real-world network conditions support the timing assumptions underlying the security argument through adversarial diffusion testing.
288
310
289
-
In a protocol burst attack (**ATK-LeiosProtocolBurst**) the adversary withholds a large number of EBs and/or their closures over a significant duration and then releases them all at once.
311
+
### Protocol bursts
312
+
313
+
In a protocol burst attack (**ATK-LeiosProtocolBurst**, see also [threat vector #23](../threat-model.md#protocol-bursts)) the adversary withholds a large number of EBs and/or their closures over a significant duration and then releases them all at once.
290
314
This will lead to a sustained maximal load on the honest network for a smaller but still significant duration, a.k.a. a burst.
291
315
The potential magnitude of that burst will depend on various factors, including at least the adversary's portion of stake, but the worst-case is more than a gigabyte of download.
292
316
The cost to the victim is merely the work to acquire the closures and to check the hashes of the received EB bodies and transaction bodies.
293
317
In particular, at most one of the EBs in the burst could extend the tip of a victim node's current selection, and so that's the only EB the victim would attempt to fully parse and validate.
294
-
295
318
Even without honest nodes needing to validate the vast majority of the burst, the sheer amount of concentrated bandwidth utilization can be problematic.
296
-
Suppose the adversary controls 1/3rd stake and they're issuing nominal RBs.
319
+
320
+
**Attack magnitude.** Suppose the adversary controls 1/3rd stake and they're issuing nominal RBs.
297
321
Recall that CIP-164 requires each honest node to attempt to acquire any EB that was promptly announced within the last 12 hours.
298
322
Even if it's too late for the honest node itself to vote on a tardy EB, the lack of global objective information implies the node must not assume the EB cannot be certified by other nodes in the network.
299
323
Thus, the honest node might later need to switch to a fork that requires having this EB, and that switch ideally wouldn't be delayed by waiting on that EB's arrival; the node should still acquire the EB as soon as it can.
300
-
301
324
For this attack, the adversary would announce each EB promptly (by diffusing the corresponding RB headers on-time), but withhold the mini protocol messages that actually initiate the diffusion of substantial Leios traffic throughout the honest network.
302
325
Only after withholding every EB's diffusion for 12 hours would they suddenly release them.
303
-
In this scenario---which is not the worst-case---the average would be approximately 2160 * (1/3) = 720 EBs, but there could be hundreds more merely due to luck and multi-leader slots.
326
+
In this scenario, which is not the worst-case, the average would be approximately 2160 * (1/3) = 720 EBs, but there could be hundreds more merely due to luck and multi-leader slots.
304
327
There could be several thousand if the adversary is also grinding, for example, and/or had closer to 50% stake, etc.
305
-
If each of the attacker's EBs has the maximum size of 500 kilobytes of tx references and 12.5 megabytes of actual txs---which don't even need to be valid---then that's an average of 720 * (12.5 + 0.5 megabytes) = 9.36 gigabytes the honest nodes will be eagerly diffusing throughout the network.
328
+
If each of the attacker's EBs has the maximum size of 500 kilobytes of tx references and 12.5 megabytes of actual txs, which don't even need to be valid, then that's an average of 720 * (12.5 + 0.5 megabytes) = 9.36 gigabytes the honest nodes will be eagerly diffusing throughout the network.
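The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the averages quoted in the text: the constant and function names are illustrative, and the calculation ignores luck, multi-leader slots and grinding, all of which can push the real figure higher.

```python
from fractions import Fraction

EXPECTED_RBS_IN_WINDOW = 2160        # expected RBs in 12 hours, as used in the text
MAX_EB_REFS_MB = Fraction(1, 2)      # 500 kilobytes of tx references per EB
MAX_EB_TXS_MB = Fraction(25, 2)      # 12.5 megabytes of txs per EB


def expected_burst(adversary_stake):
    """Return (expected EB count, expected megabytes) for a 12-hour burst."""
    ebs = EXPECTED_RBS_IN_WINDOW * adversary_stake
    megabytes = ebs * (MAX_EB_REFS_MB + MAX_EB_TXS_MB)
    return ebs, megabytes


for stake in (Fraction(1, 3), Fraction(1, 2)):
    ebs, mb = expected_burst(stake)
    print(f"stake {float(stake):.2f}: {float(ebs):.0f} EBs, {float(mb) / 1000:.2f} GB")
```

For 1/3 stake this gives 720 EBs and 9.36 gigabytes, matching the figures above.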
306
329
307
-
For however long it takes for the network to (carefully) diffuse 10 gigabytes, honest traffic might diffuse more poorly.
330
+
**Resource contention.** For however long it takes for the network to (carefully) diffuse 10 gigabytes, honest traffic might diffuse more poorly.
308
331
CIP-164 requires that Praos traffic will be preferred over Leios traffic and that fresher Leios traffic will be preferred over stale Leios traffic.
309
332
That would prevent the burst from degrading contemporary honest traffic if the prioritization could be perfect.
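As a rough illustration of this ordering rule, the following sketch keeps pending fetch requests in a priority queue: Praos items first, then Leios items from the most recent slot to the oldest. It is an idealized model, not the `cardano-node` network stack; `FetchQueue`, `PRAOS` and `LEIOS` are hypothetical names, and a real implementation has to apply the rule across many peers and shared resources.

```python
import heapq
from itertools import count

PRAOS, LEIOS = 0, 1  # lower value is served first


class FetchQueue:
    """Praos before Leios; among Leios items, fresher (higher slot) first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker: preserves arrival order

    def push(self, kind, slot, name):
        # Negate the slot so that fresher Leios items sort first.
        # Praos items are served in arrival order in this sketch.
        freshness = -slot if kind == LEIOS else 0
        heapq.heappush(self._heap, (kind, freshness, next(self._seq), name))

    def pop(self):
        return heapq.heappop(self._heap)[3]

    def __len__(self):
        return len(self._heap)


q = FetchQueue()
q.push(LEIOS, 100, "stale EB")
q.push(LEIOS, 900, "fresh EB")
q.push(PRAOS, 901, "RB")

order = [q.pop() for _ in range(len(q))]
print(order)  # ['RB', 'fresh EB', 'stale EB']
```

Even with such a queue, items from a burst still compete for everything the queue does not control.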
333
+
310
334
However, there are some infrastructural resources that cannot be prioritized perfectly nor instantly reapportioned, including: CPU, memory, disk, disk bandwidth, and buffer utilization on the nodes themselves but also along the Internet routers carrying packets between Cardano peers.
311
335
One non-obvious concern is that cloud providers often throttle users exhibiting large bursts of bandwidth, so a node might perform fine outside of a protocol burst but struggle disproportionately during one.
312
-
(A node in a data center might not struggle at all to diffuse the 10 gigabytes over the course of each 12 hours but be very slow to diffuse it in a single burst that arrives every 12 hours.)
336
+
A node in a data center might not struggle at all to diffuse the 10 gigabytes over the course of each 12 hours but be very slow to diffuse it in a single burst that arrives every 12 hours.
313
337
314
338
Some of the CPU and memory costs will scale in the number of txs rather than the number of EBs, which can be a ratio of more than 10,000 to 1.
315
339
If none of the 720 EBs overlap, then there would be more than 7,200,000 unique txs on average that the honest nodes need to keep track of during this burst.
@@ -323,28 +347,17 @@ The adversary is only able to issue EBs at an average rate in proportion to thei
323
347
There will be some variance, but in general they can do smaller bursts more often or larger bursts less often.
324
348
However, the Praos security argument's parameters represent the worst-case, so the largest burst fundamentally challenges the current Praos security argument even if it can only happen rarely to whatever extent the prioritization schemes of CIP-164 are imperfectly implemented.
325
349
326
-
### Data withholding
350
+
## Assumptions to validate early
327
351
328
-
> [!WARNING]
329
-
>
330
-
> TODO: important because
331
-
>
332
-
> - can be done from stake- and network-based attackers
333
-
> - trivially impacts high-throughput because no certifications happening
334
-
> - however, more advanced, potential avenue to attack blockchain safety (impact praos security argument) when carefully partitioning the network
335
-
> - mitigation: L_diff following the [security argument](https://github.com/cardano-scaling/CIPs/blob/leios/CIP-0164/README.md#protocol-security)
336
-
> - motivates validation of optimistic and worst-case diffusion paths
352
+
Following the principle of early validation outlined in the [implementation plan](#approach), several critical assumptions underlying Leios' security argument must be validated before we can commit to full-scale implementation and deployment. These assumptions represent potential failure points where theoretical models may not match real-world performance.
337
353
338
-
## Assumptions to validate early
354
+
- **Worst-case diffusion of EBs given certain honest stake is realistic.** The security argument assumes that even under adversarial conditions, EBs can be diffused to honest nodes within bounded timeframes. This assumption must be validated under various network topologies and attack scenarios to ensure the $L_\text{diff}$ parameter provides adequate protection.
339
355
340
-
> [!WARNING]
341
-
>
342
-
> TODO: Which assumptions in the CIP / on the protocol security need to be validated as early as possible?
343
-
>
344
-
> - Worst case diffusion of EBs given certain honest stake (certifying the EB) is realistic
345
-
> - The cardano network stack can realize freshest first delivery (sufficiently well)
346
-
> - A real ledger can (re-)process orders of magnitude higher loads as expected
347
-
> - ...?
356
+
- **The Cardano network stack can realize freshest-first delivery sufficiently well.** Prioritizing Praos traffic over recent Leios traffic, and recent over stale Leios traffic, is essential for mitigating protocol burst attacks. Real-world validation must demonstrate that the network layer can maintain this prioritization under load without significantly impacting Praos traffic.
357
+
358
+
- **A real ledger can process orders of magnitude higher transaction loads as expected.** Leios assumes that nodes can validate and apply large transaction sets within tight timing constraints. This requires empirical validation of transaction validation throughput, especially when combined with disk-based ledger storage and concurrent processing demands.
359
+
360
+
The [prototyping and adversarial testing](#prototyping-and-adversarial-testing) phase of the implementation plan is specifically designed to validate these assumptions through controlled experiments. Only with such validation can we confidently design and implement the components that realize Leios consensus.