Commit ee10a08

Reword seed generation for clarity
1 parent 2cd1a64 commit ee10a08

1 file changed: +12 -10 lines changed
IPNI_MH_SAMPLING.md

Lines changed: 12 additions & 10 deletions
@@ -117,17 +117,19 @@ include:
 identical results, facilitating reproducibility.
 
 The sampling process employs the PCG (Permuted Congruential Generator) random number generator, using a seed derived
-from a beacon to ensure deterministic output. The 128-bit PCG seed is extracted from a 32-byte beacon through the
-following steps:
+from a beacon to ensure deterministic output. The 128-bit PCG seed is extracted from up to a 32-byte beacon through the
+following refined steps:
 
-- The 32-byte beacon is split into two 16-byte segments.
-- The least significant bytes of each 16-byte segment are used to form the high and low 64-bit seed values,
+- The 32-byte beacon is divided into two 16-byte halves.
+- From each 16-byte half, the least significant 16 bytes are used to form the high and low 64-bit seed values,
 respectively.
 
-If the provided beacon is smaller than 32 bytes, it is padded with zeros to reach the required seed size. The use of a
-32-byte beacon, even though half might be discarded, is driven by the convenience of aligning with existing distributed
-randomness beacons like DRAND, which often provide larger outputs, and allows for future extensibility by reserving
-additional space for potential enhancements or increased complexity in random seed generation.
+If the provided beacon is shorter than 32 bytes, it is first divided into two parts and padded with zeros as needed to
+ensure each part is 16 bytes long. Beacons with an odd number of bytes are padded to the closest even byte count before
+being split in half. The use of a 32-byte beacon, even though half might be discarded, is driven by the convenience of
+aligning with existing distributed randomness beacons like DRAND, which often provide larger outputs, and allows for
+future extensibility by reserving additional space for potential enhancements or increased complexity in random seed
+generation.
 
 To efficiently perform sampling, two pre-emptive data organizational strategies are proposed:
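As a reading aid for the hunk above, the reworded seed derivation can be sketched roughly as follows. This is only an illustration and not the project's implementation. The function name `derive_pcg_seed` is hypothetical, and several details are assumptions the text does not settle: bytes are read big-endian, zero padding is added on the left, and "least significant" is taken to mean the last 8 bytes of each 16-byte half. The added line says "16 bytes", but 8 bytes is what a 64-bit value requires, so 8 is assumed here.

```python
def derive_pcg_seed(beacon: bytes) -> tuple[int, int]:
    """Return (high, low) 64-bit halves of a 128-bit PCG seed.

    Assumptions (not confirmed by the source text): big-endian byte
    order, left zero-padding, and 8 least significant bytes per half.
    """
    if len(beacon) > 32:
        raise ValueError("beacon must not exceed 32 bytes")

    # Pad odd-length beacons to the closest even byte count
    # (assumption: a single leading zero byte).
    if len(beacon) % 2 == 1:
        beacon = b"\x00" + beacon

    # Split into two equal parts, then pad each part to 16 bytes.
    mid = len(beacon) // 2
    first = beacon[:mid].rjust(16, b"\x00")
    second = beacon[mid:].rjust(16, b"\x00")

    # Take the trailing 8 bytes of each half as a 64-bit integer.
    high = int.from_bytes(first[-8:], "big")
    low = int.from_bytes(second[-8:], "big")
    return high, low


example = bytes.fromhex(
    "3439d92d58e47d342131d446a3abe264396dd264717897af30525c98408c834f"
)
high, low = derive_pcg_seed(example)
print(f"high={high:#018x} low={low:#018x}")
```

Under these assumptions a full 32-byte beacon contributes only bytes 8 to 15 and 24 to 31 to the seed, which matches the remark that half of the beacon might be discarded.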

@@ -167,8 +169,8 @@ Samples a set of multihashes ingested by an IPNI indexer for a given provider ID
 
 - **Query Parameters**:
 - `beacon`: (string, optional) The hex encoded randomness beacon for deterministic sampling. Ensures repeatability
-of samples. Must not exceed 32 bytes.
-- Example: `3439d92d58e47d342131d446a3abe264396dd264717897af30525c98408c834f`
+of samples. _Must not exceed 32 bytes_.
+- Example: `3439d92d58e47d342131d446a3abe264396dd264717897af30525c98408c834f`
 - `max`: (integer, optional) The maximum number of multihashes to return. Defaults to one if unspecified. Must be
 greater than zero, with a maximum of 10.
 - `federation_epoch`: (optional) The IPNI federation epoch, currently only accepting zero, pending review of IPNI
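The constraints on `beacon` and `max` in the hunk above could be checked along these lines. This is a minimal sketch, not the indexer's actual handler. The helper name `parse_sample_params` is hypothetical, and treating a missing `beacon` as an empty byte string is an assumption, since the text only says the parameter is optional.

```python
def parse_sample_params(query: dict[str, str]) -> tuple[bytes, int]:
    """Validate the `beacon` and `max` query parameters.

    Returns (beacon_bytes, max_count). A missing beacon becomes b"",
    which is an assumption made for this sketch.
    """
    # `beacon` is hex encoded and must decode to at most 32 bytes.
    try:
        beacon = bytes.fromhex(query.get("beacon", ""))
    except ValueError as err:
        raise ValueError("beacon must be hex encoded") from err
    if len(beacon) > 32:
        raise ValueError("beacon must not exceed 32 bytes")

    # `max` defaults to one and must lie in the range 1..10.
    max_raw = query.get("max")
    max_count = 1 if max_raw is None else int(max_raw)
    if not 1 <= max_count <= 10:
        raise ValueError("max must be greater than zero and at most 10")

    return beacon, max_count


print(parse_sample_params({"beacon": "3439d92d", "max": "5"}))
```

Rejecting an oversized beacon up front, rather than truncating it, keeps the later sampling deterministic for every accepted input.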
