This module implements a variation of the FastCDC algorithm using 31-bit integers and right shifts instead of left shifts.
The explanation below is copied from ronomon/deduplication since this module is little more than a translation of that implementation:
The following optimizations and variations on FastCDC are involved in the chunking algorithm:
- 31-bit integers to avoid 64-bit integers, for the sake of the JavaScript reference implementation.
- A right shift instead of a left shift to remove the need for an additional modulus operator, which would otherwise have been necessary to prevent overflow.
- Masks are no longer zero-padded since a right shift is used instead of a left shift.
- A more adaptive threshold based on a combination of average and minimum chunk size (rather than just average chunk size) to decide the pivot point at which to switch masks. A larger minimum chunk size now switches from the strict mask to the eager mask earlier.
- Masks use 1 bit of chunk size normalization instead of 2 bits of chunk size normalization.
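The variations listed above can be illustrated with a minimal sketch. This is not the module's actual code: the gear table is filled by a hypothetical xorshift generator, and the names `gear_table`, `mask`, `center_size`, and `cut_point` are chosen for illustration only. The sketch assumes an average chunk size that is roughly a power of two, at least 4 and well below 2^30.

```rust
/// Hypothetical gear table: 256 pseudo-random values, each below 2^31.
/// (Illustrative only; a real implementation would ship a fixed table.)
fn gear_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    let mut state: u32 = 0x9E37_79B9;
    for entry in table.iter_mut() {
        // One xorshift32 step.
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        // Drop one bit so every entry fits in 31 bits.
        *entry = state >> 1;
    }
    table
}

/// Mask with the lowest `bits` bits set. With a right-shifting hash the
/// mask needs no zero padding.
fn mask(bits: u32) -> u32 {
    (1u32 << bits) - 1
}

/// Pivot position at which the strict mask gives way to the eager mask.
/// A larger minimum moves the pivot earlier.
fn center_size(average: usize, minimum: usize, source_size: usize) -> usize {
    let offset = (minimum + (minimum + 1) / 2).min(average);
    (average - offset).min(source_size)
}

/// Returns the length of the first chunk in `source`.
fn cut_point(source: &[u8], minimum: usize, average: usize, maximum: usize) -> usize {
    if source.len() <= minimum {
        return source.len();
    }
    let table = gear_table();
    let bits = (average as f64).log2().round() as u32;
    // One bit of normalization in each direction.
    let mask_strict = mask(bits + 1);
    let mask_eager = mask(bits - 1);
    let end = source.len().min(maximum);
    let center = center_size(average, minimum, end);
    let mut hash: u32 = 0;
    for index in minimum..end {
        // The halved hash is below 2^31 and each table entry is below 2^31,
        // so the sum always fits in a u32 without a modulus.
        hash = (hash >> 1) + table[source[index] as usize];
        let active = if index < center { mask_strict } else { mask_eager };
        if hash & active == 0 {
            return index + 1;
        }
    }
    end
}
```

Calling `cut_point(&data, 1024, 4096, 16384)` on a buffer longer than the minimum yields a length between 1025 and 16384 bytes. Raising the minimum from 1024 to 2048 with the same average moves the pivot in this sketch from 2560 down to 1024, so the eager mask takes over sooner.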
Structs§
- Represents a chunk, returned from the FastCDC iterator.
- The FastCDC chunker implementation by Joran Dirk Greef.
Constants§
- Largest acceptable value for the average chunk size.
- Smallest acceptable value for the average chunk size.
- Largest acceptable value for the maximum chunk size.
- Smallest acceptable value for the maximum chunk size.
- Largest acceptable value for the minimum chunk size.
- Smallest acceptable value for the minimum chunk size.