| author | David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 2019-07-27 00:19:13 +0200 | 
|---|---|---|
| committer | David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 2019-07-28 16:33:57 +0200 | 
| commit | 256c2458a4fa298c876d8e4a4450cb9a0834b877 (patch) | |
| tree | b8e619b55d0bd497010effce5a475b960d5bb845 /lib/fstree | |
| parent | cce36f459ddb5698fd1a40061c466996482146eb (diff) | |
Implement data block deduplication
The strategy is as follows:
 - At the beginning of every file, remember the current position
 - Once a file is done, scan the list of existing files for the following:
   - Look for an existing file that has a block with the same size and
     checksum as the first non-sparse block of the current file
   - After that, every remaining block in the current file has to match,
     in size and checksum, the blocks that follow in the file we found
   - Sparse blocks in either file are skipped
 - If we found a match, we update the current file to point to the first
   matching block and rewind the squashfs image to remove the newly written
   data
This strategy should in theory be able to find an existing file where the
on-disk data *contains* the on-disk data of the current file.
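
The strategy above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions, not the code from this commit: the
`Image` class, `add_file`, `find_match` and the `(size, checksum)` block tuple
are hypothetical names, CRC32 stands in for whatever checksum the real code
uses, an all-zero chunk is treated as sparse, and the search tries every
possible start position instead of first locating the first matching block.
As in the description, blocks are compared by size and checksum only.

```python
import zlib
from typing import List, Optional, Tuple

Block = Tuple[int, int]  # (on-disk size, checksum); size 0 marks a sparse block


def find_match(current: List[Block], existing: List[Block]) -> Optional[int]:
    """Return the index in `existing` of the block where the non-sparse
    blocks of `current` start matching, or None if they do not occur."""
    cur = [b for b in current if b[0] != 0]
    old = [(i, b) for i, b in enumerate(existing) if b[0] != 0]
    if not cur:
        return None  # an all-sparse file has no on-disk data to share
    for start in range(len(old) - len(cur) + 1):
        if [b for _, b in old[start:start + len(cur)]] == cur:
            return old[start][0]
    return None


class Image:
    """Toy stand-in for the output image: a byte buffer plus a file table."""

    def __init__(self) -> None:
        self.data = bytearray()
        self.files: List[Tuple[int, List[Block]]] = []  # (data offset, blocks)

    def add_file(self, chunks: List[bytes]) -> int:
        start = len(self.data)  # remember the position before writing
        blocks: List[Block] = []
        for chunk in chunks:
            if chunk.count(0) == len(chunk):
                blocks.append((0, 0))  # all zero bytes: sparse, nothing written
            else:
                self.data += chunk
                blocks.append((len(chunk), zlib.crc32(chunk)))
        offset = start
        for old_offset, old_blocks in self.files:
            idx = find_match(blocks, old_blocks)
            if idx is not None:
                # Point at the first matching block and drop the new data.
                offset = old_offset + sum(size for size, _ in old_blocks[:idx])
                del self.data[start:]  # rewind the image
                break
        self.files.append((offset, blocks))
        return offset
```

Adding a file whose non-sparse blocks already appear, in order, inside an
earlier file returns an offset into that earlier file and leaves the image
size unchanged. Any other file is kept where it was written.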
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
Diffstat (limited to 'lib/fstree')
0 files changed, 0 insertions, 0 deletions
