path: root/lib/tar/cleanup.c
author: David Oberhollenzer <david.oberhollenzer@sigma-star.at> 2019-07-27 00:19:13 +0200
committer: David Oberhollenzer <david.oberhollenzer@sigma-star.at> 2019-07-28 16:33:57 +0200
commit: 256c2458a4fa298c876d8e4a4450cb9a0834b877
tree: b8e619b55d0bd497010effce5a475b960d5bb845 /lib/tar/cleanup.c
parent: cce36f459ddb5698fd1a40061c466996482146eb
Implement data block deduplication
The strategy is as follows:
 - At the beginning of every file, remember the current position
 - Once a file is done scan the list of existing files for the following:
   - Look for an existing file that has a block with the same size and
     checksum as the first non-sparse block of the current file
   - After that, every block in the current file has to match in size and
     checksum the ones in the file that we found, from that point onward
   - sparse blocks in either file are skipped
 - If we found a match, we update the current file to point to the first
   matching block and rewind the squashfs image to remove the newly
   written data

This strategy should in theory be able to find an existing file where the
on-disk data *contains* the on-disk data of the current file.

Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
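
The matching strategy described above can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the types `blk_t` and `file_rec_t` and the functions `find_in_file` and `deduplicate` are hypothetical names, a sparse block is assumed to be recorded with a size of 0, and the offset of a block is assumed to be the file's start offset plus the sizes of the blocks stored before it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical records for illustration; not the project's real structures. */
typedef struct {
    uint32_t size;     /* on-disk size in bytes, 0 is assumed to mean sparse */
    uint32_t checksum; /* checksum of the on-disk data */
} blk_t;

typedef struct {
    uint64_t start;    /* image offset of the first stored block */
    size_t count;      /* number of entries in blocks[] */
    const blk_t *blocks;
} file_rec_t;

/* Index of the next non-sparse block at or after i, or f->count if none. */
static size_t next_data_block(const file_rec_t *f, size_t i)
{
    while (i < f->count && f->blocks[i].size == 0)
        ++i;
    return i;
}

static bool same_block(const blk_t *a, const blk_t *b)
{
    return a->size == b->size && a->checksum == b->checksum;
}

/*
 * Try to find the data blocks of cur inside old. On success, store the
 * image offset of the first matching block in *offset and return true.
 */
static bool find_in_file(const file_rec_t *cur, const file_rec_t *old,
                         uint64_t *offset)
{
    size_t first = next_data_block(cur, 0);
    uint64_t pos = old->start; /* image offset of old->blocks[j] */

    assert(offset != NULL);

    if (first == cur->count)
        return false; /* only sparse blocks, nothing to match */

    for (size_t j = 0; j < old->count; ++j) {
        if (old->blocks[j].size != 0) {
            size_t i = first, k = j;

            /* Walk both lists in lockstep, skipping sparse blocks. */
            while (i < cur->count && k < old->count &&
                   same_block(&cur->blocks[i], &old->blocks[k])) {
                i = next_data_block(cur, i + 1);
                k = next_data_block(old, k + 1);
            }

            if (i == cur->count) { /* every data block of cur matched */
                *offset = pos;
                return true;
            }
        }
        pos += old->blocks[j].size;
    }
    return false;
}

/*
 * Scan the files written so far. On a match, point cur at the existing
 * data and rewind the write position to where cur began.
 */
static bool deduplicate(const file_rec_t *files, size_t num_files,
                        file_rec_t *cur, uint64_t *write_pos)
{
    for (size_t n = 0; n < num_files; ++n) {
        uint64_t offset;

        if (find_in_file(cur, &files[n], &offset)) {
            *write_pos = cur->start; /* discard the newly written data */
            cur->start = offset;
            return true;
        }
    }
    return false;
}
```

In this sketch a caller would record `cur->start` before writing a file, fill in its block list while writing, and call `deduplicate` once the file is done. Because the comparison may begin at any block of an existing file, the current file can match the middle or tail of an older one, which is the "contains" case the message describes.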
Diffstat (limited to 'lib/tar/cleanup.c')
0 files changed, 0 insertions, 0 deletions