Age | Commit message | Author |
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This change removes the need for passing a list of files around for
deduplication. Also the deduplication code no longer needs to worry
about order, since the file being deduplicated is only added after
deduplication is done.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Only pad it if the *extracted* size is less than the block size. Doing it
with the compressed size results in garbled blocks, especially because
most of them are smaller than the block size when compressed.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
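A minimal sketch of the padding rule described above. The helper name
`pad_block` and its parameters are hypothetical and not taken from the actual
code; the point is only that the comparison uses the size after
decompression, never the on-disk size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: zero-fill the tail of an unpacked block.
 * `unpacked_size` is the size of the data after decompression. Using the
 * compressed size here would overwrite valid data with zeros.
 */
void pad_block(unsigned char *buf, size_t unpacked_size, size_t block_size)
{
	assert(unpacked_size <= block_size);

	if (unpacked_size < block_size)
		memset(buf + unpacked_size, 0, block_size - unpacked_size);
}
```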
|
|
- Split block reading code out from "dump_blocks" into precache_data_block,
similar to precache_fragment_block
- Merge the code paths for fragment/data block reading and uncompression
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit creates a new data structure called 'sqfs_reader_t' that
takes care of all the repetitive tasks like opening the file, reading
the super block, creating the compressor, deserializing an fstree and
creating a data reader.
This in turn makes it possible to remove all the duplicate code from
rdsquashfs and sqfs2tar.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If we failed to create the root node, we don't need to clean up the
fstree_t, which would attempt to recursively clean up the root node.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
We need to get the position _before_ writing the header, otherwise the
reader has no way to know the length of the value.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Data blocks need to be deduplicated before attempting to write a fragment.
In the current attempt if the data blocks are found to be duplicates but
the fragment isn't, the flushed fragments are purged as well, possibly
damaging other files.
Also, when the deduplication happens, the HAS_FRAGMENT flag needs to be
set, otherwise the deduplication code thinks that there is one more block
than there actually is.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Since it is actually completely independent of libsqfs and only works
on file_info_t lists, it can be safely moved over to libfstree and
the data writer becomes less cluttered as a result.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
A file is a complete duplicate if:
- It has no blocks, only a single fragment and that is a duplicate
- It has blocks but no fragment and the blocks are duplicate
- It has blocks and a fragment and both are duplicate
The previous version only counted the last one.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
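The three cases listed above can be sketched as a single predicate. The
struct `dedup_result_t` and its fields are hypothetical stand-ins for
illustration, not the real data structures. Treating a completely empty file
as "not a duplicate" is an assumption, since the message does not cover that
case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the deduplication results for one file. */
typedef struct {
	size_t block_count;	/* number of full data blocks */
	bool has_fragment;	/* file ends in a tail-end fragment */
	bool blocks_are_dupes;	/* all data blocks matched existing ones */
	bool fragment_is_dupe;	/* the fragment matched an existing one */
} dedup_result_t;

bool is_complete_duplicate(const dedup_result_t *r)
{
	bool blocks_ok, frag_ok;

	/* Assumption: a file without any data is not counted at all. */
	if (r->block_count == 0 && !r->has_fragment)
		return false;

	/* A part that does not exist cannot keep the file from matching. */
	blocks_ok = (r->block_count == 0) || r->blocks_are_dupes;
	frag_ok = !r->has_fragment || r->fragment_is_dupe;

	return blocks_ok && frag_ok;
}
```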
|
|
If an entire file is eliminated, we need to reset the "used_bytes" counter,
otherwise, ALL the table positions are way off.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
We didn't allocate the ID table, so we don't need to free() it when
reading from disk fails.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Cut & paste mishap after merging with the fragment reader: If there are
no fragments, data_reader_create should return the data reader, not 0!
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Simplifies some tasks if we can just set a flag indicating that a file has
a fragment or that it has already been detected as a duplicate.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The strategy is as follows:
- At the beginning of every file, remember the current position
- Once a file is done, scan the list of existing files for the following:
  - Look for an existing file that has a block with the same size and
    checksum as the first non-sparse block of the current file
  - After that, every block in the current file has to match in size and
    checksum the ones in the file that we found, from that point onward
  - Sparse blocks in either file are skipped
- If we found a match, we update the current file to point to the first
  matching block and rewind the squashfs image to remove the newly written
  data
This strategy should in theory be able to find an existing file where the
on-disk data *contains* the on-disk data of the current file.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
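The matching step of the strategy above can be sketched roughly as follows.
`blk_t`, `NO_MATCH` and `find_block_run` are hypothetical names, a size of 0
is assumed to mark a sparse block, and only a single existing file is
searched; a caller would repeat this for every file in the list. Returning
`NO_MATCH` for a file that consists of sparse blocks only is also an
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_MATCH ((size_t)-1)

/* Hypothetical block descriptor; size 0 is assumed to mean "sparse". */
typedef struct {
	uint32_t size;
	uint32_t checksum;
} blk_t;

/* Advance i past any sparse blocks, return the new index. */
size_t skip_sparse(const blk_t *b, size_t n, size_t i)
{
	while (i < n && b[i].size == 0)
		++i;
	return i;
}

/*
 * Look for a position in `old` from which every non-sparse block of `cur`
 * matches in size and checksum. Returns the index of the first matching
 * block in `old`, or NO_MATCH.
 */
size_t find_block_run(const blk_t *old, size_t old_n,
		      const blk_t *cur, size_t cur_n)
{
	size_t first = skip_sparse(cur, cur_n, 0);
	size_t start, i, j;

	if (first == cur_n)
		return NO_MATCH;	/* nothing on disk to compare */

	for (start = 0; start < old_n; ++start) {
		if (old[start].size == 0)
			continue;	/* a run cannot start at a sparse block */

		i = first;
		j = start;

		for (;;) {
			i = skip_sparse(cur, cur_n, i);
			if (i == cur_n)
				return start;	/* all blocks matched */

			j = skip_sparse(old, old_n, j);
			if (j == old_n)
				break;		/* existing file too short */

			if (old[j].size != cur[i].size ||
			    old[j].checksum != cur[i].checksum)
				break;
			++i;
			++j;
		}
	}
	return NO_MATCH;
}
```

On a match, the caller would point the current file at the returned block and
truncate the image back to the position remembered at the start of the file.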
|
|
The strategy is simple:
- The data writer functions that write data/fragment blocks get
access to the list of files.
- When writing a fragment, we look for an already written file that has
a fragment with the same size and checksum.
- If we find one, we throw away the fragment and reuse the existing one.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
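A rough sketch of the fragment lookup described above. `file_entry_t` and
`find_fragment_duplicate` are hypothetical, simplified stand-ins and do not
reflect the actual file list structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-file bookkeeping. */
typedef struct file_entry {
	struct file_entry *next;
	bool has_fragment;
	uint32_t frag_size;
	uint32_t frag_checksum;
} file_entry_t;

/*
 * Scan the already written files for a fragment with the same size and
 * checksum. Returns the matching file, or NULL if there is none.
 */
const file_entry_t *find_fragment_duplicate(const file_entry_t *list,
					    uint32_t size, uint32_t checksum)
{
	const file_entry_t *it;

	for (it = list; it != NULL; it = it->next) {
		if (!it->has_fragment)
			continue;
		if (it->frag_size == size && it->frag_checksum == checksum)
			return it;
	}
	return NULL;
}
```

If a match is returned, the new fragment would be discarded and the file
would reference the existing one instead.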
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
After the table read unification, there wasn't much left of the fragment
reader and the remains could easily be moved over to the data reader.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Make sure range is checked when reading a block and that the check is
made correctly. Also make the block log check a little more strict.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
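The message does not say what the checks look like, so the following is only
a sketch of what such checks could be. Both function names are hypothetical.
The range check is written so that the addition cannot wrap around, and the
block log check relies on squashfs block sizes being powers of two between
4 KiB and 1 MiB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if [offset, offset + size) lies entirely inside the image. */
bool block_in_range(uint64_t offset, uint64_t size, uint64_t image_size)
{
	/* Compare against the remaining space instead of adding,
	   so a huge size cannot overflow the sum. */
	return offset <= image_size && size <= image_size - offset;
}

/* True if block_log is in the valid range and agrees with block_size. */
bool block_log_valid(uint32_t block_size, uint16_t block_log)
{
	if (block_log < 12 || block_log > 20)
		return false;

	return block_size == ((uint32_t)1 << block_log);
}
```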
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit attempts to make the generic table writer more readable.
A few changes are made, including heap allocation of the block list.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit fixes a bug in the fragment table reader where the reader
tries to read data into an out of bounds location due to an oversight
in size calculation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
In most cases, we know exactly where the data that we want to read is
on disk, so instead of using read() on the squashfs (or lseek + read),
the code can in many places be cleaned up to use the pread wrapper
read_data_at instead.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
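For illustration, a pread based wrapper could look roughly like this. Only
the name `read_data_at` comes from the message above; the signature, the
return convention and the retry handling are assumptions.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly `size` bytes from `fd` at absolute position `offset`.
 * The file position of `fd` is left untouched.
 * Returns 0 on success, -1 on error or premature end of file.
 */
int read_data_at(int fd, uint64_t offset, void *dst, size_t size)
{
	unsigned char *ptr = dst;
	ssize_t ret;

	while (size > 0) {
		ret = pread(fd, ptr, size, (off_t)offset);

		if (ret < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}

		if (ret == 0)
			return -1;	/* EOF before everything was read */

		ptr += ret;
		offset += (uint64_t)ret;
		size -= (size_t)ret;
	}

	return 0;
}
```

Compared to lseek followed by read, this is a single call per chunk and does
not depend on the current file position.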
|
|
This commit removes all the code for parsing and processing atime/ctime
values and the related test code.
Caring about those is kind of pointless because squashfs can only store
mtime in inodes. The only relevant place is when generating a struct
stat from a squashfs inode or an fstree node.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Requires that config.h be included before other headers, since the macro
_FILE_OFFSET_BITS changes the definitions of things like 'struct stat'.
I chose to simply include it at the top of every C file and
immediately after the double-inclusion guards of every header.
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
- Bail early on empty directories without touching the meta readers.
- Abort the directory read loop if we can't even read a header anymore,
no matter if there are bytes remaining.
- Also add that same condition to the inner loop.
The latter two actually caused a numeric overflow on some particularly
malformed squashfs images, going into a RAM-filling infinite loop.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
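The loop conditions above can be sketched on a deliberately simplified
layout. The format used here (a 4 byte little endian entry count as header,
a 2 byte little endian name length per entry) is a hypothetical stand-in and
NOT the real squashfs directory format. What matters is that the remaining
size is checked before every subtraction, so the unsigned counter can never
wrap around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HDR_SIZE 4u	/* hypothetical header: 32 bit entry count */
#define ENT_SIZE 2u	/* hypothetical entry header: 16 bit name length */

/* Count the entries that can be read completely from `size` bytes. */
size_t count_entries(const unsigned char *buf, size_t size)
{
	size_t total = 0, name_len;
	uint32_t i, count;

	if (size == 0)
		return 0;	/* empty directory, bail early */

	/* Outer loop: stop as soon as a header no longer fits. */
	while (size >= HDR_SIZE) {
		count = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
			((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
		buf += HDR_SIZE;
		size -= HDR_SIZE;

		/* Inner loop: same condition for the entry header. */
		for (i = 0; i < count && size >= ENT_SIZE; ++i) {
			name_len = (size_t)buf[0] | ((size_t)buf[1] << 8);
			buf += ENT_SIZE;
			size -= ENT_SIZE;

			if (name_len > size)
				return total;	/* truncated name */

			buf += name_len;
			size -= name_len;
			++total;
		}
	}

	return total;
}
```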
|
|
First of all, this commit adds a mod_time field to a tree node. When
creating the tree node, the field is set from the struct stat. When
scanning a directory, the time stamps from the input are used if set.
Second, the libsqfs code that reads inodes is modified to store the
mod_time from the inode in the fstree node and to write the tree node
into a generated inode.
Finally, tar2sqfs is modified to optionally keep the timestamps from
the tar archive instead of setting defaults. gensquashfs is similarly
modified to keep the input timestamps if specified.
The result is as follows:
- sqfs2tar will always carry the timestamps from the squashfs over
to the tar ball.
- tar2sqfs will set defaults, unless explicitly asked to preserve
the mtime from the tar ball.
- gensquashfs can optionally preserve the mtime from the input
hierarchy it processes if only --pack-dir is specified.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
- Store them in a struct instead of a hacky uint64_t with magic shifts
- Split up key/value pair write function to write_key and write_value
- Move the size accounting into those functions respectively
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|