squashfs-tools-ng.git - A new set of tools and libraries for working with SquashFS images

Age	Commit message (Collapse)	Author
2019-08-09	Add option to sqfsdiff to extract regular files that are different	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Fix resource leak in error branch for fscompare compare_files	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Add documentation for sqfsdiff	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Add flag to sqfsdiff to compare the fields in the super block	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Add flag to sqfsdiff to compare inode numbers	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Add flag to difftool to also compare time stamps	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Add flag to sqfsdiff to skip comparing filesystem contents	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Extend filesystem diff utility to work on squashfs images	David Oberhollenzer
	Since it already works on an fstree_t instance (constructed from the input paths), and we now have a handy sqfs_reader_t, it is quite simple to extend. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Add a helper utility to compare filesystem trees	David Oberhollenzer
	The intended use case is to compare two mounted or unpacke squashfs images, so a repacked test image can be compared against its original or an image unpacked with unsquashfs can be compared with an image unpacked by rdsquashfs or sqfs2tar. Since the tool is only intended to aid development (specifically automated testing), it is not installed by `make install`. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Add pread(2) like function to data_reader	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Fix forward seek when unpacking sparse files	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Fix zero padding of extracted data blocks	David Oberhollenzer
	Only padd it if the extracted size is less then block size. Doing it with the compressed size results in garbled blocks. Especially because most of them are less than block size when compressed. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-05	cleanup data reader	David Oberhollenzer
	- Split block reading code out from "dump_blocks" into precache_data_block, similar to precache_fragment_block - Merge the code paths for fragment/data block reading and uncompression Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-05	cleanup: unify all the code that reads squashfs images	David Oberhollenzer
	This commit creates a new data structure called 'sqfs_reader_t' that takes care of all the repetetive tasks like opening the file, reading the super block, creating the compressor, deserializing an fstree and creating a data reader. This in turn makes it possible to remove all the duplicate code from rdsquashfs and sqfs2tar. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-04	Improve file unpacking order	David Oberhollenzer
	This commit moves the file unpacking order & job scheduling to a libfstree function. The ordering is improved by making sure fragment blocks are not extracted more than once and files with data blocks are extracted in order. This way, serial unpacking of a 2GiB Debian live image could be reduced from ~5' on my test machine to ~3.5', whereas parallel unpacking stays roughly the same (~3' for -j 4). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-04	Fix functions with side effect being used inside asserts	David Oberhollenzer
	If -DNDEBUG is set, the entire thing is omitted from the output. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-04	Make sure file listing generated by rdsquashfs -d is propperly escaped	David Oberhollenzer
	File names may contain traces of white space. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-03	Fix tar header error reporting on 32 bit systems	David Oberhollenzer
	If an extension header is rejected because its too big, the error path would print the size as size_t, altough it is an uint64_t. On 64 bit systems, this works because size_t is a 64 bit unsigned integer, on 32 bit systems, not so much. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-03	cleanup: remove left over atime/ctime code	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-02	Fix explicit NULL dereference in deserialize_fstree failure path	David Oberhollenzer
	If we failed to create the root node, we don't need to cleanup the fstree_t which would attempt to recursively cleanup the root node. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-02	cleanup: merge error paths in xattr reader restore_kv_pairs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-02	Fix potential double free of xattr reader id_block_starts	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-02	Implement support for SOURCE_DATE_EPOCH environment variable	David Oberhollenzer
	reproducible-builds.org suggests the use of an environment variable as a source for time stamps: https://reproducible-builds.org/specs/source-date-epoch/ This commit adds support for setting the default mtime from the variable, if it is set and only defaulting to 0 if not. The timestamp given by the command line switch takes precedence. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Add flag to rdsquashfs to optionally set xattrs on unpacked files	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Update man pages	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Add ability to sqfs2tar to optionally copy over xattrs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Add ability to write_tar_header to embedd extended attributes	David Oberhollenzer
	This commit patches the tar writer to generate a PAX header with SCHILY xattr key/value pairs if requested. The Schily format is used for two reasons: - It is simple - It is apparently more widely supported than the libarchive format Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Add option to rdsquashfs to dump extended attributes for an inode	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Add option to restore xattrs to deserialize_fstree	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Add xattr reader implementation to recover xattrs from squashfs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Fix xattr writer size accounting	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Fix super block flags: clear "no xattr" flag when writing xattrs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Fix xattr OOL position	David Oberhollenzer
	We need to get the position _before_ writing the header, otherwise the reader has no way to know the length of the value. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-31	Overhaul README and convert it to markdown	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-30	Update print_version text	David Oberhollenzer
	This commit updates the text issued by print_version() to reflect in some way that the software contains contributions from co-authors. The original text was based on the sterotypical --version output of GNU coreutils programs. It may have to be rewritten eventually. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-30	Add propper copyright headers to all source files	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-29	Fix order of data block deduplication	David Oberhollenzer
	Data blocks need to be deduplicated before attempting to write a fragment. In the current attempt if the data blocks are found to be duplicates but the fragment isn't, the flushed fragments are purged as well, possibly damaging other files. Also, when the deduplication happens, the HAS_FRAGMENT flag needs to be set, otherwise the deduplication code thinks that there is one more block than there actually is. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-29	Cleanup: move deduplication code from data writer to fstree	David Oberhollenzer
	Since it is actually completely independend of libsqfs and only works on file_info_t lists, it can be safely moved over to libfstree and the data writer becomes less cluttered as a result. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-29	Simplify fstree sorting	Zachary Dremann
	For merging, the use of a pointer to a pointer can simplify linked list operations For sorting, find the half-way point of the list in a single iteration over the list Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Fix missing initialization of file fragment fields	David Oberhollenzer
	Despite having a flag for that now, they still need to be initialized because they are written straight to disk. Fixes: d4d1854aaed867d28ebfc97afb3518254ab6fd4b Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Fix duplicate file accounting	David Oberhollenzer
	A file is a complete duplicate if: - It has no blocks, only a single fragment and that is a duplicate - It has blocks but no fragment and the blocks are duplicate - It has blocks and a fragment and both are duplicate The previous version only counted the last one. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Fix used bytes accounting when deduplicating file blocks	David Oberhollenzer
	If an entire file is eliminated, we need to reset the "used_bytes" counter, otherwise, ALL the table positions are way off. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Fix free() of stack pointer in id_table_read error path	David Oberhollenzer
	We didn't allocate the ID table, so we don't need to free() it when reading from disk fails. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Fix: return the correct value from data_reader_create	David Oberhollenzer
	Cut & paste misshap after mergining with fragment reader: If there are no fragments, data_reader_create should return the data reader, not 0! Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Add some nice statistics output to tar2sqfs and gensquashfs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Add general purpose flags field to file_info_t	David Oberhollenzer
	Simplifies some task if we can just add a flag that a file has a framgent or that it has already been detected as a duplicate. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Implement data block deduplication	David Oberhollenzer
	The strategy is as follows: - At the beginning of every file, remember the current position - Once a file is done scan the list of existing files for the following: - Look for an existing file that has a block with the same size and checksum as the first non-sparse block of the current file - After that, every block in the current file has to match in size and checksum the ones in the file that we found, from that point onward - sparse blocks in either file are skipped - If we found a match, we update the current file to point to the first matching block and rewind the squashfs image to remove the newly written data This strategy should in theory be able to find an existing file where the on-disk data contains the on-disk data of the current file. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Implement fragment deduplication in data writer	David Oberhollenzer
	The strategy is simple: - The data writer function that write data/fragment blocks get access to the list files. - When writing a fragment, we look for an already written file that has a fragment with the same size and checksum. - If we find one, we throw away the fragment and reuse the existing one. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Unify common file start/end code from data writer in helper functions	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Compute per-block and per-fragment checksums in data wrtier	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>