squashfs-tools-ng.git - A new set of tools and libraries for working with SquashFS images

Age	Commit message	Author
2019-09-04	Split fstree inode serialization, move independent part to libsquashfs.so	David Oberhollenzer
	This commit adds a function to libsquashfs.so for writing generic inodes to a meta writer and another function to libsqfshelper.a that turns a tree node into an inode. That way, the tree serialization code can be expressed in terms of those functions and the bulk of the independent code can be moved over to libsquashfs.so. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-04	Add fstree independent directory writer to libsquashfs.so	David Oberhollenzer
	This commit adds a directory writer to libsquashfs that wraps a meta data writer and provides a higher-level interface for writing directory entries. Under the hood it enforces the rules that squashfs insists upon. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-02	Move fstree independent parts of xattr_reader to libsquashfs.so	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-01	Internalize the layout of the id_table_t structure	David Oberhollenzer
	As an opaque struct it has a chance to change its layout in the future without breaking ABI compatibility. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-01	Move command line processing stuff out of compressor code	David Oberhollenzer
	This commit moves stuff like printing help text, command line option processing and enumerating available compressors on stdout out of the generic compressor code. The option string is replaced with a structure that directly exposes the tweakable parameters for all compressors. A function for parsing the command line arguments into this structure is added in sqfshelper. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-01	API cleanup: Shuffle declarations around	David Oberhollenzer
	Move declarations for stuff that is defined in libsquashfs.so into the public headers and declarations for stuff that isn't, out of there. Also move the meta reader/writer helper functions to their respective headers. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-01	Break up squashfs.h into topic related headers	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-01	Install libsquashfs.so headers on the system in "sqfs" subdirectory	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-31	Split libsquashfs.a into low separate libraries	David Oberhollenzer
	The idea is to make libsquashfs.a independent of libfstree.a, so it becomes a general purpose squashfs manipulation library. All the high level glue code for libfstree.a and utilities that are overly specific to the tools are moved to a separate library. This commit makes the first step by moving the stuff with dependencies on libfstree to a separate library. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-31	Merge libcompress.a into libsquashfs.a	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-25	Properly set errno in read_inode_slink error path	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-23	Check against format limits in meta_reader_read_dir_header	David Oberhollenzer
	The SquashFS kernel implementation insists that a directory header is followed by no more than an upper bound of entries, far fewer than what the field itself actually supports. This commit makes sure that the meta_reader_read_dir_header function also enforces that same limit. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
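The check described above might look roughly like the following sketch. The commit message does not state the bound; the value 256 (the kernel's SQUASHFS_DIR_COUNT, to the best of my knowledge) and the names DIR_MAX_ENTRIES and check_dir_header_count are assumptions made for illustration. The on-disk field stores the entry count minus one.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: upper bound enforced by the Linux kernel implementation. */
#define DIR_MAX_ENTRIES 256U

/*
 * raw_count is the 32-bit value read from the directory header, which
 * holds "number of entries - 1". Comparing the raw value avoids the
 * overflow that raw_count + 1 would cause for 0xFFFFFFFF.
 */
static int check_dir_header_count(uint32_t raw_count)
{
	if (raw_count >= DIR_MAX_ENTRIES) {
		errno = EINVAL;
		return -1;
	}
	return 0;
}
```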
2019-08-23	Size accounting + alloc() overflow checking, round #2	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-23	Do bounds checking in metadata reader	David Oberhollenzer
	In all cases where metadata blocks are read, we can roughly (in some cases even precisely) say in what range those metadata blocks will be, so it makes sense to throw an error if an attempt is made to wander outside this range. Furthermore, when reading from an uncompressed block, it is more reasonable to check against the actual block bounds than to pad it with 0 bytes. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
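A minimal sketch of both checks, assuming a simplified reader that holds a single unpacked block. The struct layout and the names meta_seek_check and meta_read are hypothetical, not the library's API. A real reader would also continue into the next block when a read crosses a block boundary, which is left out here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader state, not the library's actual structure. */
struct meta_reader {
	uint64_t start;           /* first valid on-disk block position */
	uint64_t limit;           /* one past the last valid position */
	unsigned char data[8192]; /* current unpacked metadata block */
	size_t data_used;         /* number of valid bytes in data */
	size_t offset;            /* read position inside data */
};

/* Refuse to seek to a block outside the range the caller declared. */
static int meta_seek_check(const struct meta_reader *m, uint64_t block_start)
{
	if (block_start < m->start || block_start >= m->limit) {
		errno = ERANGE;
		return -1;
	}
	return 0;
}

/* Read from the current block, checking against the bytes actually
 * present instead of relying on zero padding. */
static int meta_read(struct meta_reader *m, void *out, size_t size)
{
	if (m->offset > m->data_used || size > m->data_used - m->offset) {
		errno = ERANGE;
		return -1;
	}
	memcpy(out, m->data + m->offset, size);
	m->offset += size;
	return 0;
}
```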
2019-08-23	Some simple search/replace cases for allocation	David Oberhollenzer
	This commit exchanges some malloc(x + y * z) patterns that can be found with a simple git grep and are obvious candidates for the new wrappers. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
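For illustration, an overflow-checked replacement for a malloc(x + y * z) call could be sketched as follows. The name checked_alloc_flex and the choice of EOVERFLOW are assumptions; the wrappers actually used in the repository may differ.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: allocate base + item_size * count zeroed bytes,
 * failing with EOVERFLOW instead of silently wrapping around.
 */
static void *checked_alloc_flex(size_t base, size_t item_size, size_t count)
{
	size_t array_size;

	/* item_size * count overflows iff count > SIZE_MAX / item_size */
	if (item_size != 0 && count > SIZE_MAX / item_size) {
		errno = EOVERFLOW;
		return NULL;
	}
	array_size = item_size * count;

	/* base + array_size overflows iff base > SIZE_MAX - array_size */
	if (base > SIZE_MAX - array_size) {
		errno = EOVERFLOW;
		return NULL;
	}

	return calloc(1, base + array_size);
}
```

A call site such as malloc(sizeof(*hdr) + n * sizeof(hdr->items[0])) would then become checked_alloc_flex(sizeof(*hdr), sizeof(hdr->items[0]), n), with hdr being a placeholder for a structure that ends in a flexible array.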
2019-08-22	deserialize_tree: filter out directory loops	David Oberhollenzer
	The tree deserializer does a recursive depth-first search to populate the directory tree, moving back and forth between the directory listing containing the inode references and the inode table pointing to the list of child inodes. It is completely unaware of hard links and creates duplicate nodes instead. It is possible to create a malicious SquashFS image that contains a directory that has a reference to its own inode as a child. This can also be done transitively (i.e. a directory contains its own parent or grandparent), leading to infinite recursion (actually finite, since it terminates once all stack memory is exhausted). This commit adds a simple check to see if a node has the same inode number as any of its would-be parents. If it does, the node is discarded and a warning message is emitted. Other cases with arbitrary layers of indirection could be constructed as well (e.g. dir 'a' contains a hard link to 'b' and 'b' one back to 'a'), but since the sub-hierarchies are always expanded, this check should catch those too. Reported-by: Zachary Dremann <dremann@gmail.com> Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
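The ancestor check described above can be sketched like this. The node structure is a stripped-down stand-in and would_create_loop is a hypothetical name; the real tree node carries many more fields.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal tree node for illustration only. */
struct node {
	struct node *parent;
	uint32_t inode_num;
};

/*
 * Return true if inode_num already occurs in `parent` or in any of its
 * ancestors, i.e. adding it as a child would create a directory loop.
 */
static bool would_create_loop(const struct node *parent, uint32_t inode_num)
{
	for (; parent != NULL; parent = parent->parent) {
		if (parent->inode_num == inode_num)
			return true;
	}
	return false;
}
```

Because hard-linked directories are expanded into duplicate nodes, walking the parent chain is enough: any cycle eventually shows up as a node whose inode number repeats on the path to the root.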
2019-08-21	Fix "no attributes" sentinel value for xattr reader	David Oberhollenzer
	An inode can be of extended type for reasons other than having extended attributes and simply set the xattr ID to 0xFFFFFFFF to indicate that it doesn't have extended attributes. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
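Expressed as code, the corrected test might look like the sketch below. NO_XATTR_ID and inode_has_xattrs are hypothetical names; only the value 0xFFFFFFFF comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the sentinel; the value is from the commit message. */
#define NO_XATTR_ID 0xFFFFFFFFU

/*
 * Being an extended inode is necessary but not sufficient: the xattr ID
 * must also differ from the "no attributes" sentinel.
 */
static bool inode_has_xattrs(bool is_extended_type, uint32_t xattr_id)
{
	return is_extended_type && xattr_id != NO_XATTR_ID;
}
```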
2019-08-19	Fix memory leak in data writer fragment deduplication	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-19	Fix memory leak in data writer error code paths	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-18	Replace update_crc32 helper function with crc32 from zlib	David Oberhollenzer
	It is optimized to the maximum and if we already use zlib anyway, why not use zlib crc32? This also makes zlib a hard dependency which also means the whole "do we have a compressor" sanity check in the build system can be removed. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-18	Make data writer use block processor	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-18	Restructure data writer around passing block_t structures	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-18	Minor interface change to data writer	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-18	cleanup: internalize deduplication list in data_writer	David Oberhollenzer
	This change removes the need for passing a list of files around for deduplication. Also the deduplication code no longer needs to worry about order, since the file being deduplicated is only added after deduplication is done. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-16	Fix: don't try to read xattrs if there are none	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Add pread(2) like function to data_reader	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Fix forward seek when unpacking sparse files	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-07	Fix zero padding of extracted data blocks	David Oberhollenzer
	Only pad it if the extracted size is less than the block size. Doing it with the compressed size results in garbled blocks, especially because most of them are smaller than the block size when compressed. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
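A small sketch of the corrected padding logic, assuming a buffer of block_size bytes that the decompressor has filled with `unpacked` bytes. The function name pad_block is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Zero-fill the tail of a block buffer after extraction. The size that
 * matters is the number of bytes actually extracted into buf, not the
 * size of the compressed data read from disk.
 */
static void pad_block(unsigned char *buf, size_t unpacked, size_t block_size)
{
	if (unpacked < block_size)
		memset(buf + unpacked, 0, block_size - unpacked);
}
```

Passing the compressed size instead would overwrite valid extracted bytes with zeros whenever the data compresses well, which matches the garbled output the commit describes.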
2019-08-05	cleanup data reader	David Oberhollenzer
	- Split block reading code out from "dump_blocks" into precache_data_block, similar to precache_fragment_block
	- Merge the code paths for fragment/data block reading and uncompression
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-05	cleanup: unify all the code that reads squashfs images	David Oberhollenzer
	This commit creates a new data structure called 'sqfs_reader_t' that takes care of all the repetitive tasks like opening the file, reading the super block, creating the compressor, deserializing an fstree and creating a data reader. This in turn makes it possible to remove all the duplicate code from rdsquashfs and sqfs2tar. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-02	Fix explicit NULL dereference in deserialize_fstree failure path	David Oberhollenzer
	If we failed to create the root node, we don't need to clean up the fstree_t, which would attempt to recursively clean up the root node. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-02	cleanup: merge error paths in xattr reader restore_kv_pairs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-02	Fix potential double free of xattr reader id_block_starts	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Add option to restore xattrs to deserialize_fstree	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Add xattr reader implementation to recover xattrs from squashfs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Fix xattr writer size accounting	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Fix super block flags: clear "no xattr" flag when writing xattrs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-01	Fix xattr OOL position	David Oberhollenzer
	We need to get the position _before_ writing the header, otherwise the reader has no way to know the length of the value. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-30	Add proper copyright headers to all source files	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-29	Fix order of data block deduplication	David Oberhollenzer
	Data blocks need to be deduplicated before attempting to write a fragment. In the current attempt if the data blocks are found to be duplicates but the fragment isn't, the flushed fragments are purged as well, possibly damaging other files. Also, when the deduplication happens, the HAS_FRAGMENT flag needs to be set, otherwise the deduplication code thinks that there is one more block than there actually is. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-29	Cleanup: move deduplication code from data writer to fstree	David Oberhollenzer
	Since it is actually completely independent of libsqfs and only works on file_info_t lists, it can be safely moved over to libfstree and the data writer becomes less cluttered as a result. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Fix duplicate file accounting	David Oberhollenzer
	A file is a complete duplicate if:
	- It has no blocks, only a single fragment and that is a duplicate
	- It has blocks but no fragment and the blocks are duplicate
	- It has blocks and a fragment and both are duplicate
	The previous version only counted the last one. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
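The three cases above collapse into one predicate. The sketch below uses a hypothetical file_state structure rather than the real file_info_t, and treats a file with neither blocks nor a fragment as not being a duplicate, which is an assumption of this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file bookkeeping, not the project's file_info_t. */
struct file_state {
	size_t block_count;  /* number of full data blocks */
	bool has_fragment;   /* file ends in a tail-end fragment */
	bool blocks_dup;     /* data blocks were found to be duplicates */
	bool fragment_dup;   /* fragment was found to be a duplicate */
};

/* True if everything the file stores on disk is a duplicate. */
static bool is_complete_duplicate(const struct file_state *f)
{
	bool has_blocks = f->block_count > 0;

	if (!has_blocks && !f->has_fragment)
		return false; /* empty file: nothing to deduplicate */
	if (has_blocks && !f->blocks_dup)
		return false;
	if (f->has_fragment && !f->fragment_dup)
		return false;
	return true;
}
```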
2019-07-28	Fix used bytes accounting when deduplicating file blocks	David Oberhollenzer
	If an entire file is eliminated, we need to reset the "used_bytes" counter, otherwise, ALL the table positions are way off. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Fix free() of stack pointer in id_table_read error path	David Oberhollenzer
	We didn't allocate the ID table, so we don't need to free() it when reading from disk fails. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Fix: return the correct value from data_reader_create	David Oberhollenzer
	Cut & paste mishap after merging with fragment reader: If there are no fragments, data_reader_create should return the data reader, not 0! Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Add some nice statistics output to tar2sqfs and gensquashfs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Add general purpose flags field to file_info_t	David Oberhollenzer
	Simplifies some tasks if we can just add a flag that a file has a fragment or that it has already been detected as a duplicate. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28	Implement data block deduplication	David Oberhollenzer
	The strategy is as follows:
	- At the beginning of every file, remember the current position
	- Once a file is done scan the list of existing files for the following:
	  - Look for an existing file that has a block with the same size and checksum as the first non-sparse block of the current file
	  - After that, every block in the current file has to match in size and checksum the ones in the file that we found, from that point onward
	  - sparse blocks in either file are skipped
	- If we found a match, we update the current file to point to the first matching block and rewind the squashfs image to remove the newly written data
	This strategy should in theory be able to find an existing file where the on-disk data contains the on-disk data of the current file. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
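The matching step of this strategy can be sketched as below. This is a simplified illustration, not the repository's code: blocks are reduced to a size and a checksum, a size of 0 stands for a sparse block, and the names blk and find_match are hypothetical. Updating the file's start position and rewinding the image are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block descriptor; size 0 marks a sparse block. */
struct blk {
	uint32_t size;
	uint32_t checksum;
};

static size_t skip_sparse(const struct blk *b, size_t n, size_t i)
{
	while (i < n && b[i].size == 0)
		++i;
	return i;
}

static bool same(const struct blk *a, const struct blk *b)
{
	return a->size == b->size && a->checksum == b->checksum;
}

/*
 * Search the blocks of an existing file (`old`) for a run that matches
 * every non-sparse block of the current file (`cur`). Returns the index
 * in `old` of the first matching block, or -1 if there is no match.
 */
static long find_match(const struct blk *old, size_t old_n,
		       const struct blk *cur, size_t cur_n)
{
	size_t first = skip_sparse(cur, cur_n, 0);
	size_t start, i, j;

	if (first == cur_n)
		return -1; /* only sparse blocks, nothing on disk to match */

	for (start = 0; start < old_n; ++start) {
		if (!same(&old[start], &cur[first]))
			continue;

		i = start;
		j = first;
		for (;;) {
			i = skip_sparse(old, old_n, i);
			j = skip_sparse(cur, cur_n, j);
			if (j == cur_n)
				return (long)start; /* all of cur matched */
			if (i == old_n || !same(&old[i], &cur[j]))
				break;
			++i;
			++j;
		}
	}
	return -1;
}
```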
2019-07-28	Implement fragment deduplication in data writer	David Oberhollenzer
	The strategy is simple:
	- The data writer functions that write data/fragment blocks get access to the list of files.
	- When writing a fragment, we look for an already written file that has a fragment with the same size and checksum.
	- If we find one, we throw away the fragment and reuse the existing one.
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
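The lookup in the second step could be sketched as a linear scan. The structure frag_info and the function find_fragment are hypothetical stand-ins for the per-file data the writer keeps, and a flat array is used in place of the file list for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of a fragment that was already written. */
struct frag_info {
	uint32_t size;        /* size of the tail end in bytes */
	uint32_t checksum;    /* checksum over the tail-end data */
	uint32_t frag_index;  /* fragment block it was stored in */
	uint32_t frag_offset; /* byte offset inside that block */
};

/*
 * Return the first already written fragment with the same size and
 * checksum, or NULL if the new fragment has to be stored.
 */
static const struct frag_info *find_fragment(const struct frag_info *list,
					     size_t count, uint32_t size,
					     uint32_t checksum)
{
	size_t i;

	for (i = 0; i < count; ++i) {
		if (list[i].size == size && list[i].checksum == checksum)
			return &list[i];
	}
	return NULL;
}
```

On a hit, the new file would simply copy frag_index and frag_offset from the returned entry instead of appending its own data to the current fragment block.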
2019-07-28	Unify common file start/end code from data writer in helper functions	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>