squashfs-tools-ng.git - A new set of tools and libraries for working with SquashFS images

Age	Commit message (Collapse)	Author
2022-09-20	block writer: move block comaprison to utility function	David Oberhollenzer
	Slightly modify the byte-for-byte comparison function to compare an arbitrary range in a file and move it to libutil. Instead of calling it for each block in the block writer, simply let it check an entire range in the block writer and compute the range position/size of the reference ahead, before looking for potential matches. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-09-20	block writer: remove open coded array	David Oberhollenzer
	Instead of open coding it, use the array_t type from libutil. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-07-08	Make sqfs_tree_node_get_path more robust	David Oberhollenzer
	Test against various invariants: - Every non-root node must have a name - The root node muts not have a name - The name must not be ".." or "." - The name must not contain '/' - The loop that chases parent pointers must terminate, i.e. we must never reach the starting state again (link loop). Furthermore, make sure the sum of all path components plus separators does not overflow. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-07-08	Move sqfs_tree_node_get_path to libsquashfs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-07-08	Cleanup: move libutil headers to sub directory	David Oberhollenzer
	Move all the libutil stuff from the toplevel include/ to a util/ sub directory and fix up the includes that make use of them. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-06-02	Cleanup: libsqfs: simplify state handling in dir reader	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-06-02	Cleanup: libsqfs: sqfs_dir_reader_find_by_path	David Oberhollenzer
	Split out several repated patterns into helper functions and move the rest of the code back into dir_reader.c Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-06-02	Cleanup: libsqfs: merge dir cache code back into dir_reader.c	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-06-02	Cleanup: libsqfs: move directory iteration out of the directory reader	David Oberhollenzer
	Add a simple directory state object to the meta data reader and use that to iterate directory entries. The code for reading the directory listing is movde to readdir.c Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-06-01	Fix: libsqfs: do not report out of bounds positions from meta reader	David Oberhollenzer
	When asking the meta data reader for its current position and we just read to the end of a block, report the start of the next block as the current location. Otherwise, trying to seek to the resulting position immediately after reporting throws an out-of-bounds error. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-04-10	Remove builtin copy of zlib	David Oberhollenzer
	On GNU/Linux, *BSD or MacOS we can simply use the system default library. The copy was primarily only there for the Windows build. The build script for Windows has now been adapted to download and compile a shared library from a tarball. This removes a huge chunk of code from the git tree as well as the release tarballs. Additionally it gets rid of iffy things like removing the Zlib copyright/version strings, so the libsquashfs DLL doesn't export it. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-04-09	Add support for '.' and '..' entries in sqfs_dir_reader_t	David Oberhollenzer
	Two flags are added to the dir reader API, one for the create function that the dir reader should report those entries and one to the open function to suppress that if it was enabled. To implement the feature, a mapping of visited directory inodes is maintained internally, that mapps inode numbers to inode references. When opening a directory, state is maintained to generate the fake entries for '.' and '..'. Since all the other functions are based on the open/read/rewind API, no alterations need to be made. The tree scan function is modified, to use the suppress flag, so it does not accidentally catch those entries. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-04-05	libsqfs: move dir reader code to sub directory, add internal header	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-03-30	sqfs_dir_tree_destroy/sqfs_destroy: allow NULL input	Luca Boccassi
	Many library destructor functions (like free()) allow a NULL pointer as input, and do nothing in that case. This allows easier cleanup patterns: initialize pointers to NULL and then always pass them to the destroyer functions, no need for verbose goto/if-else patterns. Signed-off-by: Luca Boccassi <luca.boccassi@microsoft.com>
2022-03-10	Fix: guard against potential overflow in file size calculation	David Oberhollenzer
	The block_count is a size_t, so on 32 bit platforms the multiplication might be truncated before the comparison with filesz. On 64 bit platforms, it could potentially also overflow the 64 bit bounds of the data type. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-12-05	Fix: consistently use the widechar file API on Windows	David Oberhollenzer
	When opening files on windows, use the widechar versions and convert from (assumed) UTF-8 to UTF-16 as needed. Since the broken, code-page-random API may acutall be intended in some use cases, leave that option in through an additional flag. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-08-22	Tighten bounds checks in sqfs_dir_reader_reader	David Oberhollenzer
	Use the same size check as sqfs_dir_reader_open_dir and report EOF, even if it is possible to read the header itself, but nothing beyond that. Also check if it should be possible to read an entry header before attempting and report EOF if not. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-08-22	Fix half done initialization of sqfs_dir_reader_open_dir	David Oberhollenzer
	The sqfs_dir_reader_open_dir function tried to take a short-cut by returning early if the target directory is empty. However, this left some field unchanged from the previous directory. If iterating over a directory and then deciding to enter a sub-directory that happens to be empty, the directory reader will keep the settings for the current directory. After calling sqfs_dir_reader_rewind, the sub-directory will suddenly report the contents of the parent. A similar check is added to the rewind function to not track back on the meta data reader in that case. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-07-21	Fix libsquashfs directory writer size accounting	David Oberhollenzer
	The squashfs readdir() implementation in the Linux kernel returns non-existing "." and ".." entries for offsets 0 and 1, and after that reads from disk. For convenience, it was decided to store an off-by-3 value on disk instead of doing complex primary school math to adjust for this. This didn't show up until now, because the kernel implementation trusts the value from the directory header more than the actual size in the inode and happily reads 3 more than the inode would allow it to. This only showed up with 7-zip which subtracts 3 from the size and expects the result to be exact and bails if the directory headers suggest otherwise. And yes, I did consider making a "Holy Hand Granade of Antioch" reference, but consciously decided not to. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25	Add default cases for every switch block	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25	Remove casual un-const casting in various places	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25	libsquashfs: get rid of potentially unaligned access and VLAs	David Oberhollenzer
	The same problem with the meta data header again, 16 bit read from a buffer: copy the buffer data into a 16 bit variable instead of casting to something potentially unaligned. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-07	libsquashfs: fix: also preserve alignment flag in block processor	David Oberhollenzer
	Currently, when the block processor aggreagtes fragments into a fragment block, it applies the "don't compress" flag if any of the original framgnets has it set, but the "align to device block" flag is lost. This commit ensures that both flags get applied to the fragment block if set. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-07	libsquashfs: fix block alignment if requested	David Oberhollenzer
	1) If the block alignment flag is set, the padding bytes must be inserted _before_ recording the start position, otherwise the resulting image is not readable. 2) Also perform alignment if the flag is set on a fragment block. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-04-08	Fix: libsquashfs: add sqfs_free() function	David Oberhollenzer
	On systems like Windows, the dynamic library and applications can easily end up being linked against different runtime libraries, so applications cannot be expected to be able to free() any malloc'd pointer that the library returns. This commit adds an sqfs_free function so the application can pass pointers back to the library to call the correct free() implementation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-30	libsqfs: block processor: Fix account for manually submitted blocks	David Oberhollenzer
	This was already in the original block processor but got dropped by accident when restructuring it. The problem manifests itself when manually submitting fragment blocks. They no longer get correct I/O queue tickets, clog up the queue and the processor eventually throws an internal error. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-25	Fix fail branch in block processor fragment backend	David Oberhollenzer
	Only clean up the fragment if it hasn't been re-assigned to the fragment block. The NULL check is definitely wrong, because we no longer re-assign it as NULL. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-24	Fix block processor queue accounting	David Oberhollenzer
	Dequeuing won't work if we have a backlog of 1 or 2 and the blocks are used for internal buffering. Take that into account, similar to the sync code. Also bump the minimum backlog to 3, just to make absolutely sure we cannot run into a dequeue loop trying to allocate a block. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-23	Fix windows build of the thread pool in libsquashfs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-23	block processor: Re-implement exact fragment matching	David Oberhollenzer
	In the hash-table equals callback, if the hash and size match, do an exact, byte-for-byte comparison of the fragment in question. The fragment can either be in a fragment block that is in-flight (for which we have the in-flight list), in the current, unfinished fragment block, or it can be on disk. In the later case, the fragment block is resolved through the fragment table and read back from disk into a scratch buffer and decompressed. After that, the fragment is checked for byte-for-byte equality with the one we resolved through the hash table. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-23	block processor: keep duplicate copies of in-flight fragment blocks	David Oberhollenzer
	If we want full, byte-for byte, verification of fragments during de-duplication we need to check back with the blocks already written to disk, or with the ones that are in flight. The previous, extremely hacky approach simply locked up the thread pool and investigated the queues. For the new approach, we treat the thread pool as completely opaque and don't try to touch it. This commit modifies the block processor to keep duplicate copies of each submitted fragment block around, that are cleaned up once the block is dequeued and written to disk. So instead of touching the thread pool, we can simply investigate the in-fligth-block list and the current block, before resorting to reading back fragment blocks from the file. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22	block processor: simplify backlog accounting	David Oberhollenzer
	Simply count the number of blocks we hand out (malloc'ed or recycled) and decrease the counter when we put blocks back for recycling. The sync() part becomes a little more complicated, because we can get stuck with a backlog of 1 or 2 because we have a fragment or current block buffer in use. We also need to accout for this when creating the processor, because we need to be able to request at least 2 blocks without stalling. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22	Cleanup the block processor file structure	David Oberhollenzer
	A cleaner separation between common code, frontend code and backend code is made. The "is this byte blob zero" function is moved out to libutil (with test case and everything) with a more optimized implementation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21	Fix missing error code initialization	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21	Cleanup: Rewrite block processor to use the libutil thread_pool_t	David Oberhollenzer
	Throw out the messy thread pool implementation and temporarily also remove the exact fragment matching for simplicity. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07	Optionally use a pool allocator for rb-tree nodes	David Oberhollenzer
	This commit restructures the rbtree code to optionally use a pool allocator for the nodes. The option is made depenend on the presence of a pre-processor flag. To the configure script is added an option to enable/disable the use of custom allocators. It makes sense to still allow the malloc/free based routes for better ASAN based instrumentation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07	Rewrite the str_table to internally use the more opimized hash_table	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06	Fix: meta reader behaviour if accessing block at location 0	David Oberhollenzer
	Technically, this should never ever happen, because a SquashFS file always starts with a super block, which isn't wrapped in a meta data block, so a valid SquashFS file will never have a reason to read from offset 0. However, this does bite us when doing unit tests where the meta reader and writer are used on an otherwise empty file. When trying to read from offset 0, the caching code assumes that we already have that block, since tha block_offset got initialized to 0. This commit changes the initialization to set the current block location to the maximum 64 bit integer, a location we are never going to read from, since it will always be after the limit. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06	Cleanup: replace ad-hoc dynamic array in sqfs_xattr_writer_t	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06	Cleanup: repalce ad-hoc dynamic array used for export table	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06	Cleanup: replace ad-hoc dynamic array in sqfs_id_table_t	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06	Cleanup: replace ad-hoc dynamic array in sqfs_frag_table_t	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06	Store xattr writer block description in a red-black tree	David Oberhollenzer
	By storing the blocks in a tree, the de-duplication can lookup existing blocks in logartihmic instead of linear time. The linked list is still maintained, because we need to iterate over the blocks in creation order during serialization. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-28	added shared read access when opening sqfs image with read-only flags (win32)	Thomas Lang

2021-02-10	Always use the correct data type for realloc return value	David Oberhollenzer
	This commit mainly serves the static analysis tooling. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-01-19	libsqfs: Implement exact matching of fragments	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-01-19	Add a user pointer to the hash table implementation	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-01-19	libsqfs: Add a sqfs_block_processor_create_ex function	David Oberhollenzer
	This function creates a block processor from a structure describing it. A stub implementation for the old sqfs_block_processor_create is added that simply sets up such a struct and forwards the call. The current version of the description struct only contains the exact same parameters and a size field at the beginning. This approach is supposed to make extending the range of parameters easier without breaking ABI compatibillity. Currently already planned are: - Adding a sqfs_file_t pointer to double-check when deduplicating fragments. - When the scanning code reaches a usable state, add the abillity to pass scanned fragment data, so the block processor can be used for appending to an existing image. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-01-19	libsqfs: block processor: removed unused chunk next pointer	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-01-19	Fix: Move fragment consolidation back to block processor serial part	David Oberhollenzer
	Keeping a list of fragments stored away in the current fragment block and consolidating them in the thread pool takes them out of circulation. If we have a lot of tiny fragments, this can lead to a situation where all the limit is reached, but we cannot do anything, because we are waiting for a block to complete, but they are all attached to the current fragment block and the queue is empty. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>