squashfs-tools-ng.git - A new set of tools and libraries for working with SquashFS images

Age	Commit message	Author
2022-03-10	Windows: redirect standard I/O and convert text to UTF-16	David Oberhollenzer
	Preprocessor magic is used to redirect putc/fputc/fputs/printf/fprintf to custom implementations. The custom implementations try to figure out if we are printing to the console and, if so, convert the resulting strings to UTF-16 and print them through WriteConsoleW. If the output is redirected to a file or a pipe, the original (presumed) UTF-8 is kept. Simply setting the console output codepage to UTF-8 does not work, because the standard I/O facilities of MSVCRT either do not support Unicode (in non-wchar mode), or have half-broken support through fputs, which can still break up multi-byte sequences through its internal buffering. Likewise, changing the codepage and using WriteConsoleA, or trying to use fputws, did not work in a test VM either. This approach is the one that worked most consistently among the ones tried, but it also has problems. E.g. it breaks when setting the codepage to UTF-8 manually (using `chcp 65001`). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-03-09	Fix: Windows: libfstream: allocation size of stdout stream struct	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-01-29	Fix: libfstream: don't fail on Windows when reading from a pipe	David Oberhollenzer
	When piping the output of another program into tar2sqfs.exe, and the source program terminates, tar2sqfs.exe gets an ERROR_BROKEN_PIPE when the end is reached and it tries to pre-cache more data. This commit adds a workaround to properly handle this as an end-of-file condition. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-12-14	Fix Windows main wrapper after mingw upgrade	David Oberhollenzer
	Apparently, mingw included stdlib.h indirectly from either windows.h or shellapi.h. After an upgrade, the Windows build now fails with EXIT_FAILURE being undefined. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-12-05	Fix: consistently use the widechar file API on Windows	David Oberhollenzer
	When opening files on Windows, use the widechar versions and convert from (assumed) UTF-8 to UTF-16 as needed. Since the broken, code-page-random API may actually be intended in some use cases, leave that option in through an additional flag. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-12-05	Add a wrapper for the main function on Windows	David Oberhollenzer
	A macro and forward declaration are added to compat.h that rename the main() function of programs using compat.h to sqfs_tools_main. An actual main() function is added to libcompat.a that uses the shell API to get the UTF-16 command line arguments, converts them to UTF-8 and calls sqfs_tools_main. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-11-24	Fix: libcommon: Correctly restore prefix path in mkdir_p on Windows	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-11-24	Fix: libfstream: Correctly handle FlushFileBuffers return status	David Oberhollenzer
	The Windows port uses FlushFileBuffers in libfstream for the implementation of the file flush method. Unlike other winapi functions, this function returns a boolean and not an error code. Previously, the error code path was executed on success, printing a rather confusing error message claiming that the file already exists. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-08-22	Tighten bounds checks in sqfs_dir_reader_reader	David Oberhollenzer
	Use the same size check as sqfs_dir_reader_open_dir and report EOF, even if it is possible to read the header itself, but nothing beyond that. Also check if it should be possible to read an entry header before attempting and report EOF if not. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-08-22	Fix half done initialization of sqfs_dir_reader_open_dir	David Oberhollenzer
	The sqfs_dir_reader_open_dir function tried to take a short-cut by returning early if the target directory is empty. However, this left some fields unchanged from the previous directory. If iterating over a directory and then deciding to enter a sub-directory that happens to be empty, the directory reader will keep the settings for the current directory. After calling sqfs_dir_reader_rewind, the sub-directory will suddenly report the contents of the parent. A similar check is added to the rewind function to not track back on the meta data reader in that case. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-07-21	Fix libsquashfs directory writer size accounting	David Oberhollenzer
	The squashfs readdir() implementation in the Linux kernel returns non-existing "." and ".." entries for offsets 0 and 1, and after that reads from disk. For convenience, it was decided to store an off-by-3 value on disk instead of doing complex primary school math to adjust for this. This didn't show up until now, because the kernel implementation trusts the value from the directory header more than the actual size in the inode and happily reads 3 more than the inode would allow it to. This only showed up with 7-zip, which subtracts 3 from the size, expects the result to be exact and bails if the directory headers suggest otherwise. And yes, I did consider making a "Holy Hand Grenade of Antioch" reference, but consciously decided not to. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
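	The size accounting described above can be sketched as follows. This is a minimal illustration, not the libsquashfs directory writer: the helper names and the DIR_SIZE_BIAS constant are hypothetical, and only the "stored size is the listing size plus 3" relationship comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical constant: the directory inode stores the byte size of the
 * directory listing plus 3, as described in the commit message. */
#define DIR_SIZE_BIAS 3u

/* Convert the real listing size into the value stored in the inode.
 * (Overflow handling is left out of this sketch.) */
uint32_t dir_size_to_disk(uint32_t listing_bytes)
{
	return listing_bytes + DIR_SIZE_BIAS;
}

/* Convert the stored value back. A strict reader, like the 7-zip behaviour
 * described above, expects the result to match the listing exactly. */
int dir_size_from_disk(uint32_t stored, uint32_t *listing_bytes)
{
	if (stored < DIR_SIZE_BIAS)
		return -1; /* corrupted: smaller than an empty directory */

	*listing_bytes = stored - DIR_SIZE_BIAS;
	return 0;
}
```

	Under these assumptions an empty directory is stored with a size of 3, and any stored value below 3 is rejected.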
2021-07-09	Fix printf format specifiers used for generating tarballs	David Oberhollenzer
	When processing files > 4G, using "%o" truncates the result and the tarball is not readable. This should have been discovered when auto-patching the printf format specifiers, but a cast was added instead and the issue was overlooked. This commit replaces the down-cast and printf format specifiers. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
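	To illustrate the truncation, here is a small sketch of formatting a tar-style octal size field. The function names are hypothetical and this is not the project's tar writer; it assumes a 32 bit unsigned int and the classic 11-digit octal size field.

```c
#include <stdint.h>
#include <stdio.h>

/* Buggy variant: the down-cast drops the upper 32 bits before "%o" ever
 * sees the value (assuming unsigned int is 32 bits wide). */
int format_size_truncated(char field[12], uint64_t size)
{
	int n = snprintf(field, 12, "%011o", (unsigned int)size);

	return (n == 11) ? 0 : -1;
}

/* Fixed variant: format the full 64 bit value. Sizes that need more than
 * 11 octal digits do not fit the field and are reported as an error. */
int format_size_full(char field[12], uint64_t size)
{
	int n = snprintf(field, 12, "%011llo", (unsigned long long)size);

	return (n == 11) ? 0 : -1;
}
```

	With a 5 GiB file, the truncated variant silently writes the octal representation of 1 GiB, while the fixed variant writes the correct value.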
2021-06-25	libfstream: sanity check the buffer size in the gzip stream compressor	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25	Add default cases for every switch block	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25	Remove casual un-const casting in various places	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25	libutil: cleanup alignment trickery in mempool	David Oberhollenzer
	- Store the return value of the page allocation directly into the pool variable instead of an intermediate unsigned char pointer. - Make the blob[] array the same type as the bitmap; this saves us manual alignment trickery. - Clean up the pointer arithmetic, let the compiler do the sizeof() multiplication. - Use uintptr_t for the manual alignment of the data pointer, so we don't run into signedness problems there. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
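	The last point, aligning a data pointer through uintptr_t, might look roughly like this. It is a sketch only: align_ptr_up is a hypothetical helper, not a function from libutil, and it assumes the alignment is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Round a pointer up to the next multiple of `alignment`, which is assumed
 * to be a power of two. All arithmetic is done on the unsigned uintptr_t
 * type, so there are no signedness surprises. */
void *align_ptr_up(void *ptr, size_t alignment)
{
	uintptr_t addr = (uintptr_t)ptr;
	uintptr_t mask = (uintptr_t)alignment - 1;

	return (void *)((addr + mask) & ~mask);
}
```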
2021-06-25	libsquashfs: get rid of potentially unaligned access and VLAs	David Oberhollenzer
	The same problem with the meta data header again, 16 bit read from a buffer: copy the buffer data into a 16 bit variable instead of casting to something potentially unaligned. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25	libcommon: remove potentially un-aligned access in LZO compressor	David Oberhollenzer
	When accessing the 16 bit header, don't cast the buffer pointer to a uint16_t pointer; the result might not be aligned properly. Instead, memcpy to and from a uint16_t. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
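	A minimal sketch of the memcpy approach, with hypothetical helper names. It reads and writes the value in host byte order; any endianness conversion the real code performs is left out here.

```c
#include <stdint.h>
#include <string.h>

/* Read a 16 bit value from an arbitrary (possibly odd) buffer offset.
 * memcpy has no alignment requirement, unlike *(uint16_t *)buf. */
uint16_t read_u16(const unsigned char *buf)
{
	uint16_t value;

	memcpy(&value, buf, sizeof(value));
	return value;
}

/* Store a 16 bit value at an arbitrary buffer offset. */
void write_u16(unsigned char *buf, uint16_t value)
{
	memcpy(buf, &value, sizeof(value));
}
```

	Compilers typically turn these small fixed-size memcpy calls into plain loads and stores on platforms where unaligned access is safe.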
2021-06-25	libfstree: guard against possible overflow in readlink()	David Oberhollenzer
	In theory, say on a 32 bit system, we could have a 32 bit size_t and a 64 bit off_t. If the filesystem permitted this, we could then have a symlink with a target > 4G. Or the target is exactly 4G, but adding a null-terminator could exceed addressable memory. This commit adds a check to guard against such an overflow and throw an error, instead of silently wrapping around. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
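	Such a guard could look like the following sketch. The function name is hypothetical and the 64 bit length stands in for an off_t-sized value; the actual check in libfstree may be structured differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute the buffer size needed for a symlink target plus its
 * null-terminator. Fails instead of wrapping around if the result
 * cannot be represented in a size_t. */
int symlink_buffer_size(uint64_t target_len, size_t *out)
{
	if (target_len >= (uint64_t)SIZE_MAX)
		return -1; /* target_len + 1 would not fit */

	*out = (size_t)target_len + 1;
	return 0;
}
```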
2021-06-25	libfstree: guard against link count and inode number overflow	David Oberhollenzer
	If the hard link counter or the inode number counter overflow the maximum representable value (for SquashFS 16 bit and 32 bit respectively), abort with an error message. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
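	One way to express such a check is a small helper that refuses to count past a limit. This is only a sketch with a hypothetical name; the limit is passed in by the caller (for example 0xFFFF for a 16 bit field), and the real code reports an error message rather than just returning -1.

```c
#include <stdint.h>

/* Increment a counter unless that would exceed the largest value the
 * on-disk field can hold. Returns 0 on success, -1 on overflow. */
int counter_increment(uint64_t *counter, uint64_t max_value)
{
	if (*counter >= max_value)
		return -1;

	*counter += 1;
	return 0;
}
```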
2021-06-25	libfstream: guard against potential integer overflows	David Oberhollenzer
	The different compressor libraries use different integer types to tally the buffer sizes. The libfstream library uses size_t, which may be bigger than the actual types, potentially causing an overflow if trying to compress too much at once. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
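	A common way to guard against this is to clamp each chunk to what the compressor's counter type can hold and feed the rest in a later round. The sketch below assumes a compressor API that counts in unsigned int (zlib's avail_in field is one such case); the helper name is hypothetical and this is not the libfstream code.

```c
#include <limits.h>
#include <stddef.h>

/* Limit a size_t byte count to what fits into an unsigned int. The caller
 * is expected to loop and submit the remaining bytes afterwards. */
unsigned int clamp_to_uint(size_t size)
{
	if (size > (size_t)UINT_MAX)
		return UINT_MAX;

	return (unsigned int)size;
}
```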
2021-06-07	libsquashfs: fix: also preserve alignment flag in block processor	David Oberhollenzer
	Currently, when the block processor aggregates fragments into a fragment block, it applies the "don't compress" flag if any of the original fragments has it set, but the "align to device block" flag is lost. This commit ensures that both flags get applied to the fragment block if set. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-07	libsquashfs: fix block alignment if requested	David Oberhollenzer
	1) If the block alignment flag is set, the padding bytes must be inserted _before_ recording the start position, otherwise the resulting image is not readable. 2) Also perform alignment if the flag is set on a fragment block. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
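	The ordering issue in point 1 can be sketched like this. The function names are hypothetical and the file position is modelled as a plain integer instead of actual file I/O.

```c
#include <stdint.h>

/* Number of padding bytes needed to reach the next multiple of the
 * device block size. */
uint64_t padding_for(uint64_t pos, uint64_t devblksz)
{
	uint64_t rem = pos % devblksz;

	return rem ? devblksz - rem : 0;
}

/* Advance the file position past the padding FIRST, then return the
 * position to record as the start of the block. Recording the position
 * before padding would point into the padding bytes. */
uint64_t aligned_block_start(uint64_t *file_pos, uint64_t devblksz)
{
	*file_pos += padding_for(*file_pos, devblksz);
	return *file_pos;
}
```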
2021-06-04	Fix: allow concatenated Bzip2 streams	David Oberhollenzer
	This is a followup to dd4e6ead142e58568aec89d76b0b2e867ee983f2. Basically the same problem occurs with Bzip2, but so far it wasn't possible to find a sample that reproduces it. Unlike libxz, the libbz2 API does not support concatenated streams by itself and will choke when trying to decompress after the stream end, so this commit adds a workaround to simply initialize the decompressor on-the-fly and tear it down again when an end-of-stream is returned. The end-of-file condition is only set when there actually is no more data to read. Otherwise, the decompressor will be re-initialized in the next round. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-05-06	Fix: allow concatenated xz streams	David Oberhollenzer
	Some xz compressed tarballs (e.g. from kernel.org) are not made up of a single xz stream, but rather contain several, independently compressed streams. In that case, the xz decompressor hits an LZMA_STREAM_END early on and reports EOF. If you are lucky, the tar reader bails (premature end-of-file). If you are unlucky, it happens exactly between two records and is interpreted as regular end-of-file. As this seems to be a normal use case for xz, it has a flag to just read across the seams and only report end-of-stream if the action is set to finish. This commit adds the flag to the initialization and properly sets the lzma_action depending on whether the underlying stream hit EOF or not. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-04-08	Fix: libsquashfs: add sqfs_free() function	David Oberhollenzer
	On systems like Windows, the dynamic library and applications can easily end up being linked against different runtime libraries, so applications cannot be expected to be able to free() any malloc'd pointer that the library returns. This commit adds an sqfs_free function so the application can pass pointers back to the library to call the correct free() implementation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
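	The pattern can be sketched with a made-up library prefix. Neither function below is the libsquashfs API; the point is only that the allocation and the matching free() live in the same library and therefore use the same C runtime.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical library function that hands a malloc'd buffer to the
 * application. */
char *mylib_dup_string(const char *src)
{
	size_t len = strlen(src) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, src, len);

	return copy;
}

/* Hypothetical counterpart in the spirit of sqfs_free(): the application
 * passes the pointer back, and the library calls the free() that matches
 * its own malloc(). */
void mylib_free(void *ptr)
{
	free(ptr);
}
```

	An application would then call mylib_free() on anything the library returned instead of calling free() directly.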
2021-03-30	Fix: don't throw an error if fsync() returns EINVAL	David Oberhollenzer
	This indicates that sync isn't possible on the underlying file descriptor (e.g. a pipe), which currently causes sqfs2tar to err if the output isn't written directly to a file. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
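	A sketch of the tolerant sync, assuming a POSIX system. The function name is hypothetical and the real code in the project may report other errors differently.

```c
#include <errno.h>
#include <unistd.h>

/* Flush a file descriptor to stable storage. EINVAL means the descriptor
 * refers to something that cannot be synced (e.g. a pipe), which is not
 * treated as a failure here. */
int sync_fd(int fd)
{
	if (fsync(fd) == 0)
		return 0;

	if (errno == EINVAL)
		return 0;

	return -1;
}
```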
2021-03-30	libsqfs: block processor: Fix account for manually submitted blocks	David Oberhollenzer
	This was already in the original block processor but got dropped by accident when restructuring it. The problem manifests itself when manually submitting fragment blocks. They no longer get correct I/O queue tickets, clog up the queue and the processor eventually throws an internal error. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-25	libfstree: allow the glob path to be empty	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-25	libfstree: add an assertion that root is not NULL	David Oberhollenzer
	If the path argument is "", we assume that refers to root and set the existing target node to the root node and skip ahead across the tree search. This leaves "name" uninitialized, which makes coverity panic, because fs->root could be NULL, going down the wrong path. Obviously, this should never, ever happen and there is no reasonable recovery strategy if it suddenly does, so simply add an assertion. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-25	Fix fail branch in block processor fragment backend	David Oberhollenzer
	Only clean up the fragment if it hasn't been re-assigned to the fragment block. The NULL check is definitely wrong, because we no longer re-assign it as NULL. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-25	libfstree: Allow / as argument for "glob" and "dir" commands	David Oberhollenzer
	This allows putting globbed files & directories into the filesystem root, as well as explicitly setting attributes of the root directory from the file listing. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-24	Provide Musl derived fallbacks for getopt/getopt_long/getsubopt	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-24	Port the pool allocator to Windows	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-24	Fix block processor queue accounting	David Oberhollenzer
	Dequeuing won't work if we have a backlog of 1 or 2 and the blocks are used for internal buffering. Take that into account, similar to the sync code. Also bump the minimum backlog to 3, just to make absolutely sure we cannot run into a dequeue loop trying to allocate a block. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-24	libfstree: implement directory scanning code for Windows	David Oberhollenzer
	It's rather simplistic and doesn't account for junction/reparse points, which is the closest thing Windows has to symlinks, hard links and mount points, but it's consistent with the unpacking code that assumes Windows only has files and directories. Using the 32 bit mingw toolchain, this seems to satisfy the unit tests on wine. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-23	Fix windows build of the thread pool in libsquashfs	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-23	block processor: Re-implement exact fragment matching	David Oberhollenzer
	In the hash-table equals callback, if the hash and size match, do an exact, byte-for-byte comparison of the fragment in question. The fragment can either be in a fragment block that is in-flight (for which we have the in-flight list), in the current, unfinished fragment block, or it can be on disk. In the latter case, the fragment block is resolved through the fragment table and read back from disk into a scratch buffer and decompressed. After that, the fragment is checked for byte-for-byte equality with the one we resolved through the hash table. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
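	The comparison logic of such an equals callback might be sketched as follows. The structure and function names are hypothetical, and the sketch assumes the candidate fragment's bytes are already available in memory; locating them in an in-flight block, the current block or on disk is the part the commit actually implements and is not shown here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key type: checksum, size and a pointer to the raw bytes. */
struct frag_key {
	uint32_t hash;
	size_t size;
	const unsigned char *data;
};

/* Two fragments are only considered equal if hash and size match AND the
 * contents are identical byte for byte. The cheap checks run first so the
 * memcmp is only paid for likely matches. */
bool frag_equals(const struct frag_key *a, const struct frag_key *b)
{
	if (a->hash != b->hash || a->size != b->size)
		return false;

	return memcmp(a->data, b->data, a->size) == 0;
}
```

	The byte-for-byte step is what protects against hash collisions: two different fragments with the same checksum and size are still told apart.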
2021-03-23	block processor: keep duplicate copies of in-flight fragment blocks	David Oberhollenzer
	If we want full, byte-for-byte verification of fragments during de-duplication, we need to check back with the blocks already written to disk, or with the ones that are in flight. The previous, extremely hacky approach simply locked up the thread pool and investigated the queues. For the new approach, we treat the thread pool as completely opaque and don't try to touch it. This commit modifies the block processor to keep duplicate copies of each submitted fragment block around, which are cleaned up once the block is dequeued and written to disk. So instead of touching the thread pool, we can simply investigate the in-flight block list and the current block, before resorting to reading back fragment blocks from the file. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22	Threadpool: preemptively dequeue items after enqueuing	David Oberhollenzer
	When we already hold the mutex, try to preemptively dequeue items into a "safe queue". When actually asked to dequeue, take blocks from there first and avoid having to enter the critical section if possible. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22	block processor: simplify backlog accounting	David Oberhollenzer
	Simply count the number of blocks we hand out (malloc'ed or recycled) and decrease the counter when we put blocks back for recycling. The sync() part becomes a little more complicated, because we can get stuck with a backlog of 1 or 2 because we have a fragment or current block buffer in use. We also need to account for this when creating the processor, because we need to be able to request at least 2 blocks without stalling. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22	Cleanup the block processor file structure	David Oberhollenzer
	A cleaner separation between common code, frontend code and backend code is made. The "is this byte blob zero" function is moved out to libutil (with test case and everything) with a more optimized implementation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
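	For reference, the simplest form of an "is this byte blob zero" check is shown below. The function name is hypothetical, and this is a deliberately naive byte-wise loop rather than the more optimized implementation the commit refers to.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if every byte in the blob is zero. An empty blob counts
 * as zero. */
bool is_blob_zero(const void *blob, size_t size)
{
	const unsigned char *p = blob;

	while (size--) {
		if (*p++ != 0)
			return false;
	}

	return true;
}
```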
2021-03-21	Fix missing error code initialization	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21	Rename thread pool serial implementation data structure	David Oberhollenzer
	Hoping that coverity can now tell the two apart. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21	Cleanup: Rewrite block processor to use the libutil thread_pool_t	David Oberhollenzer
	Throw out the messy thread pool implementation and temporarily also remove the exact fragment matching for simplicity. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21	Add a thread pool implementation to libutil	David Oberhollenzer
	The thread pool enforces ordering of items during dequeue similar to the already existing implementation in libsqfs. The idea is to eventually pull this functionality out of the block processor and turn it into a cleaner, separately tested module. The thread pool is implemented as an abstract interface, so we can have multiple implementations around, including the serial fallback implementation which we can then always test, regardless of the compile config, and run through static analysis as well. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21	Force 64 bit alignment of blocks managed by the pool allocator	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-20	Fix: libcompat: add missing stdio includes	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-20	Fix: add missing include path to libfstream if using builtin zlib	David Oberhollenzer
	Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-20	Add libcompat fallback implementation for fnmatch	David Oberhollenzer
	This has basically been copied over from Musl and slightly modified. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>