aboutsummaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)Author
2021-07-09Fix printf format specifiers used for generating tarballsDavid Oberhollenzer
When processing files > 4G, using "%o" truncates the result and the tarball is not readable. This should have been discovered when auto-patching the printf format specifiers, but a cast was added instead and the issue was overlooked. This commit replaces the down-cast and printf format specifiers. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25libfstream: sanity check the buffer size in the gzip stream compressorDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25Add default cases for every switch blockDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25Remove casual un-const casting in various placesDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25libutil: cleanup alignment trickery in mempoolDavid Oberhollenzer
- Store the return value of the page allocation directly into the pool variable instead of an intermediate unsigned char pointer. - Make the blob[] array the same type as the bitmap, this saves us manual alignment trickery. - Cleanup the pointer arithmetic, let the compiler do the sizeof() multiplication. - Use uintptr_t for the manual alignment of the data pointer, so we don't run into signdness problems there. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25libsquashfs: get rid of potentially unaligned access and VLAsDavid Oberhollenzer
The same problem with the meta data header again, 16 bit read from a buffer: copy the buffer data into a 16 bit variable instead of casting to something potentially unaligned. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25libcommon: remove potentially un-aligned access in LZO compressorDavid Oberhollenzer
When accessing the 16 bit header, don't cast the buffer pointer to an uint16_t pointer, the result might not be aligned propperly. Instead memcpy to and from an uint16_t. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25libfstree: guard against possible overflow in readlink()David Oberhollenzer
*in theory*, say on a 32 bit system, we could have a 32 bit size_t and a 64 bit off_t. If the filesystem permitted this, we *could* then have a symlink with a target > 4G. Or the target is exacetely 4G, but adding a null-terminator could exceed addressable memory. This commit adds a check to guard against such an overflow and throw an error, instead of silently wrapping around. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25libfstree: guard against link count and inode number overflowDavid Oberhollenzer
If the hard link counter or the inode number counter overflow the maximum representable value (for SquashFS 16 bit and 32 bit respecitively), abort with an error message. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25libfstream: guard against potential integer overflowsDavid Oberhollenzer
The differen compressor libraries use differnt integer types to tally the buffer sizes. The libfstream library uses size_t, which may be bigger than the actualy types, potentially causing an overflow if trying to compress to much at once. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-07libsquashfs: fix: also preserve alignment flag in block processorDavid Oberhollenzer
Currently, when the block processor aggreagtes fragments into a fragment block, it applies the "don't compress" flag if any of the original framgnets has it set, but the "align to device block" flag is lost. This commit ensures that both flags get applied to the fragment block if set. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-07libsquashfs: fix block alignment if requestedDavid Oberhollenzer
1) If the block alignment flag is set, the padding bytes must be inserted _before_ recording the start position, otherwise the resulting image is not readable. 2) Also perform alignment if the flag is set on a fragment block. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-04Fix: allow concatenated Bzip2 streamsDavid Oberhollenzer
This is a followup to dd4e6ead142e58568aec89d76b0b2e867ee983f2. Basically the same problem occours with Bzip2, but it so far it wasn't possible to find a sampel that reproduces it. Unlike libxz, the libbz2 API does not support concatenated streams by itself and will choke when trying to decompress after the stream end, so this commit adds a workaround to simply initialize the decompressor on-the-fly and tear it down again when and end-of-stream is returned. The end-of-file condition is only set when there actually is no more data to read. Otherwise, the decompressor will be re-initialized in the next round. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-05-06Fix: allow concatenated xz streamsDavid Oberhollenzer
Some xz compressed tarballs (e.g. from kernel.org) are not made up of a single xz stream, but rather contain several, independendly compressed streams. In that case, the xz decompressor hits an LZMA_STREAM_END early on and reports EOF. If you are lucky, the tar reader bails (premature end-of-file). If you are unlucky, it happens exactely between two records and is interpeted as regular end-of-file. As this seems to be a normal use case for xz, it has a flag to just read across the seams and only report end-of-stream if the action is set to finish. This commit adds the flag to the initialization propperly sets the lzma_action depending on whether the underlying stream hit EOF or not. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-04-08Fix: libsquashfs: add sqfs_free() functionDavid Oberhollenzer
On systems like Windows, the dynamic library and applications can easily end up being linked against different runtime libraries, so applications cannot be expected to be able to free() any malloc'd pointer that the library returns. This commit adds an sqfs_free function so the application can pass pointers back to the library to call the correct free() implementation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-30Fix: don't throw an error if fsync() returns EINVALDavid Oberhollenzer
This indicates that sync isn't possible on the underlying file descriptor (e.g. a pipe), which currently causes sqfs2tar to err if the output isn't written directly to a file. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-30libsqfs: block processor: Fix account for manually submitted blocksDavid Oberhollenzer
This was already in the original block processor but got dropped by accident when restructuring it. The problem manifests itself when manually submitting fragment blocks. They no longer get correct I/O queue tickets, clog up the queue and the processor eventually throws an internal error. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-25libfstree: allow the glob path to be emptyDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-25libfstree: add an assertion that root is not NULLDavid Oberhollenzer
If the path argument is "", we assume that referes to root and set the *existing* target node to the root node and skip ahead across the tree search. This leaves "name" uninitialized, which makes coverity panic, because fs->root could be NULL, going down the wrong path. Obviously, this should never, *ever* happen and there is no reasonable recovery strategy if it suddenly does, so simply add an assertion. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-25Fix fail branch in block processor fragment backendDavid Oberhollenzer
Only clean up the fragment if it hasn't been re-assigned to the fragment block. The NULL check is definitely wrong, because we no longer re-assign it as NULL. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-25libfstree: Allow / as argument for "glob" and "dir" commandsDavid Oberhollenzer
This allows putting globbed files & directories into the filesystem root, as well as explicitly setting attributes of the root directory from the file lisiting. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-24Provide Musl derived fallbacks for getopt/getopt_long/getsuboptDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-24Port the pool allocator to WindowsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-24Fix block processor queue accountingDavid Oberhollenzer
Dequeuing won't work if we have a backlog of 1 or 2 and the blocks are used for internal buffering. Take that into account, similar to the sync code. Also bump the minimum backlog to 3, just to make absolutely sure we cannot run into a dequeue loop trying to allocate a block. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-24libfstree: implement directory scanning code for WindowsDavid Oberhollenzer
It's rather simplistic and doesn't account for junction/reparse points, which is the closest thing Windows has to symlinks, hard links and mount points, but it's consistent with the unpacking code that assumes Windows only has files and directories. Using the 32 bit mingw toolchain, this seems to satisfy the unit tests on wine. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-23Fix windows build of the thread pool in libsquashfsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-23block processor: Re-implement exact fragment matchingDavid Oberhollenzer
In the hash-table equals callback, if the hash and size match, do an exact, byte-for-byte comparison of the fragment in question. The fragment can either be in a fragment block that is in-flight (for which we have the in-flight list), in the current, unfinished fragment block, or it can be on disk. In the later case, the fragment block is resolved through the fragment table and read back from disk into a scratch buffer and decompressed. After that, the fragment is checked for byte-for-byte equality with the one we resolved through the hash table. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-23block processor: keep duplicate copies of in-flight fragment blocksDavid Oberhollenzer
If we want full, byte-for byte, verification of fragments during de-duplication we need to check back with the blocks already written to disk, or with the ones that are in flight. The previous, extremely hacky approach simply locked up the thread pool and investigated the queues. For the new approach, we treat the thread pool as completely opaque and don't try to touch it. This commit modifies the block processor to keep duplicate copies of each submitted fragment block around, that are cleaned up once the block is dequeued and written to disk. So instead of touching the thread pool, we can simply investigate the in-fligth-block list and the current block, before resorting to reading back fragment blocks from the file. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22Threadpool: pre-emtively dequeue items after enqueingDavid Oberhollenzer
When we already hold the mutex, try to pre-emtively dequeue items into a "safe queue". When actually asked to dequeue, take blocks from there first and avoid having to enter the critical section if possible. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22block processor: simplify backlog accountingDavid Oberhollenzer
Simply count the number of blocks we hand out (malloc'ed or recycled) and decrease the counter when we put blocks back for recycling. The sync() part becomes a little more complicated, because we can get stuck with a backlog of 1 or 2 because we have a fragment or current block buffer in use. We also need to accout for this when creating the processor, because we need to be able to request at least 2 blocks without stalling. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22Cleanup the block processor file structureDavid Oberhollenzer
A cleaner separation between common code, frontend code and backend code is made. The "is this byte blob zero" function is moved out to libutil (with test case and everything) with a more optimized implementation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21Fix missing error code initializationDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21Rename thread pool serial implementation data structureDavid Oberhollenzer
Hopeing that coverity can now tell the two appart. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21Cleanup: Rewrite block processor to use the libutil thread_pool_tDavid Oberhollenzer
Throw out the messy thread pool implementation and temporarily also remove the exact fragment matching for simplicity. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21Add a thread pool implementation to libutilDavid Oberhollenzer
The thread pool enforces ordering of items during dequeue similar to the already existing implementation in libsqfs. The idea is to eventually pull this functionality out of the block processor and turn it into a cleaner, separately tested module. The thread pool is implemented as an abstract interface, so we can have multiple implementations around, including the serial fallback implementation which we can then *always* test, irregardless of the compile config and run through static analysis as well. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21Force 64 bit alignment of blocks managed by the pool allocatorDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-20Fix: libcompat: add missing stdio includesDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-20Fix: add missing include path to libfstream if using builtin zlibDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-20Add libcompat fallback implementation for fnmatchDavid Oberhollenzer
This has basically been copied over from Musl and slightly modifed. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07Optionally use a pool allocator for rb-tree nodesDavid Oberhollenzer
This commit restructures the rbtree code to optionally use a pool allocator for the nodes. The option is made depenend on the presence of a pre-processor flag. To the configure script is added an option to enable/disable the use of custom allocators. It makes sense to still allow the malloc/free based routes for better ASAN based instrumentation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07Implement a custom memory pool allocatorDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07Rewrite the str_table to internally use the more opimized hash_tableDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Fix: meta reader behaviour if accessing block at location 0David Oberhollenzer
Technically, this should *never* **ever** happen, because a SquashFS file always starts with a super block, which isn't wrapped in a meta data block, so a valid SquashFS file will never have a reason to read from offset 0. However, this does bite us when doing unit tests where the meta reader and writer are used on an otherwise empty file. When trying to read from offset 0, the caching code assumes that we already have that block, since tha block_offset got initialized to 0. This commit changes the initialization to set the current block location to the maximum 64 bit integer, a location we are never going to read from, since it will always be after the limit. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Cleanup: replace ad-hoc dynamic array in sqfs_xattr_writer_tDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Cleanup: repalce ad-hoc dynamic array used for export tableDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Cleanup: replace ad-hoc dynamic array in sqfs_id_table_tDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Cleanup: replace ad-hoc dynamic array in sqfs_frag_table_tDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Add a generic implementation of a dynamic array to libutilDavid Oberhollenzer
The intention is to get rid of all the ad-hoc array implementations in the other components and cut down code size. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Store xattr writer block description in a red-black treeDavid Oberhollenzer
By storing the blocks in a tree, the de-duplication can lookup existing blocks in logartihmic instead of linear time. The linked list is still maintained, because we need to iterate over the blocks in creation order during serialization. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Add a context pointer to the rbtree key comparisonDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>