Age | Commit message | Author
2021-03-23 | block processor: Re-implement exact fragment matching | David Oberhollenzer
In the hash-table equals callback, if the hash and size match, do an exact, byte-for-byte comparison of the fragment in question. The fragment can either be in a fragment block that is in-flight (for which we have the in-flight list), in the current, unfinished fragment block, or it can be on disk. In the latter case, the fragment block is resolved through the fragment table, read back from disk into a scratch buffer and decompressed. After that, the fragment is checked for byte-for-byte equality with the one we resolved through the hash table. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
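A minimal sketch of the comparison order described above: cheap checks on hash and size first, then the exact byte-for-byte comparison. The `frag_ref_t` type and `frag_equals` function are hypothetical names for illustration, and the sketch assumes the fragment bytes have already been resolved (from the in-flight list, the current block, or disk).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fragment descriptor; field names are illustrative only. */
typedef struct {
	uint32_t hash;       /* checksum of the fragment data */
	uint32_t size;       /* fragment size in bytes */
	const uint8_t *data; /* already resolved fragment bytes */
} frag_ref_t;

/*
 * Hash-table style "equals" callback: compare hash and size first, then
 * do an exact byte-for-byte comparison to rule out hash collisions.
 */
int frag_equals(const frag_ref_t *a, const frag_ref_t *b)
{
	assert(a != NULL && b != NULL);

	if (a->hash != b->hash || a->size != b->size)
		return 0;

	return memcmp(a->data, b->data, a->size) == 0;
}
```

Two fragments with colliding hashes and equal sizes are only treated as duplicates if `memcmp` agrees.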
2021-03-23 | block processor: keep duplicate copies of in-flight fragment blocks | David Oberhollenzer
If we want full, byte-for-byte verification of fragments during de-duplication, we need to check back with the blocks already written to disk, or with the ones that are in flight. The previous, extremely hacky approach simply locked up the thread pool and investigated the queues. For the new approach, we treat the thread pool as completely opaque and don't try to touch it. This commit modifies the block processor to keep duplicate copies of each submitted fragment block around, which are cleaned up once the block is dequeued and written to disk. So instead of touching the thread pool, we can simply investigate the in-flight block list and the current block, before resorting to reading back fragment blocks from the file. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
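The bookkeeping described above could look roughly like the following sketch: a singly linked list of private copies, added on submit, searched during de-duplication, and dropped once the block has been written. All type and function names (`inflight_blk_t`, `inflight_add`, `inflight_find`, `inflight_remove`) are assumptions made for this example, not the actual libsquashfs code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical duplicate of a submitted fragment block. */
typedef struct inflight_blk {
	struct inflight_blk *next;
	uint32_t index; /* fragment block index */
	size_t size;
	uint8_t data[]; /* private copy of the uncompressed block */
} inflight_blk_t;

/* Store a private copy when the block is handed to the thread pool. */
int inflight_add(inflight_blk_t **list, uint32_t index,
		 const void *data, size_t size)
{
	inflight_blk_t *blk;

	assert(list != NULL && data != NULL);

	blk = malloc(sizeof(*blk) + size);
	if (blk == NULL)
		return -1;

	blk->index = index;
	blk->size = size;
	memcpy(blk->data, data, size);
	blk->next = *list;
	*list = blk;
	return 0;
}

/* Return the copy for a block index, or NULL if it must be read from disk. */
const inflight_blk_t *inflight_find(const inflight_blk_t *list, uint32_t index)
{
	for (; list != NULL; list = list->next) {
		if (list->index == index)
			return list;
	}
	return NULL;
}

/* Drop the copy once the block was dequeued and written to disk. */
void inflight_remove(inflight_blk_t **list, uint32_t index)
{
	while (*list != NULL) {
		inflight_blk_t *blk = *list;

		if (blk->index == index) {
			*list = blk->next;
			free(blk);
			return;
		}
		list = &blk->next;
	}
}
```

A `NULL` result from `inflight_find` is the point where a real implementation would fall back to the current block or to reading the fragment block back from the file.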
2021-03-22 | Threadpool: pre-emptively dequeue items after enqueuing | David Oberhollenzer
When we already hold the mutex, try to pre-emptively dequeue items into a "safe queue". When actually asked to dequeue, take blocks from there first and avoid having to enter the critical section if possible. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22 | block processor: simplify backlog accounting | David Oberhollenzer
Simply count the number of blocks we hand out (malloc'ed or recycled) and decrease the counter when we put blocks back for recycling. The sync() part becomes a little more complicated, because we can get stuck with a backlog of 1 or 2 because we have a fragment or current block buffer in use. We also need to account for this when creating the processor, because we need to be able to request at least 2 blocks without stalling. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
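The counting scheme can be sketched as follows. This is a simplified model with hypothetical names (`proc_t`, `proc_get_block`, `proc_release_block`); it only shows the counter going up when a block is handed out and down when it is recycled, and assumes `max_backlog` is at least 2 so that a fragment buffer and a current block buffer can be held at the same time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, stripped-down block and processor state. */
typedef struct blk {
	struct blk *next;
} blk_t;

typedef struct {
	blk_t *free_list;   /* blocks waiting to be recycled */
	size_t backlog;     /* blocks currently handed out */
	size_t max_backlog; /* limit; assumed to be at least 2 */
} proc_t;

/* Hand out a block (recycled or malloc'ed) and count it. */
blk_t *proc_get_block(proc_t *proc)
{
	blk_t *blk;

	if (proc->backlog >= proc->max_backlog)
		return NULL; /* caller has to dequeue finished blocks first */

	if (proc->free_list != NULL) {
		blk = proc->free_list;
		proc->free_list = blk->next;
	} else {
		blk = malloc(sizeof(*blk));
		if (blk == NULL)
			return NULL;
	}

	blk->next = NULL;
	proc->backlog += 1;
	return blk;
}

/* Put a block back for recycling and decrease the counter. */
void proc_release_block(proc_t *proc, blk_t *blk)
{
	assert(proc->backlog > 0);

	blk->next = proc->free_list;
	proc->free_list = blk;
	proc->backlog -= 1;
}
```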
2021-03-22 | Cleanup the block processor file structure | David Oberhollenzer
A cleaner separation between common code, frontend code and backend code is made. The "is this byte blob zero" function is moved out to libutil (with test case and everything) with a more optimized implementation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
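One common way to speed up an "is this byte blob zero" check is to compare a machine word at a time and only handle the tail bytewise. The sketch below shows that idea; it is an assumption about what "more optimized" could mean here, not the libutil implementation, and `blob_is_zero` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

bool blob_is_zero(const void *blob, size_t size)
{
	const unsigned char *ptr = blob;
	uint64_t word;

	assert(blob != NULL || size == 0);

	/* Compare 8 bytes at a time; memcpy avoids unaligned access. */
	while (size >= sizeof(word)) {
		memcpy(&word, ptr, sizeof(word));
		if (word != 0)
			return false;
		ptr += sizeof(word);
		size -= sizeof(word);
	}

	/* Check the remaining tail bytes individually. */
	while (size > 0) {
		if (*ptr != 0)
			return false;
		ptr += 1;
		size -= 1;
	}

	return true;
}
```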
2021-03-21 | Fix missing error code initialization | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21 | Rename thread pool serial implementation data structure | David Oberhollenzer
Hoping that Coverity can now tell the two apart. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21 | Cleanup: Rewrite block processor to use the libutil thread_pool_t | David Oberhollenzer
Throw out the messy thread pool implementation and temporarily also remove the exact fragment matching for simplicity. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21 | Add a thread pool implementation to libutil | David Oberhollenzer
The thread pool enforces ordering of items during dequeue similar to the already existing implementation in libsqfs. The idea is to eventually pull this functionality out of the block processor and turn it into a cleaner, separately tested module. The thread pool is implemented as an abstract interface, so we can have multiple implementations around, including the serial fallback implementation which we can then *always* test, regardless of the compile config, and run through static analysis as well. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
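To illustrate the "abstract interface plus serial fallback" idea, here is a small sketch. The interface (`pool_t` with `submit`, `dequeue` and `destroy` function pointers) and all other names are hypothetical and do not mirror the real `thread_pool_t` API. The serial variant does the work directly in `submit` and returns items in submission order, which is the ordering guarantee a threaded implementation would also have to provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int (*work_fn_t)(void *item);

/* Hypothetical abstract interface: callers only see function pointers. */
typedef struct pool {
	int (*submit)(struct pool *pool, void *item);
	void *(*dequeue)(struct pool *pool);
	void (*destroy)(struct pool *pool);
} pool_t;

typedef struct node {
	struct node *next;
	void *item;
} node_t;

/* Serial fallback: works inside submit(), keeps a FIFO of finished items. */
typedef struct {
	pool_t base; /* must be first, so pool_t* can be cast back */
	work_fn_t work;
	node_t *head;
	node_t *tail;
} serial_pool_t;

static int serial_submit(pool_t *pool, void *item)
{
	serial_pool_t *sp = (serial_pool_t *)pool;
	node_t *n = malloc(sizeof(*n));

	if (n == NULL)
		return -1;

	if (sp->work(item) != 0) {
		free(n);
		return -1;
	}

	n->item = item;
	n->next = NULL;
	if (sp->tail == NULL)
		sp->head = n;
	else
		sp->tail->next = n;
	sp->tail = n;
	return 0;
}

static void *serial_dequeue(pool_t *pool)
{
	serial_pool_t *sp = (serial_pool_t *)pool;
	node_t *n = sp->head;
	void *item;

	if (n == NULL)
		return NULL;

	sp->head = n->next;
	if (sp->head == NULL)
		sp->tail = NULL;

	item = n->item;
	free(n);
	return item;
}

static void serial_destroy(pool_t *pool)
{
	serial_pool_t *sp = (serial_pool_t *)pool;

	while (sp->head != NULL) {
		node_t *n = sp->head;

		sp->head = n->next;
		free(n);
	}
	free(sp);
}

pool_t *serial_pool_create(work_fn_t work)
{
	serial_pool_t *sp = calloc(1, sizeof(*sp));

	assert(work != NULL);
	if (sp == NULL)
		return NULL;

	sp->work = work;
	sp->base.submit = serial_submit;
	sp->base.dequeue = serial_dequeue;
	sp->base.destroy = serial_destroy;
	return &sp->base;
}

/* Example worker used for demonstration: doubles an int in place. */
int double_value(void *item)
{
	*(int *)item *= 2;
	return 0;
}
```

Because callers only go through the function pointers, a threaded implementation could be swapped in without changing the calling code, and the serial one can be tested on every build configuration.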
2021-03-21 | Force 64 bit alignment of blocks managed by the pool allocator | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-20 | Fix: libcompat: add missing stdio includes | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-20 | Fix: add missing include path to libfstream if using builtin zlib | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-20 | Add libcompat fallback implementation for fnmatch | David Oberhollenzer
This has basically been copied over from Musl and slightly modified. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07 | Optionally use a pool allocator for rb-tree nodes | David Oberhollenzer
This commit restructures the rbtree code to optionally use a pool allocator for the nodes. The option is made dependent on the presence of a pre-processor flag. An option is added to the configure script to enable/disable the use of custom allocators. It makes sense to still allow the malloc/free based routes for better ASAN based instrumentation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
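A rough sketch of how such a compile-time switch might look. The flag name `USE_POOL_ALLOCATOR`, the `mem_pool_t` type and the `NODE_ALLOC`/`NODE_FREE` macros are all assumptions for illustration; the pool itself is a deliberately simple fixed-size slab with a free list, with object sizes rounded up to a multiple of 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_ALIGN 8u /* keep every object 64-bit aligned */

/* Hypothetical fixed-size object pool: one slab, free slots chained. */
typedef struct {
	unsigned char *slab;
	void *free_list;
	size_t obj_size;
} mem_pool_t;

int mem_pool_init(mem_pool_t *pool, size_t obj_size, size_t count)
{
	size_t i;

	assert(count > 0);

	/* A free slot must be able to hold the "next" pointer. */
	if (obj_size < sizeof(void *))
		obj_size = sizeof(void *);

	/* Round the object size up to a multiple of the alignment. */
	obj_size = (obj_size + POOL_ALIGN - 1) & ~(size_t)(POOL_ALIGN - 1);

	pool->slab = malloc(obj_size * count);
	if (pool->slab == NULL)
		return -1;

	pool->obj_size = obj_size;
	pool->free_list = NULL;

	for (i = 0; i < count; ++i) {
		void *obj = pool->slab + i * obj_size;

		*(void **)obj = pool->free_list;
		pool->free_list = obj;
	}
	return 0;
}

/* Pop a slot from the free list; returns NULL if the pool is exhausted. */
void *mem_pool_alloc(mem_pool_t *pool)
{
	void *obj = pool->free_list;

	if (obj != NULL)
		pool->free_list = *(void **)obj;
	return obj;
}

/* Push a slot back onto the free list. */
void mem_pool_free(mem_pool_t *pool, void *obj)
{
	*(void **)obj = pool->free_list;
	pool->free_list = obj;
}

/* Hypothetical tree node and allocation macros selected at compile time. */
typedef struct tree_node {
	struct tree_node *left;
	struct tree_node *right;
	int key;
} tree_node_t;

#ifdef USE_POOL_ALLOCATOR /* hypothetical flag, e.g. set by configure */
#define NODE_ALLOC(pool) mem_pool_alloc(pool)
#define NODE_FREE(pool, node) mem_pool_free(pool, node)
#else
/* The malloc/free route keeps ASAN instrumentation effective. */
#define NODE_ALLOC(pool) ((void)(pool), malloc(sizeof(tree_node_t)))
#define NODE_FREE(pool, node) ((void)(pool), free(node))
#endif
```

With the flag undefined, every node is a separate heap allocation that ASAN can track; with it defined, nodes come from the slab and freeing is a pointer swap.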
2021-03-07 | Implement a custom memory pool allocator | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07 | Update CHANGELOG.md | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07 | Rewrite the str_table to internally use the more optimized hash_table | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07 | Add a simple benchmark program for the xattr key/value recorder | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Fix wrong byte-swap macro in libsqfs table test | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Travis-CI: dump test-suite.log if make check fails | David Oberhollenzer
Gets a little difficult to debug otherwise. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Fix libsqfs test build on OS X | David Oberhollenzer
Add the missing compat.h header include so we have the correct endian conversion macros. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Add a simple test case for the xattr_writer_t | David Oberhollenzer
The test case basically adds a few key/value pairs and makes sure they are deduplicated correctly, including a case where they are added in a different order and a case where the value is stored out of line (OOL). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Add a basic test case for the libsqfs table code | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Fix: meta reader behaviour if accessing block at location 0 | David Oberhollenzer
Technically, this should *never* **ever** happen, because a SquashFS file always starts with a super block, which isn't wrapped in a meta data block, so a valid SquashFS file will never have a reason to read from offset 0. However, this does bite us when doing unit tests where the meta reader and writer are used on an otherwise empty file. When trying to read from offset 0, the caching code assumes that we already have that block, since the block_offset got initialized to 0. This commit changes the initialization to set the current block location to the maximum 64 bit integer, a location we are never going to read from, since it will always be after the limit. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
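The effect of the sentinel can be shown with a tiny model of a reader that caches a single block. The `reader_t` type and its functions are hypothetical; only the idea of initializing the cached location to `UINT64_MAX` instead of 0 is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal model of a reader that caches one block. */
typedef struct {
	uint64_t block_offset; /* location of the currently cached block */
	unsigned int loads;    /* how often we actually had to read */
} reader_t;

void reader_init(reader_t *rd)
{
	assert(rd != NULL);

	/* Sentinel instead of 0, so a first read from offset 0 is a miss. */
	rd->block_offset = UINT64_MAX;
	rd->loads = 0;
}

void reader_seek(reader_t *rd, uint64_t offset)
{
	if (offset == rd->block_offset)
		return; /* cache hit, block is already loaded */

	/* A real reader would read and unpack the block here. */
	rd->loads += 1;
	rd->block_offset = offset;
}
```

Had `block_offset` started at 0, the first `reader_seek(&rd, 0)` would have been treated as a cache hit without ever loading data.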
2021-03-06 | Cleanup: replace ad-hoc dynamic array in sqfs_xattr_writer_t | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Cleanup: replace ad-hoc dynamic array used for export table | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Cleanup: replace ad-hoc dynamic array in sqfs_id_table_t | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Cleanup: replace ad-hoc dynamic array in sqfs_frag_table_t | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Add a generic implementation of a dynamic array to libutil | David Oberhollenzer
The intention is to get rid of all the ad-hoc array implementations in the other components and cut down code size. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
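For readers unfamiliar with the pattern, a generic dynamic array in C usually stores the element size next to the buffer and grows geometrically. The following is a minimal sketch under that assumption; `array_t`, `array_init`, `array_append` and `array_cleanup` are illustrative names, not necessarily those used in libutil.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical generic array: stores fixed-size elements by value. */
typedef struct {
	void *data;
	size_t size;  /* element size in bytes */
	size_t used;  /* number of elements stored */
	size_t count; /* number of elements allocated */
} array_t;

void array_init(array_t *arr, size_t size)
{
	assert(size > 0);

	memset(arr, 0, sizeof(*arr));
	arr->size = size;
}

/* Append a copy of one element, doubling the capacity when full. */
int array_append(array_t *arr, const void *element)
{
	if (arr->used == arr->count) {
		size_t new_count = arr->count ? arr->count * 2 : 8;
		void *new_data;

		if (new_count > SIZE_MAX / arr->size)
			return -1; /* byte size would overflow */

		new_data = realloc(arr->data, new_count * arr->size);
		if (new_data == NULL)
			return -1; /* old buffer is still valid */

		arr->data = new_data;
		arr->count = new_count;
	}

	memcpy((char *)arr->data + arr->used * arr->size,
	       element, arr->size);
	arr->used += 1;
	return 0;
}

void array_cleanup(array_t *arr)
{
	free(arr->data);
	memset(arr, 0, sizeof(*arr));
}
```

Each ad-hoc table would then only need an `array_t` member plus a cast when reading elements, instead of its own grow logic.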
2021-03-06 | Store xattr writer block description in a red-black tree | David Oberhollenzer
By storing the blocks in a tree, the de-duplication can look up existing blocks in logarithmic instead of linear time. The linked list is still maintained, because we need to iterate over the blocks in creation order during serialization. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Add a context pointer to the rbtree key comparison | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Add a copy function to the rb-tree implementation | David Oberhollenzer
If we use the rb-tree in libsquashfs objects, we need to be able to copy an entire tree as part of the object. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Cleanup: replace the void-ptr with an inode-ptr in the file tree node | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06 | Cleanup: add some structure to the test directory | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-05 | Remove lz4 & zstd from corpus test | David Oberhollenzer
Relying on the output of a compressor to exactly match an expected output is already not really a great idea, but for gzip, xz and lzo it has worked remarkably well so far. Perhaps because those are rather old and don't have much active development going on besides bug fixing. On the other hand, lz4 and zstd, which are much younger, seem to have more development going on and keep breaking between versions. This commit removes the zstd & lz4 corpus tests. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-28 | Update CHANGELOG.md | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-28 | added shared read access when opening sqfs image with read-only flags (win32) | Thomas Lang
2021-02-19 | Fix: libfstree: add an assert on the canonicalize_name return value | David Oberhollenzer
Since the canonicalize_name function only fails if the path contains ".." and the one we are constructing comes from the scanned fstree (built using canonicalized names), it should NEVER fail. However, Coverity does get concerned, because we are checking the return value elsewhere. So do what we do at other, similar locations and add an assert(). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-19 | gensquashfs: Document the globbing feature | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-19 | Fix: canonicalize path names in glob pattern matching | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-19 | libfstree: reject unknown glob options to allow future expansions | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-19 | Add simple test cases for fstree globbing | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-19 | fstree_from_file: Add fnmatch() pattern matching to file globbing | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-19 | fstree_from_file: Implement basic file globbing | David Oberhollenzer
The base path is passed to the fstree_from_file function and in turn to the individual callbacks. The line parsing function is modified to allow '*' as mode, uid and gid for specifically marked callbacks. A glob callback is added that internally uses the fstree_from_dir scanning functions in combination with a filter callback. Directory scanning flags are parsed from the extra arguments before interpreting it as a path fragment. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
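The combination of a scan filter callback with wildcard matching can be sketched with POSIX `fnmatch()`. The callback type `filter_fn_t` and the helpers `glob_filter` and `filter_names` are hypothetical and much simpler than the real fstree scanning code; they only show how a filter callback decides which scanned names are kept.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical filter callback type: return non-zero to keep the entry. */
typedef int (*filter_fn_t)(const char *name, void *user);

/* Keep entries whose name matches a shell wildcard pattern. */
int glob_filter(const char *name, void *user)
{
	const char *pattern = user;

	/* fnmatch() returns 0 if the string matches the pattern. */
	return fnmatch(pattern, name, 0) == 0;
}

/* Apply a filter to a list of names, compacting kept entries in place. */
size_t filter_names(const char **names, size_t count,
		    filter_fn_t filter, void *user)
{
	size_t i, kept = 0;

	assert(names != NULL && filter != NULL);

	for (i = 0; i < count; ++i) {
		if (filter(names[i], user))
			names[kept++] = names[i];
	}
	return kept;
}
```

On platforms without `fnmatch()` in the C library, a fallback implementation has to be provided for this to build.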
2021-02-19 | libfstree: Add a filter callback to the directory scanning function | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-18 | gensquashfs: always construct input path during option processing | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-18 | libfstree: add a subdirectory scanning function | David Oberhollenzer
So we can scan a sub-directory within the base directory without having to do string operations first. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-18 | fstree_from_dir: add filtering flags to skip certain inode types | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-10 | cleanup: fstree_from_file: split & simplify line parsing function | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-02-10 | Always use the correct data type for realloc return value | David Oberhollenzer
This commit mainly serves the static analysis tooling. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>