Age | Commit message | Author
2020-04-22Import and use Mesa's hash tableMatt Turner
With `perf record`/`perf report` I saw that 30% of the time was spent in `sqfs_frag_table_find_tail_end` with tar2sqfs for a tarball containing the Gentoo ebuild repository (many thousands of small files). The reason was the bucketing hash table in frag_table.c: too many elements in too few buckets meant lots of walking over the linked lists.

This patch replaces that hash table with the hash table implementation from Mesa. Its implementation is more complex (it is an open-addressing, linear-reprobing hash table), but it is much better suited for the task.

On my 4c/8t Skylake, the time to run tar2sqfs drops from 7.5s to less than 3s. CPU usage increases from ~207% to ~356%, presumably indicating an increase in available parallelism due to the removal of the hash table as a bottleneck. The `perf report` profile with this patch shows that the time spent in `sqfs_frag_table_find_tail_end` has dropped from ~30% to 0.01%.

Output from ministat:

    x before
    + after
        N           Min           Max        Median           Avg        Stddev
    x  20         7.476         7.685        7.5725        7.5615   0.051254268
    +  20          2.79         2.901         2.846       2.84475    0.03543842
    Difference at 95.0% confidence
            -4.71675 +/- 0.0282015
            -62.3785% +/- 0.241477%
            (Student's t, pooled s = 0.0440618)

I imported only the bits of the hash table implementation that were needed for frag_table.c. Among the changes I made after importing are:

- Removed usage of ralloc, Mesa's recursive memory allocator
  - Replaced ralloc -> malloc, ralloc_free -> free, rzalloc_array -> calloc
  - Removed mem_ctx parameters
  - Added free()s to the appropriate places (valgrind confirms there are no leaks)
- Removed the _mesa_ prefix from function names

Fixes: #40
Signed-off-by: Matt Turner <mattst88@gmail.com>
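To illustrate the data structure named in the message above, here is a minimal sketch of an open-addressing hash table with linear probing. It is not the Mesa implementation and not the code imported into frag_table.c; the `lp_` names, the multiplicative hash and the 1/2 load factor limit are assumptions chosen for illustration, and keys are assumed to be non-zero so that 0 can mark an empty slot.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical miniature table mapping non-zero 32 bit keys to values. */
typedef struct {
	uint32_t *keys;   /* 0 marks an empty slot */
	uint32_t *values;
	size_t capacity;  /* always a power of two */
	size_t count;
} lp_table_t;

static size_t lp_slot(uint32_t key, size_t capacity)
{
	/* Multiplicative hash; masking works because capacity is 2^n. */
	return (size_t)(key * 2654435761u) & (capacity - 1);
}

int lp_init(lp_table_t *t, size_t capacity)
{
	assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
	t->keys = calloc(capacity, sizeof(*t->keys));
	t->values = calloc(capacity, sizeof(*t->values));
	t->capacity = capacity;
	t->count = 0;
	if (t->keys == NULL || t->values == NULL) {
		free(t->keys);
		free(t->values);
		return -1;
	}
	return 0;
}

void lp_cleanup(lp_table_t *t)
{
	free(t->keys);
	free(t->values);
}

int lp_insert(lp_table_t *t, uint32_t key, uint32_t value);

/* Double the capacity and re-insert every occupied slot. */
static int lp_grow(lp_table_t *t)
{
	lp_table_t bigger;
	size_t i;

	if (lp_init(&bigger, t->capacity * 2))
		return -1;
	for (i = 0; i < t->capacity; ++i) {
		if (t->keys[i] != 0)
			lp_insert(&bigger, t->keys[i], t->values[i]);
	}
	lp_cleanup(t);
	*t = bigger;
	return 0;
}

int lp_insert(lp_table_t *t, uint32_t key, uint32_t value)
{
	size_t i;

	assert(key != 0);
	/* Keep the load factor at or below 1/2 so probe runs stay short. */
	if ((t->count + 1) * 2 > t->capacity && lp_grow(t))
		return -1;

	i = lp_slot(key, t->capacity);
	while (t->keys[i] != 0 && t->keys[i] != key)
		i = (i + 1) & (t->capacity - 1);  /* linear probe to the next slot */

	if (t->keys[i] == 0) {
		t->keys[i] = key;
		t->count += 1;
	}
	t->values[i] = value;
	return 0;
}

/* Returns 1 and stores the value if the key is present, 0 otherwise. */
int lp_find(const lp_table_t *t, uint32_t key, uint32_t *value)
{
	size_t i = lp_slot(key, t->capacity);

	while (t->keys[i] != 0) {
		if (t->keys[i] == key) {
			*value = t->values[i];
			return 1;
		}
		i = (i + 1) & (t->capacity - 1);
	}
	return 0;
}
```

Because all entries live in flat arrays, a lookup touches a few adjacent slots instead of chasing list pointers, which is the property that matters when many thousands of small files are inserted.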
2020-04-22COPYING: Fix a couple of typosMatt Turner
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-22Skip PAX global headersDavid Oberhollenzer
Tar archives can contain two kinds of PAX headers:

- local headers that modify the attributes of the next file
- global headers that set defaults for all files

The latter is "... not widely used", according to tar(5), and has been deliberately not implemented. Some programs (e.g. git-archive) *do* generate them (in the case of git, it stores the commit hash).

This commit adds a code path that skips a PAX global header entirely and resumes tar parsing, instead of erroneously reporting it as an entry.

Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
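As a rough sketch of what skipping a global header involves: in the POSIX pax format the type flag byte at offset 156 of the 512 byte header is 'g' for a global extended header and 'x' for a local one, and the octal size field at offset 124 gives the payload length. The function names below are hypothetical and this is not the libtar code; base-256 size encoding and checksum validation are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAR_RECORD_SIZE 512
#define TAR_OFF_SIZE 124      /* octal size field */
#define TAR_LEN_SIZE 12
#define TAR_OFF_TYPEFLAG 156  /* single type character */

/* Returns non-zero if the 512 byte header describes a PAX global header. */
int tar_is_pax_global(const unsigned char *header)
{
	assert(header != NULL);
	return header[TAR_OFF_TYPEFLAG] == 'g';
}

/*
 * Number of 512 byte records occupied by the payload that follows the
 * header. A reader that wants to skip the entry discards this many records.
 * Only the plain octal encoding is handled in this sketch.
 */
uint64_t tar_payload_records(const unsigned char *header)
{
	const unsigned char *field = header + TAR_OFF_SIZE;
	uint64_t size = 0;
	size_t i = 0;

	while (i < TAR_LEN_SIZE && field[i] == ' ')
		++i;
	for (; i < TAR_LEN_SIZE && field[i] >= '0' && field[i] <= '7'; ++i)
		size = (size << 3) | (uint64_t)(field[i] - '0');

	/* Payloads are padded up to a multiple of the record size. */
	return (size + TAR_RECORD_SIZE - 1) / TAR_RECORD_SIZE;
}
```

A parser loop would check `tar_is_pax_global()` right after reading a header, discard `tar_payload_records()` records, and continue with the next header.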
2020-04-21Fix: add missing --with-gzip/--without-gzip configure handleDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-17Remove some configure time sizeof checksDavid Oberhollenzer
In libtar, the size of time_t is checked when trying to store a time value. It is pointless to use the preprocessor here, as we can simply do an if (sizeof(time_t) < ...) check and the compiler will take care of optimizing away one or the other branch. After changing the libtar check and the corresponding unit tests, the sizeof check can be removed from configure.ac, along with other unused sizeof checks. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
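The idea described above can be sketched as follows. The function name is hypothetical and the sketch assumes that time_t is a signed integer type; it is not the actual libtar code.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Store a 64 bit timestamp in a time_t, failing if it does not fit.
 * Assumes time_t is a signed integer type, as on common platforms.
 */
int store_timestamp(int64_t value, time_t *out)
{
	assert(out != NULL);

	/*
	 * The condition is a compile time constant, so the compiler can
	 * drop the whole branch on platforms with a 64 bit time_t. No
	 * configure check or #if is needed.
	 */
	if (sizeof(time_t) < sizeof(int64_t)) {
		if (value > INT32_MAX || value < INT32_MIN)
			return -1;
	}

	*out = (time_t)value;
	return 0;
}
```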
2020-04-17Cleanup: split read_header.c in libtar.aDavid Oberhollenzer
Simply moving the pax header decoding to a separate file and splitting out the common helper functions should be a good start. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-17tests: improve diagnostics output & make "release builds" workDavid Oberhollenzer
This commit adds a few macros and helper functions for the unit test programs. Those are used instead of asserts to provide more fine grained diagnostics on the one hand and on the other hand because they also work if NDEBUG is defined, unlike asserts that get eliminated in that case. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
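A minimal sketch of what such helpers might look like is shown below. The macro names and message formats are assumptions made for illustration, not necessarily those added by the commit. Unlike assert(), these checks do not depend on NDEBUG, and the comparison macro prints both values on mismatch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: report the failed expression with its location. */
#define TEST_ASSERT(expr) \
	do { \
		if (!(expr)) { \
			fprintf(stderr, "%s:%d: check failed: %s\n", \
				__FILE__, __LINE__, #expr); \
			abort(); \
		} \
	} while (0)

/* Hypothetical helper: compare two unsigned values, print both on mismatch. */
#define TEST_EQUAL_UI(a, b) \
	do { \
		unsigned long long a_ = (a), b_ = (b); \
		if (a_ != b_) { \
			fprintf(stderr, "%s:%d: %s != %s (%llu vs %llu)\n", \
				__FILE__, __LINE__, #a, #b, a_, b_); \
			abort(); \
		} \
	} while (0)
```

A test program would then write `TEST_EQUAL_UI(hdr_size, 512);` where it previously used `assert(hdr_size == 512);`, with `hdr_size` standing in for whatever value is under test.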
2020-04-16tar2sqfs & gensquashfs: Delete the output file on failureDavid Oberhollenzer
This commit changes the tar2sqfs & gensquashfs code to pass the exit status on to sqfs_writer_cleanup in libcommon. The sqfs writer code in libcommon is changed to retain the output file name and delete the file if the status passed to the cleanup function is anything other than EXIT_SUCCESS. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
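The pattern can be sketched as below. The `writer_t` type and the function names are hypothetical stand-ins, not the libcommon API; the point is only that the cleanup function receives the exit status and uses the retained file name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical writer state; not the libcommon structure. */
typedef struct {
	FILE *out;
	const char *filename;  /* retained so cleanup can delete the file */
} writer_t;

int writer_open(writer_t *w, const char *filename)
{
	assert(w != NULL && filename != NULL);
	w->filename = filename;
	w->out = fopen(filename, "wb");
	return w->out == NULL ? -1 : 0;
}

void writer_cleanup(writer_t *w, int status)
{
	if (w->out != NULL) {
		fclose(w->out);
		w->out = NULL;
	}

	/* Anything other than EXIT_SUCCESS means the output is incomplete. */
	if (status != EXIT_SUCCESS)
		remove(w->filename);
}
```

The caller keeps a single `status` variable, sets it to EXIT_SUCCESS only after the last step has succeeded, and passes it to the cleanup function on every exit path.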
2020-04-16Properly cast void pointer in sqfs_object_t inline functionDavid Oberhollenzer
Otherwise, C++ compilers will scream. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-16sqfs2tar: Fix trailing slashes for directory namesDavid Oberhollenzer
sqfs2tar is supposed to append slashes to directory names. Until now, it assumed a tree node to be a directory if it has children. This simple check obviously fails for empty directories. This commit fixes the check by actually testing the inode mode. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-01Fix missing header without LZOAlyssa Ross
    lib/common/compress.c: In function 'compressor_get_default':
    lib/common/compress.c:39:2: warning: implicit declaration of function 'assert' [-Wimplicit-function-declaration]
       39 |  assert(0);
          |  ^~~~~~
    lib/common/compress.c:8:1: note: 'assert' is defined in header '<assert.h>'; did you forget to '#include <assert.h>'?
        7 | #include "common.h"
      +++ |+#include <assert.h>
        8 |
2020-03-30Bump release date for 0.9, fix version number formatting (tag: v0.9)David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-25Remove mingw based cross build from travis configDavid Oberhollenzer
- It can't find the headers on the travis CI machines, but works perfectly on the Ubuntu Bionic VM I set up locally. Don't know why yet.
- The mkwinbins.sh script runs the unit tests through wine. I don't want to remove that from the script but I also don't want to install all of wine on the CI machines for every build.

Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-23Fix dependencies for mingw CI buildDavid Oberhollenzer
Now who would have thought we're gonna need some headers and libraries if we actually want to build software. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22Add Windows cross build and S390 targets to the travis CI fileDavid Oberhollenzer
Why the Windows cross build should be tested should be fairly self-explanatory. The SYSTEM/390 build offers the possibility to test on a big-endian target (besides being a rather uncommon target machine). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22Do not try to read back the compressor optionsDavid Oberhollenzer
None of the currently implemented compressors do anything with that data. They are all at the mercy of the data actually in the image. This commit removes the code from sqfs2tar and rdsquashfs that decodes the options, which also has the side effect of increasing compatibility with some non-conforming images. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22doxygen: fix some minor typosDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22doxygen: add rudimentary main page to the API reference manualDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22doxygen: properly label generic inode helper functions as membersDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22Update Doxygen fileDavid Oberhollenzer
Add description and example directory. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22Bump version numberDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22mkwinbins.sh: don't hard code version numberDavid Oberhollenzer
Use the same trick as the coverity script. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22Update CHANGELOG.mdDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-20Overhaul README.mdDavid Oberhollenzer
Rewrite introduction. Move copyright section further up. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-19Fix: properly terminate the getopt_long arraysDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
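For context on the fix above: getopt_long scans the `struct option` array until it reaches an element that is all zeros, so an array without that final entry makes it read past the end. A minimal sketch with made-up option names (not the ones used by the tools):

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Illustrative options only; the last element must be all zeros. */
static const struct option long_opts[] = {
	{ "compressor", required_argument, NULL, 'c' },
	{ "quiet", no_argument, NULL, 'q' },
	{ NULL, 0, NULL, 0 },  /* terminator */
};

int parse_args(int argc, char **argv, const char **compressor, int *quiet)
{
	int c;

	assert(compressor != NULL && quiet != NULL);

	while ((c = getopt_long(argc, argv, "c:q", long_opts, NULL)) != -1) {
		switch (c) {
		case 'c':
			*compressor = optarg;
			break;
		case 'q':
			*quiet = 1;
			break;
		default:
			return -1;  /* unknown option or missing argument */
		}
	}
	return 0;
}
```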
2020-03-19Fix compressor availability check in libcommonDavid Oberhollenzer
Initialize have_compressor to false before testing, to make the check work. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-19Fix destruction of NULL pointer in xattr reader cleanupDavid Oberhollenzer
This fixes a copy and paste error in the cleanup path, destroying a previously destroyed object again instead of the one being tested for. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-19Add references to the external testing & CI servicesDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-19Add a script to semi-automate coverity scan submissionsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-19Add macOS workaround for lsetxattrDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-19Fix pthread_join check for valid thread handlesDavid Oberhollenzer
On Linux, checking for > 0 worked because pthread_t is internally an integer type. On other platforms (*cough* Mac OS X *cough*), it is typedefed to an opaque pointer, causing a warning if used in an integer relational comparison. The intended use is to allow the generic cleanup function to be used in the error path of the block processor creation function, while preventing pthread_join from being called on threads that haven't been created at all. Since they are calloc'ed to 0, testing for non-zero values should suffice in both cases. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-19Fix xattr scanning on Mac OS XDavid Oberhollenzer
Mac OS X does not have llistxattr or lgetxattr. Instead, the listxattr and getxattr functions have an additional flag parameter that can be set to not follow symlinks. This commit adds a pre-processor define on OS X as a workaround. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-19Improve travis CI build configurationDavid Oberhollenzer
- Build for arm64 and ppc64el as well as amd64.
- Use a newer Ubuntu version.
- Add a separate build for the serial (non-threaded) block processor.
- Add a Mac OS X target.

Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-19Fix: don't use Linux specific st_mtim stat memberDavid Oberhollenzer
Always access the st_mtime field, which on Linux is #defined to a member of st_mtim. This fixes a travis CI build failure on Mac OS X. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-18Set a minimal libzstd version in autoconf scriptDavid Oberhollenzer
Version 1.3.1 is the first one which has fixed error codes exposed through the public API, which are used in the zstd compressor to determine whether compression *actually* failed or if the destination buffer was too small. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-18Fix build of lz4 compressor with older versions of liblz4David Oberhollenzer
Older versions of liblz4 don't define LZ4HC_CLEVEL_MAX. This commit adds a definition if liblz4 doesn't provide one. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-18Add basic travis CI configurationDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-18Add a test case for xxhash32David Oberhollenzer
Technically the code was imported from a third party library, but some modifications have been made. This commit adds a simple test case with some test vectors and expected results that have to match. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-18Add a test for the minimal rb-tree implementationDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-18Restore workaround for unaligned reads in xxhashDavid Oberhollenzer
The code was originally used inside the block processor, where 32 bit aligned data could be guaranteed. If it is available in libutil, I cannot possibly guarantee alignment in future use elsewhere. Even for the block processor it was rather risky "remember this detail very well" business. This commit restores the unaligned read treatment of the original. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-18Cleanup: Move xxhash32 code to libutilDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-10Fix the printf format specifiers (again)David Oberhollenzer
The PRIu64 et al are missing a "%" sign in front. Fixes: aaf7e68c75a907c3c08e83dfd2972665a0f1c1a3 Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
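As a reminder of how these macros are used: PRIu64 and its relatives from <inttypes.h> expand to a string literal containing only the length modifier and conversion character, so the "%" has to be written separately. The function below is a made-up example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format a 64 bit size into buf; returns what snprintf returns. */
int format_size(char *buf, size_t len, uint64_t size)
{
	assert(buf != NULL);

	/*
	 * Adjacent string literals are concatenated, so this becomes
	 * something like "size: %lu bytes" or "size: %llu bytes",
	 * depending on the platform.
	 */
	return snprintf(buf, len, "size: %" PRIu64 " bytes", size);
}
```

Writing `"size: " PRIu64 " bytes"` without the percent sign compiles, but prints the modifier characters literally and ignores the argument.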
2020-03-05Get rid of sqfs_compressor_existsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-05Change the signature of sqfs_compressor_create to return an error codeDavid Oberhollenzer
Make sure the function has a way of telling the caller *why* it failed. This way, the function can convey whether it had an internal error, an allocation failure, whether the arguments are totally nonsensical, or simply that the compressor *or specific configuration* is not supported. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
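The calling convention described above, with an error code as return value and the object delivered through an out-parameter, can be sketched as follows. All names, error codes and validity rules here are hypothetical and do not reproduce the libsquashfs definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical error codes; not the libsquashfs enumerators. */
typedef enum {
	COMP_OK = 0,
	COMP_ERR_ALLOC = -1,
	COMP_ERR_ARG_INVALID = -2,
	COMP_ERR_UNSUPPORTED = -3,
} comp_error_t;

typedef struct {
	int id;              /* which compressor to use */
	unsigned int level;  /* compression level, 0 to 9 in this sketch */
} comp_config_t;

typedef struct {
	comp_config_t cfg;
} compressor_t;

/* Returns COMP_OK and sets *out on success, an error code otherwise. */
int compressor_create(const comp_config_t *cfg, compressor_t **out)
{
	compressor_t *cmp;

	assert(out != NULL);
	*out = NULL;

	if (cfg == NULL || cfg->level > 9)
		return COMP_ERR_ARG_INVALID;

	/* Pretend only compressor 1 was compiled in. */
	if (cfg->id != 1)
		return COMP_ERR_UNSUPPORTED;

	cmp = calloc(1, sizeof(*cmp));
	if (cmp == NULL)
		return COMP_ERR_ALLOC;

	cmp->cfg = *cfg;
	*out = cmp;
	return COMP_OK;
}
```

A caller can now tell "not supported" apart from "out of memory", which a plain NULL return could not express.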
2020-03-05Cleanup: Remove the E_ prefix from all libsquashfs enumeratorsDavid Oberhollenzer
Avoid namespace pollution. Make sure all exported symbols are prefixed with either sqfs_ or SQFS_. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-04Fix block writer inheritance of sqfs_object_tDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-04Cleanup: match xattr reader API closer to id table APIDavid Oberhollenzer
Instead of creating everything in the "create" function, cleanup and create/initialize stuff in a "load" function. This allows the xattr reader to be reset/re-used and adds the benefit of not having to lug around references to the super block, compressor and file (although the latter two are hidden inside the meta reader). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-04Add a generic copying mechanism to sqfs_object_tDavid Oberhollenzer
This patch adds a deep-copy callback to sqfs_object_t and removes the copying mechanism from sqfs_compressor_t. This is also interesting for other types. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
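A base structure with a deep-copy callback might look roughly like the sketch below. The `object_t` and `buffer_t` types and their field names are illustrative assumptions rather than the actual sqfs_object_t layout.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical base "class" with destroy and deep-copy callbacks. */
typedef struct object {
	void (*destroy)(struct object *obj);
	struct object *(*copy)(const struct object *obj);
} object_t;

/* Example derived type; base must be first so the casts are valid. */
typedef struct {
	object_t base;
	char *data;
	size_t size;
} buffer_t;

object_t *buffer_create(const char *data, size_t size);

static void buffer_destroy(object_t *obj)
{
	buffer_t *buf = (buffer_t *)obj;

	free(buf->data);
	free(buf);
}

static object_t *buffer_copy(const object_t *obj)
{
	const buffer_t *src = (const buffer_t *)obj;

	/* Deep copy: the duplicate gets its own data allocation. */
	return buffer_create(src->data, src->size);
}

object_t *buffer_create(const char *data, size_t size)
{
	buffer_t *buf;

	assert(data != NULL && size > 0);

	buf = calloc(1, sizeof(*buf));
	if (buf == NULL)
		return NULL;

	buf->data = malloc(size);
	if (buf->data == NULL) {
		free(buf);
		return NULL;
	}

	memcpy(buf->data, data, size);
	buf->size = size;
	buf->base.destroy = buffer_destroy;
	buf->base.copy = buffer_copy;
	return &buf->base;
}

/* Generic helpers that work for any type embedding object_t. */
object_t *object_copy(const object_t *obj)
{
	return obj->copy(obj);
}

void object_destroy(object_t *obj)
{
	obj->destroy(obj);
}
```

With the callback in the base structure, code that only holds an `object_t *` can duplicate the object without knowing its concrete type.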
2020-03-04Add a deep copy function for the str_table_t helperDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-01Add a "do not deduplicate" block flagDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>