Age | Commit message | Author
2020-05-23 | Update benchmark numbers for zstd, now that it uses correct parameters | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-21 | Update CHANGELOG.md | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-21 | Fix: zstd: actually set the compression level from the options | David Oberhollenzer
In the zstd compressor, the compression level from the configuration structure wasn't used at all. Instead, the zstd compressor was told to use level 0 and compressor options with that parameter were written to disk. This commit makes sure the level parameter is properly initialized. Reported-by: Sébastien Gross Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-21 | hash table: switch to sqfs_* types, mark functions as hidden | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-21 | Update CHANGELOG.md | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-21 | Fix the semantics of the super block deduplication | David Oberhollenzer
It's purely informational, but make sure other programs don't print out scary messages that imply the data has been stored inefficiently. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-19 | Cleanup: move hash table header to include directory | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-18 | libtar: fix size computation of PAX line length | David Oberhollenzer
This commit attempts to fix the following two problems:
- The number-of-digits computation returns an off-by-one result if the number is 10, or if the resulting digit string starts with "10". This results in one padding byte too many, corrupting the rest of the archive, since the headers no longer start at multiples of 512.
- Adding the line length prefix affects the line length (duh). If it grows far enough to require more digits, the result is a similar problem. This is a converging series that we need to compute the limit of.
Unit tests for this still need to be added. Or maybe I can convince a bored undergrad student to provide an induction proof.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
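The self-referential length prefix can be treated as a small fixed-point computation. The following is a minimal sketch, not the code from this commit, assuming a PAX record of the form `<len> <key>=<value>\n` in which `<len>` counts every byte of the record including its own digits. The names `num_digits` and `pax_record_length` are illustrative.

```c
#include <stddef.h>

/* Number of decimal digits needed to print n (0 needs one digit). */
static size_t num_digits(size_t n)
{
	size_t digits = 1;

	while (n >= 10) {
		n /= 10;
		digits += 1;
	}
	return digits;
}

/*
 * Total length of the record "<len> <key>=<value>\n", where <len>
 * also counts its own digits.
 */
static size_t pax_record_length(size_t key_len, size_t value_len)
{
	/* space + key + '=' + value + newline */
	size_t payload = 1 + key_len + 1 + value_len + 1;
	size_t total = payload + num_digits(payload);

	/*
	 * Adding the prefix can push the total over a power of ten, which
	 * makes the prefix one digit longer. Repeat until it is stable.
	 */
	while (payload + num_digits(total) != total)
		total = payload + num_digits(total);

	return total;
}
```

The loop terminates because the total never decreases between iterations and the digit count grows only logarithmically. A payload of 9 bytes, for example, ends up as an 11 byte record, because the prefix "11" itself takes two bytes.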
2020-05-16 | Update documentation | David Oberhollenzer
- Some clarifications
- Some typo fixes
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-07 | Fix checksums for the corpus tests now that -T actually works | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-07 | Fix compilation on GCC4 and below | Brandon Maier
When compiling with GCC4 the following error occurs:
> lib/util/rbtree.c:140: undefined reference to `__builtin_uaddl_overflow'
This is because __builtin_uaddl_overflow() and the other __builtin_u{add,mul}{,l,ll}_overflow() functions are only defined in GNUC < 5 for Clang. When using GCC4 and below they are not defined.
Since SZ_ADD_OV and SZ_MUL_OV are only used to check 'size_t' values, and overflow on add and multiply of unsigned types is defined behaviour (C Standard 6.2.5 paragraph 9), it's simple to write overflow functions for this specific case. These are based on the overflow wrappers from the SEI CERT C Standard INT30-C.
[1] https://gcc.gnu.org/gcc-5/changes.html
Signed-off-by: Brandon Maier <brandon.maier@rockwellcollins.com>
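A minimal sketch of builtin-free overflow checks for `size_t`, in the spirit of the INT30-C wrappers mentioned above. The function names `sz_add_overflow` and `sz_mul_overflow` are hypothetical and are not the project's actual SZ_ADD_OV / SZ_MUL_OV definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store a + b in *res and return true if the sum wrapped around.
 * Unsigned arithmetic wraps modulo SIZE_MAX + 1, so computing the
 * result is well defined even when it overflows.
 */
static bool sz_add_overflow(size_t a, size_t b, size_t *res)
{
	*res = a + b;
	return a > SIZE_MAX - b;
}

/* Store a * b in *res and return true if the product wrapped around. */
static bool sz_mul_overflow(size_t a, size_t b, size_t *res)
{
	*res = a * b;
	return b != 0 && a > SIZE_MAX / b;
}
```

Both checks compare against a bound derived from SIZE_MAX before trusting the wrapped result, so they work on any C99 compiler without relying on `__builtin_*_overflow`.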
2020-05-04 | Expose more fine grained control values & flags on the XZ compressor | David Oberhollenzer
This patch allows external users to fiddle with the XZ compressor's compression strength, alignment and other values. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-04 | Fix: properly set the last block flag if fragments are disabled | David Oberhollenzer
If a file consisting of multiple blocks is produced, the last block is short and the don't fragment flag is set, the last block flag has to be set on the block when we flush it, so the processing pipeline does its job correctly. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-03 | Actually run the directory pack test if corpora tests are desired (tag: v0.9.1) | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-03 | Bump version number | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-03 | Update README.md | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-03 | Update CHANGELOG.md | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-03 | Update man pages | David Oberhollenzer
Add missing options, rephrase some things to be a bit more clear and fix a bunch of typos. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-03 | Add a simple test script for the gensquashfs packdir + allroot use case | David Oberhollenzer
Since this is a fairly common use case, it deserves a simple test case to check that e.g. option processing hasn't been botched up (again). As input directory, the licenses directory is used as it contains no intermediate build output and should change fairly infrequently. The test is enabled regardless of the corpora-test option. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-03 | Fix: unify extra argument rejection in tar2sqfs & gensquashfs | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-03 | Fix: the --all-root option does not take an argument | David Oberhollenzer
Change the "required_argument" to the correct "no_argument". Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
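To illustrate why the distinction matters: with "required_argument", getopt_long() would consume the word following --all-root as the option's value. Below is a minimal sketch using a hypothetical option table (only the "all-root" name comes from the commit; the table, the 'r' return code and `parse_options` are illustrative and not the actual gensquashfs code).

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative option table: --all-root is a plain flag. */
static const struct option long_opts[] = {
	{ "all-root", no_argument, NULL, 'r' },
	{ NULL, 0, NULL, 0 },
};

/*
 * Parse the options and return the index of the first non-option
 * argument, or -1 if an unknown option was encountered.
 */
static int parse_options(int argc, char **argv, bool *all_root)
{
	int c;

	*all_root = false;
	optind = 1;	/* restart scanning at the first argument */
	opterr = 0;	/* do not print diagnostics from getopt itself */

	while ((c = getopt_long(argc, argv, "", long_opts, NULL)) != -1) {
		switch (c) {
		case 'r':
			*all_root = true;
			break;
		default:
			return -1;
		}
	}
	return optind;
}
```

With the table as written, the word after --all-root stays available as a positional argument. Had the entry said "required_argument", that word would have been swallowed as the option's value.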
2020-05-03 | Fix: use 0644 as default permissions when creating files | David Oberhollenzer
Until now, when packing or unpacking a SquashFS image, files were created with paranoid permissions (i.e. 0600). The rationale behind this was that otherwise, the tools may inadvertently expose secrets, e.g. if a root user packs files that aren't world readable, such as the /etc/shadow file, but the packed SquashFS image is, we have accidentally leaked this file to other users that can access the newly created SquashFS image. The same line of reasoning also applies when unpacking files. Unfortunately, this breaks a number of other, more common standard use cases (e.g. a build server where an image is built by a daemon running as user X but then has to be accessed by another daemon running as Y). This commit changes to a more standard approach of using permissive file permissions by default and asking paranoid users to simply use a paranoid umask. For tar2sqfs & gensquashfs this simply means changing the default permissions in the libsquashfs file implementation. For rdsquashfs on the other hand there is still the use case where the unpacked files get the permissions from the [secret] image, so setting a strict umask is not applicable and changing to permissive file mode leaks something. For this case a second code path needs to be added that derives the permissions from the ones in the image. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
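A minimal sketch of how a permissive default interacts with the umask, assuming a POSIX system. Both function names are hypothetical and this is not the libsquashfs file implementation. The kernel clears every bit set in the umask from the mode passed to open(), so a user who wants the old behaviour can run the tools under a umask of 077.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Permission bits a new file ends up with for a given umask. */
static mode_t effective_mode(mode_t requested, mode_t mask)
{
	return requested & ~mask & 0777;
}

/*
 * Create an output file with the permissive default. How much of
 * 0644 survives is decided by the umask of the calling process.
 */
static int create_output_file(const char *path)
{
	return open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
}
```

With the common umask 022 the file is created as 0644, with 077 it is created as 0600.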
2020-04-27 | Fix gitignore binary paths | David Oberhollenzer
Only ignore them if they are in the topmost directory, i.e. built in the source tree. Do not ignore directories named after the binaries! Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-27 | Cleanup/fix: gensquashfs: split directory scanning from xattr scanning | David Oberhollenzer
On the one hand, this commit cleans up the code a bit by splitting the "scan directory contents" code from the "scan xattrs from directory contents" code and moving the latter into a separate file. On the other hand, the xattr scanning is now done *after* the fstree is post processed, which includes sorting it. This way, the xattrs are always added in a deterministic, reproducible order. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-27 | Enable uint128_t path | Matt Turner
I forgot to enable this when I copied it over from Mesa. Mesa's meson configuration system checks that a C program using the uint128_t type compiles, but I think this is likely unnecessary. Simply check the macro that clang and gcc define. This cuts the .text size of hash_table.o by 160 bytes or about 4% on my system. Signed-off-by: Matt Turner <mattst88@gmail.com>
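As an illustration of selecting a 128-bit code path by checking a compiler-defined macro: gcc and clang define `__SIZEOF_INT128__` when the `__int128` type is available. The sketch below is an assumption-laden example, not the imported Mesa code; the helper `mul_high` (upper 64 bits of a 64x64 bit product) is hypothetical.

```c
#include <stdint.h>

#ifdef __SIZEOF_INT128__
/* Fast path: let the compiler do the 128-bit multiplication. */
static uint64_t mul_high(uint64_t a, uint64_t b)
{
	return (uint64_t)(((unsigned __int128)a * b) >> 64);
}
#else
/* Portable fallback: schoolbook multiplication on 32-bit halves. */
static uint64_t mul_high(uint64_t a, uint64_t b)
{
	uint64_t al = a & 0xFFFFFFFFu, ah = a >> 32;
	uint64_t bl = b & 0xFFFFFFFFu, bh = b >> 32;
	uint64_t lo_lo = al * bl;
	uint64_t hi_lo = ah * bl;
	uint64_t lo_hi = al * bh;
	uint64_t hi_hi = ah * bh;
	/* this sum cannot exceed UINT64_MAX */
	uint64_t cross = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFu) + lo_hi;

	return hi_hi + (hi_lo >> 32) + (cross >> 32);
}
#endif
```

Both branches compute the same value, so callers do not need to know which one was compiled in.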
2020-04-27 | gensquashfs: Add options to globally override UID/GID values | David Oberhollenzer
A common use case for mksquashfs is to simply pack a directory and set a magic option to force all user/group IDs to root. This commit adds similar options to gensquashfs to make it better suited as a direct replacement for packing an input directory. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-27 | Add hash table code to libutil.a | David Oberhollenzer
Not only does this build the hashtable into libutil.a, it also makes sure the headers end up in the distribution tarball. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-27 | Add proper license text for Mesa hash table implementation | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-27 | Cleanup directory structure of the binary programs | David Oberhollenzer
Instead of having the binary programs in randomly named subdirectories, move all of them to a "bin" subdirectory, similar to the utility libraries that have subdirectories within "lib" and give the subdirectories the proper names (e.g. have gensquashfs source in a directory *actually* named "gensquashfs"). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-27 | Fix tar2sqfs: Actually apply the block processor flags | David Oberhollenzer
This fixes a bug in tar2sqfs where the -T option has no effect, because the block processor flags were properly generated, but not passed on. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-23 | Update benchmark data | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-22 | Import and use Mesa's hash table | Matt Turner
With `perf record`/`perf report` I saw that 30% of the time was spent in `sqfs_frag_table_find_tail_end` with tar2sqfs for a tarball containing the Gentoo ebuild repository (many thousands of small files). The reason was the bucketing hash table in frag_table.c: too many elements in too few buckets meant lots of walking over the linked lists.
This patch replaces that hash table with the hash table implementation from Mesa. Its implementation is more complex (it is an open-addressing, linear-reprobing hash table), but it is much better suited for the task.
On my 4c/8t Skylake, the time to run tar2sqfs drops from 7.5s to less than 3s. CPU usage increases from ~207% to ~356%, presumably indicating an increase in available parallelism due to the removal of the hash table as a bottleneck. The `perf report` profile with this patch shows that the time spent in `sqfs_frag_table_find_tail_end` has dropped from ~30% to 0.01%.
Output from ministat:
    x before
    + after
        N           Min           Max        Median           Avg        Stddev
    x  20         7.476         7.685        7.5725        7.5615   0.051254268
    +  20          2.79         2.901         2.846       2.84475    0.03543842
    Difference at 95.0% confidence
        -4.71675 +/- 0.0282015
        -62.3785% +/- 0.241477%
        (Student's t, pooled s = 0.0440618)
I imported only the bits of the hash table implementation that were needed for frag_table.c. Among the changes I made after importing are
- removed usage of ralloc, Mesa's recursive memory allocator
- Replaced
    ralloc -> malloc
    ralloc_free -> free
    rzalloc_array -> calloc
- Removed mem_ctx parameters
- Added free()s to the appropriate places (valgrind confirms there are no leaks)
- removed _mesa_-prefix from function names
Fixes: #40
Signed-off-by: Matt Turner <mattst88@gmail.com>
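For readers unfamiliar with the term, the following is a minimal sketch of open addressing with linear probing. It is not Mesa's implementation: the table has a fixed size, never grows and does not support deletion, and all names (`struct table`, `table_insert`, `table_lookup`) are illustrative. The point it shows is that colliding keys are stored in neighbouring array slots instead of in per-bucket linked lists.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_BITS 6
#define TABLE_SIZE ((size_t)1 << TABLE_BITS)	/* power of two */

struct table {
	size_t count;
	bool used[TABLE_SIZE];
	uint32_t keys[TABLE_SIZE];
	uint32_t values[TABLE_SIZE];
};

/* Multiplicative hash; the top TABLE_BITS bits select the home slot. */
static size_t slot_for(uint32_t key)
{
	return (uint32_t)(key * 2654435761u) >> (32 - TABLE_BITS);
}

/* Insert or update a key. Returns false if the table is full. */
static bool table_insert(struct table *t, uint32_t key, uint32_t value)
{
	size_t i = slot_for(key);
	size_t n;

	for (n = 0; n < TABLE_SIZE; n++) {
		if (!t->used[i]) {
			t->used[i] = true;
			t->keys[i] = key;
			t->values[i] = value;
			t->count += 1;
			return true;
		}
		if (t->keys[i] == key) {
			t->values[i] = value;
			return true;
		}
		i = (i + 1) & (TABLE_SIZE - 1);	/* probe the next slot */
	}
	return false;
}

/* Look up a key. Probing stops at the first unused slot. */
static bool table_lookup(const struct table *t, uint32_t key,
			 uint32_t *value)
{
	size_t i = slot_for(key);
	size_t n;

	for (n = 0; n < TABLE_SIZE && t->used[i]; n++) {
		if (t->keys[i] == key) {
			*value = t->values[i];
			return true;
		}
		i = (i + 1) & (TABLE_SIZE - 1);
	}
	return false;
}
```

A production table would additionally resize itself before it fills up, since probe sequences become long when the load factor approaches 1.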
2020-04-22 | COPYING: Fix a couple of typos | Matt Turner
Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-22 | Skip PAX global headers | David Oberhollenzer
Tar archives can contain two kinds of PAX headers:
- local headers that modify the attributes of the next file
- global headers that set defaults for all files
The latter is "... not widely used", according to tar(5), and has been deliberately not implemented. Some programs (e.g. git-archive) *do* generate them (in the case of git, it stores the commit hash). This commit adds a code path that skips a PAX global header entirely and resumes tar parsing, instead of erroneously reporting it as an entry.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-21 | Fix: add missing --with-gzip/--without-gzip configure handle | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-17 | Remove some configure time sizeof checks | David Oberhollenzer
In libtar, the sizeof time_t is checked when trying to store a time value. It is pointless to use the preprocessor here, as we can simply do an if (sizeof(time_t) < ...) check and the compiler will take care of optimizing away one or the other branch. After changing the libtar check and the corresponding unit tests, the sizeof check can be removed from configure.ac, along with other unused sizeof checks. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
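A minimal sketch of the idea, assuming that time_t is a signed integer type of either 32 or 64 bits. The function `store_time` is hypothetical and not the libtar code. Because `sizeof` is a compile-time constant, the compiler can drop whichever branch is dead on the target, so no configure check or `#if` is needed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/*
 * Store a 64-bit timestamp in a time_t. Returns false if the value
 * does not fit on a platform with a narrower time_t.
 */
static bool store_time(int64_t value, time_t *out)
{
	if (sizeof(time_t) < sizeof(int64_t)) {
		/* assumed: a narrower time_t is a signed 32-bit type */
		if (value > INT32_MAX || value < INT32_MIN)
			return false;
	}

	*out = (time_t)value;
	return true;
}
```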
2020-04-17 | Cleanup: split read_header.c in libtar.a | David Oberhollenzer
Simply moving the pax header decoding to a separate file and splitting out the common helper functions should be a good start. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-17 | tests: improve diagnostics output & make "release builds" work | David Oberhollenzer
This commit adds a few macros and helper functions for the unit test programs. Those are used instead of asserts to provide more fine grained diagnostics on the one hand and on the other hand because they also work if NDEBUG is defined, unlike asserts that get eliminated in that case. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
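A minimal sketch of what such helpers might look like. The macro names and message formats here are assumptions for illustration, not the project's actual test header. Unlike assert(), these checks do not depend on NDEBUG, and the comparison macro prints both values on failure.

```c
#include <stdio.h>
#include <stdlib.h>

/* Abort with file, line and expression text if the check fails. */
#define TEST_ASSERT(expr) \
	do { \
		if (!(expr)) { \
			fprintf(stderr, "%s:%d: check failed: %s\n", \
				__FILE__, __LINE__, #expr); \
			abort(); \
		} \
	} while (0)

/* Compare two unsigned values and print both if they differ. */
#define TEST_EQUAL_UI(a, b) \
	do { \
		unsigned long a_ = (unsigned long)(a); \
		unsigned long b_ = (unsigned long)(b); \
		if (a_ != b_) { \
			fprintf(stderr, "%s:%d: %s != %s (%lu != %lu)\n", \
				__FILE__, __LINE__, #a, #b, a_, b_); \
			abort(); \
		} \
	} while (0)
```

Each argument is evaluated exactly once, so expressions with side effects behave the same as they would in a plain `if`.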
2020-04-16 | tar2sqfs & gensquashfs: Delete the output file on failure | David Oberhollenzer
This commit changes the tar2sqfs & gensquashfs code to pass the exit status on to sqfs_writer_cleanup in libcommon. The sqfs writer code in libcommon is changed to retain the output file name and delete it if the status passed to the cleanup function is anything other than EXIT_SUCCESS. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
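The pattern described above can be sketched as follows. The `struct writer` and `writer_cleanup` names are hypothetical stand-ins, not the libcommon API; the real sqfs_writer_cleanup has more state to release.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical writer state that remembers the output file name. */
struct writer {
	FILE *fp;
	const char *filename;
};

/*
 * Release the writer. If the program is about to exit with anything
 * other than EXIT_SUCCESS, remove the partially written output file.
 */
static void writer_cleanup(struct writer *w, int status)
{
	if (w->fp != NULL) {
		fclose(w->fp);
		w->fp = NULL;
	}

	if (status != EXIT_SUCCESS)
		remove(w->filename);
}
```

Passing the exit status through a single cleanup function keeps the error paths in the callers short: they only need to set a status and jump to the common exit.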
2020-04-16 | Properly cast void pointer in sqfs_object_t inline function | David Oberhollenzer
Otherwise, C++ compilers will scream. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-16 | sqfs2tar: Fix trailing slashes for directory names | David Oberhollenzer
sqfs2tar is supposed to append slashes to directory names. Until now, it assumed a tree node to be a directory if it has children. This simple check obviously fails for empty directories. This commit fixes the check by actually testing the inode mode. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
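A minimal sketch of deciding on the trailing slash from the inode mode rather than from the presence of children, using the POSIX S_ISDIR() macro. The function `tar_entry_name` is hypothetical and not the sqfs2tar code.

```c
#define _XOPEN_SOURCE 700

#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Write the tar entry name for a path into dst, appending a slash
 * if the mode says the node is a directory. An empty directory has
 * no children, but its mode still carries the directory type.
 * Returns 0 on success, -1 if the name does not fit.
 */
static int tar_entry_name(char *dst, size_t size, const char *path,
			  mode_t mode)
{
	int n = snprintf(dst, size, "%s%s", path,
			 S_ISDIR(mode) ? "/" : "");

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```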
2020-04-01 | Fix missing header without LZO | Alyssa Ross
lib/common/compress.c: In function 'compressor_get_default':
lib/common/compress.c:39:2: warning: implicit declaration of function 'assert' [-Wimplicit-function-declaration]
   39 |  assert(0);
      |  ^~~~~~
lib/common/compress.c:8:1: note: 'assert' is defined in header '<assert.h>'; did you forget to '#include <assert.h>'?
    7 | #include "common.h"
  +++ |+#include <assert.h>
    8 |
2020-03-30 | Bump release date for 0.9, fix version number formatting (tag: v0.9) | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-25 | Remove mingw based cross build from travis config | David Oberhollenzer
- It can't find the headers on the travis CI machines, but works perfectly on the Ubuntu Bionic VM I set up locally. Don't know why yet.
- The mkwinbins.sh script runs the unit tests through wine. I don't want to remove that from the script but I also don't want to install all of wine on the CI machines for every build.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-23 | Fix dependencies for mingw CI build | David Oberhollenzer
Now who would have thought we're gonna need some headers and libraries if we actually want to build software. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22 | Add Windows cross build and S390 targets to the travis CI file | David Oberhollenzer
Why the windows cross build should be tested should be fairly self explanatory. The SYSTEM/390 build offers the possibility to test on a big-endian target (besides being a rather uncommon target machine). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22 | Do not try to read back the compressor options | David Oberhollenzer
None of the currently implemented compressors do anything with that data. They are all at the mercy of the data actually in the image. This commit removes the code from sqfs2tar and rdsquashfs that decodes the options, which also has the side effect of increasing compatibility with some non-conforming images. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22 | doxygen: fix some minor typos | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22 | doxygen: add rudimentary main page to the API reference manual | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-22 | doxygen: properly label generic inode helper functions as members | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>