| Age | Commit message | Author | 
|---|
|  | With `perf record`/`perf report` I saw that 30% of the time was spent in
`sqfs_frag_table_find_tail_end` with tar2sqfs for a tarball containing
the Gentoo ebuild repository (many thousands of small files).
The reason was the bucketing hash table in frag_table.c: too many
elements in too few buckets meant lots of walking over the linked lists.
This patch replaces that hash table with the hash table implementation
from Mesa. Its implementation is more complex (it is an open-addressing,
linear-reprobing hash table), but it is much better suited for the task.
On my 4c/8t Skylake, the time to run tar2sqfs drops from 7.5s to less
than 3s. CPU usage increases from ~207% to ~356%, presumably indicating
an increase in available parallelism due to the removal of the hash
table as a bottleneck. The `perf report` profile with this patch shows
that the time spent in `sqfs_frag_table_find_tail_end` has dropped from
~30% to 0.01%.
Output from ministat:
x before
+ after
    N          Min          Max       Median          Avg        Stddev
x  20        7.476        7.685       7.5725       7.5615   0.051254268
+  20         2.79        2.901        2.846      2.84475    0.03543842
Difference at 95.0% confidence
	-4.71675 +/- 0.0282015
	-62.3785% +/- 0.241477%
	(Student's t, pooled s = 0.0440618)
I imported only the bits of the hash table implementation that were
needed for frag_table.c. Among the changes I made after importing are
    - Removed usage of ralloc, Mesa's recursive memory allocator
      - Replaced ralloc -> malloc
                 ralloc_free -> free
                 rzalloc_array -> calloc
      - Removed mem_ctx parameters
      - Added free()s to the appropriate places (valgrind confirms there
        are no leaks)
    - Removed the _mesa_ prefix from function names
Fixes: #40
Signed-off-by: Matt Turner <mattst88@gmail.com> | 
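As an illustration of the open-addressing idea mentioned in the commit above, here is a minimal sketch of a table that resolves collisions by probing the next slot instead of walking a linked list. It is not the Mesa implementation and not the code in frag_table.c: all names (`oa_table_t`, `oa_table_insert`, ...) are hypothetical, the capacity is assumed to be a fixed power of two, and resizing and deletion are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical types for illustration only. */
typedef struct {
	uint32_t key;
	uint32_t value;
	int used;
} slot_t;

typedef struct {
	slot_t *slots;
	size_t capacity; /* assumed to be a power of two */
	size_t count;
} oa_table_t;

static size_t oa_hash(uint32_t key)
{
	/* Simple multiplicative hash; any reasonable hash would do. */
	return (size_t)(key * 2654435761u);
}

static int oa_table_init(oa_table_t *tbl, size_t capacity)
{
	tbl->slots = calloc(capacity, sizeof(tbl->slots[0]));
	if (tbl->slots == NULL)
		return -1;
	tbl->capacity = capacity;
	tbl->count = 0;
	return 0;
}

static void oa_table_cleanup(oa_table_t *tbl)
{
	free(tbl->slots);
	tbl->slots = NULL;
	tbl->capacity = 0;
	tbl->count = 0;
}

/* Insert or overwrite. Returns -1 if every slot is taken (no resizing here). */
static int oa_table_insert(oa_table_t *tbl, uint32_t key, uint32_t value)
{
	size_t i, idx = oa_hash(key) & (tbl->capacity - 1);

	for (i = 0; i < tbl->capacity; ++i) {
		slot_t *s = &tbl->slots[idx];

		if (!s->used) {
			s->key = key;
			s->value = value;
			s->used = 1;
			tbl->count++;
			return 0;
		}
		if (s->key == key) {
			s->value = value;
			return 0;
		}
		/* Collision: try the next slot, wrapping around. */
		idx = (idx + 1) & (tbl->capacity - 1);
	}
	return -1;
}

/* Returns 0 and stores the value if the key is present, -1 otherwise. */
static int oa_table_lookup(const oa_table_t *tbl, uint32_t key, uint32_t *value)
{
	size_t i, idx = oa_hash(key) & (tbl->capacity - 1);

	for (i = 0; i < tbl->capacity; ++i) {
		const slot_t *s = &tbl->slots[idx];

		if (!s->used)
			return -1; /* an empty slot ends the probe sequence */
		if (s->key == key) {
			*value = s->value;
			return 0;
		}
		idx = (idx + 1) & (tbl->capacity - 1);
	}
	return -1;
}
```

Because all entries live in one contiguous array, a lookup usually touches a few neighbouring slots rather than chasing pointers, which is the property the commit relies on.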
|  | Signed-off-by: Matt Turner <mattst88@gmail.com> | 
|  | Tar archives can contain two kinds of PAX headers:
 - local headers that modify the attributes of the next file
 - global headers that set defaults for all files
The latter is "... not widely used", according to tar(5)
and has been deliberately not implemented.
Some programs (e.g. git-archive) *do* generate them (in the case
of git, it stores the commit hash).
This commit adds a code path that skips a PAX global header entirely
and resumes tar parsing, instead of erroneously reporting it as an
entry.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
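The skipping logic from the commit above could look roughly like the following sketch. It assumes the usual ustar layout (type flag at byte offset 156 of a 512 byte header, `'g'` marking a PAX global header) and that the payload is padded to a multiple of 512 bytes. The function names are hypothetical and not taken from libtar.

```c
#include <stdint.h>

#define TAR_RECORD_SIZE 512
#define TAR_TYPEFLAG_OFFSET 156
#define TAR_TYPE_PAX_GLOBAL 'g'

/* Returns non-zero if the raw 512 byte header is a PAX global header. */
static int is_pax_global_header(const unsigned char *hdr)
{
	return hdr[TAR_TYPEFLAG_OFFSET] == TAR_TYPE_PAX_GLOBAL;
}

/* Number of bytes to skip for a payload of the given size, including padding. */
static uint64_t tar_padded_size(uint64_t size)
{
	uint64_t rem = size % TAR_RECORD_SIZE;

	return rem ? size + (TAR_RECORD_SIZE - rem) : size;
}
```

A reader would check the type flag after decoding a header and, for a global header, skip `tar_padded_size()` bytes of payload before reading the next header.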
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | In libtar, the sizeof time_t is checked when trying to store a time value.
It is pointless to use the preprocessor here, as we can simply do an
if (sizeof(time_t) < ...) check and the compiler will take care of
optimizing away one or the other branch.
After changing the libtar check and the corresponding unit tests, the
sizeof check can be removed from configure.ac, along with other unused
sizeof checks.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
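A minimal sketch of the kind of check described in the commit above. The helper name `store_time` is hypothetical, and the sketch assumes that a `time_t` narrower than 64 bits is a signed 32 bit integer.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: returns 0 on success, -1 if the value does not
 * fit into time_t. The sizeof comparison is a compile-time constant, so
 * the compiler can drop the branch that does not apply.
 */
static int store_time(int64_t value, time_t *out)
{
	if (sizeof(time_t) < sizeof(int64_t)) {
		/* Assumption: a narrow time_t is a signed 32 bit integer. */
		if (value > (int64_t)INT32_MAX || value < (int64_t)INT32_MIN)
			return -1;
	}

	*out = (time_t)value;
	return 0;
}
```

Unlike an `#if` block, both branches are always parsed and type-checked, so neither can silently rot on platforms where it is never used.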
|  | Simply moving the pax header decoding to a separate file and splitting
out the common helper functions should be a good start.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This commit adds a few macros and helper functions for the unit test
programs. They are used instead of asserts for two reasons: they
provide more fine-grained diagnostics, and they also work if NDEBUG is
defined, whereas asserts are eliminated in that case.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
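The following is a sketch of what such test macros might look like. The macro names are hypothetical and not necessarily the ones added by the commit above. The point is that they do not depend on `assert`, so they stay active when NDEBUG is defined, and they print the compared values.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical macro: fails with file, line and the expression text. */
#define TEST_ASSERT(expr) \
	do { \
		if (!(expr)) { \
			fprintf(stderr, "%s:%d: assertion failed: %s\n", \
				__FILE__, __LINE__, #expr); \
			abort(); \
		} \
	} while (0)

/* Hypothetical macro: compares two integers and prints both on mismatch. */
#define TEST_EQUAL_I(a, b) \
	do { \
		long a_ = (long)(a); \
		long b_ = (long)(b); \
		if (a_ != b_) { \
			fprintf(stderr, "%s:%d: %s != %s (%ld vs %ld)\n", \
				__FILE__, __LINE__, #a, #b, a_, b_); \
			abort(); \
		} \
	} while (0)
```

A test program would then write, for instance, `TEST_EQUAL_I(ret, 0);` where it previously used `assert(ret == 0);`.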
|  | This commit changes the tar2sqfs & gensquashfs code to pass the exit
status on to sqfs_writer_cleanup in libcommon.
The sqfs writer code in libcommon is changed to retain the
output file name and delete it if the status passed to the cleanup
function is anything other than EXIT_SUCCESS.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
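The cleanup behaviour described in the commit above can be sketched as follows. The struct and function here are simplified, hypothetical stand-ins and not the actual sqfs writer code from libcommon.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified writer state. */
typedef struct {
	FILE *fp;
	const char *filename; /* retained so the file can be removed later */
} writer_t;

/* Close the output and delete it unless the program succeeded. */
static void writer_cleanup(writer_t *wr, int status)
{
	if (wr->fp != NULL) {
		fclose(wr->fp);
		wr->fp = NULL;
	}

	if (status != EXIT_SUCCESS)
		remove(wr->filename);
}
```

The caller passes the same status it is about to return from `main`, so a failed run does not leave a half-written image behind.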
|  | Otherwise, C++ compilers will scream.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | sqfs2tar is supposed to append slashes to directory names. Until now,
it assumed a tree node to be a directory if it has children. This
simple check obviously fails for empty directories. This commit fixes
the check by actually testing the inode mode.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | lib/common/compress.c: In function 'compressor_get_default':
lib/common/compress.c:39:2: warning: implicit declaration of function 'assert' [-Wimplicit-function-declaration]
   39 |  assert(0);
      |  ^~~~~~
lib/common/compress.c:8:1: note: 'assert' is defined in header '<assert.h>'; did you forget to '#include <assert.h>'?
    7 | #include "common.h"
  +++ |+#include <assert.h>
    8 | | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | - It can't find the headers on the travis CI machines, but works
   perfectly on the Ubuntu Bionic VM I set up locally. Don't know why
   yet.
 - The mkwinbins.sh script runs the unit tests through wine. I don't want
   to remove that from the script but I also don't want to install all of
   wine on the CI machines for every build.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Now who would have thought we're gonna need some headers and libraries
if we actually want to build software.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Why the Windows cross build should be tested should be fairly
self-explanatory. The SYSTEM/390 build offers the possibility to test
on a big-endian target (besides being a rather uncommon target
machine).
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | None of the currently implemented compressors do anything with that data.
They are all at the mercy of the data actually in the image.
This commit removes the code from sqfs2tar and rdsquashfs that decodes
the options, which also has the side effect of increasing compatibility
with some non-conforming images.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Add description and example directory.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Use the same trick as the coverity script.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Rewrite introduction. Move copyright section further up.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Initialize have_compressor to false before testing, to make
the check work.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This fixes a copy and paste error in the cleanup path, destroying a
previously destroyed object again instead of the one being tested for.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | On Linux, checking for > 0 worked because pthread_t is internally an
integer type. On other platforms (*cough* Mac OS X *cough*), it is
typedefed to an opaque pointer, causing a warning if used in an
integer relational comparison.
The intended use is to allow the generic cleanup function to be used
in the error path of the block processor creation function, while
preventing pthread_join being called on threads that haven't been
created at all. Since they are calloc'ed to 0, testing for non-zero
values should suffice in both cases.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Mac OS X does not have llistxattr or lgetxattr. Instead, the listxattr
and getxattr functions have an additional flag parameter that can be
set to not follow symlinks. This commit adds a pre-processor define on
OS X as a workaround.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
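A workaround along the lines of the commit above might look like the sketch below. The exact defines used in the commit are not shown here, so this is an assumption based on the macOS `listxattr`/`getxattr` signatures, which take an extra options argument (and, for `getxattr`, a position argument).

```c
#include <sys/types.h>
#include <sys/xattr.h>

#ifdef __APPLE__
/* Map the Linux-style calls onto the macOS variants without following symlinks. */
#define llistxattr(path, list, size) \
	listxattr(path, list, size, XATTR_NOFOLLOW)

#define lgetxattr(path, name, value, size) \
	getxattr(path, name, value, size, 0, XATTR_NOFOLLOW)
#endif
```

With these in place, the calling code can use `llistxattr` and `lgetxattr` unconditionally on both systems.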
|  | - Build for arm64 and ppc64el as well as amd64.
 - Use a newer Ubuntu version.
 - Add a separate build for the serial (non-threaded) block processor.
 - Add a Mac OS X target.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Always access the st_mtime field, which on Linux is #defined to a
member of st_mtim.
This fixes a travis CI build failure on Mac OS X.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Version 1.3.1 is the first one which has fixed error codes exposed
through the public API, which are used in the zstd compressor to
determine whether compression *actually* failed or if the destination
buffer was too small.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Older versions of liblz4 don't define LZ4HC_CLEVEL_MAX. This commit
adds a definition if liblz4 doesn't provide one.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Technically the code was imported from a third party library, but some
modifications have been made. This commit adds a simple test case with
some test vectors and expected results that have to match.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | The code was originally used inside the block processor, where 32 bit
aligned data could be guaranteed. If it is available in libutil, I
cannot possibly guarantee alignment in future use elsewhere. Even
for the block processor it was a rather risky "remember this detail very
well" business.
This commit restores the unaligned read treatment of the original.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
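How the original code handles unaligned reads is not shown in the commit above. One common, portable way to do it is to copy the bytes into a local variable with `memcpy`, as in this sketch (the function name is hypothetical):

```c
#include <stdint.h>
#include <string.h>

/*
 * Read a 32 bit value in host byte order from a pointer that may not be
 * 4 byte aligned. Casting the pointer to uint32_t* and dereferencing it
 * would be undefined behaviour for unaligned addresses.
 */
static uint32_t read_u32_unaligned(const unsigned char *ptr)
{
	uint32_t value;

	memcpy(&value, ptr, sizeof(value));
	return value;
}
```

Compilers typically turn the `memcpy` into a single load on targets that allow unaligned access, so the safe version usually costs nothing there.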
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | The PRIu64 et al are missing a "%" sign in front.
Fixes: aaf7e68c75a907c3c08e83dfd2972665a0f1c1a3
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
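For reference, the `PRIu64` family of macros from `<inttypes.h>` only expands to the length modifier and conversion letter, so the `%` has to be written in the format string by hand. A small sketch with a hypothetical helper:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats a 64 bit size into buf. */
static int format_size(char *buf, size_t len, uint64_t size)
{
	/*
	 * Wrong:  "size: " PRIu64 " bytes"   (prints the macro text literally)
	 * Right:  "size: %" PRIu64 " bytes"
	 */
	return snprintf(buf, len, "size: %" PRIu64 " bytes", size);
}
```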
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Make sure the function has a way of telling the caller *why* it failed.
This way, the function can convey whether it had an internal error, an
allocation failure, whether the arguments are totally nonsensical, or
simply that the compressor *or specific configuration* is not supported.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Avoid namespace pollution. Make sure all exported symbols are prefixed
with either sqfs_ or SQFS_.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Instead of creating everything in the "create" function, cleanup and
create/initialize stuff in a "load" function. This allows the xattr
reader to be reset/re-used and adds the benefit of not having to
lug around references to the super block, compressor and file (although
the latter two are hidden inside the meta reader).
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This patch adds a deep-copy callback to sqfs_object_t and removes the
copying mechanism from sqfs_compressor_t. This is also interesting for
other types.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> |