| Age | Commit message | Author |
|---|---|---|
|  | Slightly modify the byte-for-byte comparison function to compare an
arbitrary range in a file and move it to libutil. Instead of calling
it for each block in the block writer, simply let it check an entire
range in the block writer and compute the range position/size of the
reference ahead, before looking for potential matches.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Move all the libutil stuff from the toplevel include/ to a util/
sub directory and fix up the includes that make use of them.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Split out several repeated patterns into helper functions and move the
rest of the code back into dir_reader.c
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | On GNU/Linux, *BSD or macOS we can simply use the system default
library. The copy was primarily only there for the Windows build.
The build script for Windows has now been adapted to download and
compile a shared library from a tarball.
This removes a huge chunk of code from the git tree as well as
the release tarballs. Additionally it gets rid of iffy things like
removing the Zlib copyright/version strings, so the libsquashfs DLL
doesn't export it.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Two flags are added to the dir reader API: one for the create function,
requesting that the dir reader report those entries, and one for the open
function, to suppress that if it was enabled.
To implement the feature, a mapping of visited directory inodes is
maintained internally that maps inode numbers to inode references.
When opening a directory, state is maintained to generate the fake
entries for '.' and '..'. Since all the other functions are based on
the open/read/rewind API, no alterations need to be made. The tree
scan function is modified, to use the suppress flag, so it does not
accidentally catch those entries.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | On systems like Windows, the dynamic library and applications can
easily end up being linked against different runtime libraries, so
applications cannot be expected to be able to free() any malloc'd
pointer that the library returns.
This commit adds an sqfs_free function so the application can pass
pointers back to the library to call the correct free() implementation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
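
The ownership rule described in the commit above can be sketched as follows. This is a minimal illustration and not the libsquashfs implementation: `mylib_dup_string` and `mylib_free` are hypothetical names standing in for a library function that returns malloc'd memory and for the matching release function exported by the same library.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical library function: the returned buffer is allocated with
 * the malloc() of whatever C runtime the *library* was linked against.
 */
char *mylib_dup_string(const char *str)
{
	size_t len = strlen(str) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, str, len);

	return copy;
}

/*
 * Hypothetical counterpart: compiled into the same library, so this
 * free() belongs to the same runtime as the malloc() above.
 */
void mylib_free(void *ptr)
{
	free(ptr);
}
```

An application would then release the result with `mylib_free(s)` instead of calling its own `free(s)`, which may belong to a different runtime on Windows.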
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | A cleaner separation between common code, frontend code and backend
code is made.
The "is this byte blob zero" function is moved out to libutil (with
test case and everything) with a more optimized implementation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
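
As a rough idea of what a faster "is this byte blob zero" check can look like, the sketch below tests eight bytes at a time and then handles the remaining tail byte by byte. The name `blob_is_zero` is an assumption made for illustration, and the word-at-a-time approach is only one possible optimization, not necessarily the one used in libutil.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns true if all `size` bytes are zero. */
bool blob_is_zero(const void *blob, size_t size)
{
	const unsigned char *ptr = blob;
	uint64_t word;

	/* Compare whole 64 bit words; memcpy avoids unaligned access. */
	while (size >= sizeof(word)) {
		memcpy(&word, ptr, sizeof(word));
		if (word != 0)
			return false;

		ptr += sizeof(word);
		size -= sizeof(word);
	}

	/* Check the remaining bytes that do not fill a whole word. */
	while (size > 0) {
		if (*ptr != 0)
			return false;

		++ptr;
		--size;
	}

	return true;
}
```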
|  | Throw out the messy thread pool implementation and temporarily also
remove the exact fragment matching for simplicity.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This commit restructures the rbtree code to optionally use a pool
allocator for the nodes. The option is made dependent on the presence
of a pre-processor flag.
An option is added to the configure script to enable/disable the use
of custom allocators. It makes sense to still allow the malloc/free
based routes for better ASAN based instrumentation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | By storing the blocks in a tree, the de-duplication can look up
existing blocks in logarithmic instead of linear time.
The linked list is still maintained, because we need to iterate
over the blocks in creation order during serialization.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
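
The combination of a lookup tree and a creation-order list described above might look roughly like the sketch below. All names (`block_t`, `block_index_t` and the functions) are hypothetical, and a plain unbalanced binary search tree stands in for the red-black tree to keep the example short, so it only shows how the two structures share one node.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct block_t {
	uint32_t hash;                 /* lookup key, e.g. a block checksum */
	struct block_t *left, *right;  /* tree links, used for lookup */
	struct block_t *next;          /* list link, keeps creation order */
} block_t;

typedef struct {
	block_t *root;                 /* search tree over all blocks */
	block_t *first, *last;         /* same blocks, in creation order */
} block_index_t;

/* Find a block by key by descending the tree. */
block_t *block_index_find(const block_index_t *idx, uint32_t hash)
{
	block_t *it = idx->root;

	while (it != NULL && it->hash != hash)
		it = hash < it->hash ? it->left : it->right;

	return it;
}

/* Add a block, or return the existing one if the key is already known. */
block_t *block_index_add(block_index_t *idx, uint32_t hash)
{
	block_t **link = &idx->root;
	block_t *blk;

	while (*link != NULL) {
		if ((*link)->hash == hash)
			return *link;

		link = hash < (*link)->hash ? &(*link)->left : &(*link)->right;
	}

	blk = calloc(1, sizeof(*blk));
	if (blk == NULL)
		return NULL;

	blk->hash = hash;
	*link = blk;

	/* Append to the list so serialization can walk creation order. */
	if (idx->last == NULL)
		idx->first = blk;
	else
		idx->last->next = blk;
	idx->last = blk;
	return blk;
}

/* Free all blocks by walking the creation-order list. */
void block_index_cleanup(block_index_t *idx)
{
	block_t *it = idx->first;

	while (it != NULL) {
		block_t *next = it->next;
		free(it);
		it = next;
	}

	idx->root = idx->first = idx->last = NULL;
}
```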
|  | The source code of a modified liblz4 and zlib are included with the
option to compile them into libsquashfs if they are not available on
the system.
So far, the source code was included directly in the compressor sub
directory within libsquashfs. This commit moves the libraries out
into the lib directory.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This commit moves the libsquashfs xattr related code into a sub
directory and splits the xattr writer code up into several files.
No actual code is changed.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This commit breaks the common code up again by moving the data submission
code to a separate file, making both a little bit more readable.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | With `perf record`/`perf report` I saw that 30% of the time was spent in
`sqfs_frag_table_find_tail_end` with tar2sqfs for a tarball containing
the Gentoo ebuild repository (many thousands of small files).
The reason was the bucketing hash table in frag_table.c: too many
elements in too few buckets meant lots of walking over the linked lists.
This patch replaces that hash table with the hash table implementation
from Mesa. Its implementation is more complex (it is an open-addressing,
linear-reprobing hash table), but it is much better suited for the task.
On my 4c/8t Skylake, the time to run tar2sqfs drops from 7.5s to less
than 3s. CPU usage increases from ~207% to ~356%, presumably indicating
an increase in available parallelism due to the removal of the hash
table as a bottleneck. The `perf report` profile with this patch shows
that the time spent in `sqfs_frag_table_find_tail_end` has dropped from
~30% to 0.01%.
Output from ministat:
x before
+ after
    N          Min          Max       Median          Avg        Stddev
x  20        7.476        7.685       7.5725       7.5615   0.051254268
+  20         2.79        2.901        2.846      2.84475    0.03543842
Difference at 95.0% confidence
	-4.71675 +/- 0.0282015
	-62.3785% +/- 0.241477%
	(Student's t, pooled s = 0.0440618)
I imported only the bits of the hash table implementation that were
needed for frag_table.c. Among the changes I made after importing are
    - removed usage of ralloc, Mesa's recursive memory allocator
      - Replaced ralloc -> malloc
                 ralloc_free -> free
                 rzalloc_array -> calloc
      - Removed mem_ctx parameters
      - Added free()s to the appropriate places (valgrind confirms there
        are no leaks)
    - removed _mesa_-prefix from function names
Fixes: #40
Signed-off-by: Matt Turner <mattst88@gmail.com> | 
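
To illustrate the general idea of an open-addressing table with linear probing, here is a minimal sketch. It is not the Mesa code or the code imported into frag_table.c: the names (`table_t`, `table_insert`, `table_lookup`), the fixed capacity and the multiplicative hash are all assumptions made for brevity, and resizing and deletion are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_BITS 6
#define TABLE_SIZE (1u << TABLE_BITS)

typedef struct {
	bool used;
	uint32_t key;
	uint32_t value;
} slot_t;

/* All entries live in one flat array, there are no per-bucket lists. */
typedef struct {
	slot_t slots[TABLE_SIZE];
	size_t count;
} table_t;

/* Map a key to its preferred slot (multiplicative hash, top bits). */
static size_t slot_index(uint32_t key)
{
	return (uint32_t)(key * 2654435761u) >> (32 - TABLE_BITS);
}

/* Insert or update; returns false only if the table is full. */
bool table_insert(table_t *tbl, uint32_t key, uint32_t value)
{
	size_t start = slot_index(key);
	size_t n;

	for (n = 0; n < TABLE_SIZE; ++n) {
		/* On collision, simply try the next slot, wrapping around. */
		slot_t *slot = &tbl->slots[(start + n) & (TABLE_SIZE - 1)];

		if (!slot->used) {
			slot->used = true;
			slot->key = key;
			slot->value = value;
			tbl->count += 1;
			return true;
		}

		if (slot->key == key) {
			slot->value = value;
			return true;
		}
	}

	return false;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
const uint32_t *table_lookup(const table_t *tbl, uint32_t key)
{
	size_t start = slot_index(key);
	size_t n;

	for (n = 0; n < TABLE_SIZE; ++n) {
		const slot_t *slot = &tbl->slots[(start + n) & (TABLE_SIZE - 1)];

		if (!slot->used)
			return NULL; /* an empty slot ends the probe sequence */

		if (slot->key == key)
			return &slot->value;
	}

	return NULL;
}
```

Because probing walks adjacent array slots instead of chasing list pointers, lookups stay cache friendly as long as the table is not allowed to fill up; a real implementation would grow the table before that happens.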
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This commit moves the allocation helpers and string table functions
out of libsquashfs back into a "libutil.a". The problem of libsquashfs
exporting stuff that it shouldn't is resolved by retaining the internal
attributes and directly adding the source to libsquashfs instead of
trying to somehow link against libutil.la.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Benchmarking and profiling determined xxhash32 to be faster than the
zlib implementation of crc32. Profiling also determined that crc32
computation contributed significantly to the overall runtime.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This commit moves the entire block writing and deduplication of data
blocks over to a different data type named "block writer".
For simplicity, the interfaces of the block processor are left as is
and are turned into wrappers. Likewise, most of the code in the block
writer is just verbatim from the block processor, to be cleaned up in
subsequent commits.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This fixes both the name inside the file, as well as the file name
by adding the major version suffix.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Since they are both structured the same way using condition variables,
they are only a few defines away from removing code duplication.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | It's cleaner, more stable and works pretty much the same way as the
pthread version. The downside is that the minimum target for the
library is now Windows Vista, or Server 2008. But both are over a
decade old anyway, so this shouldn't be an issue.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | There were only a handful of instances outside libsquashfs that used
the alloc code. In most cases, the thing allocated had its size derived
from something already in memory anyway, so it is safe to assume its
size fits into a size_t.
At the same time, the open-coded Windows path conversion functions are
all unified into a single helper function.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | The liblzo2 library is licensed under GPLv2, so it is not possible to
distribute binaries of libsquashfs that link against liblzo2 under
LGPL.
This commit moves the LZO compressor implementation to libcommon,
where this isn't a problem, since the tools themselves are licensed
under GPLv3.
It removes the ability of libsquashfs to read or generate LZO compressed
SquashFS images, but the tools still can.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | The -static-libgcc flag has to be passed through the compiler with
a "-Wc," prefix, because libtool tries to be clever about linker
flags. If added directly to LDFLAGS, libtool removes it.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This commit moves the generic unix implementation into a "unix"
subdirectory and adds a "win32" subdirectory with a winapi based
implementation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | This commit implements the part of the API responsible for recording
and deduplicating xattr key-value blocks.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> | 
|  | Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at> |