Age | Commit message (Collapse) | Author |
|
The differen compressor libraries use differnt integer types to tally
the buffer sizes. The libfstream library uses size_t, which may be
bigger than the actualy types, potentially causing an overflow if
trying to compress to much at once.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Currently, when the block processor aggreagtes fragments into a
fragment block, it applies the "don't compress" flag if any of the
original framgnets has it set, but the "align to device block" flag
is lost.
This commit ensures that both flags get applied to the fragment block
if set.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
1) If the block alignment flag is set, the padding bytes must be
inserted _before_ recording the start position, otherwise the
resulting image is not readable.
2) Also perform alignment if the flag is set on a fragment block.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This is a followup to dd4e6ead142e58568aec89d76b0b2e867ee983f2.
Basically the same problem occours with Bzip2, but it so far it wasn't
possible to find a sampel that reproduces it.
Unlike libxz, the libbz2 API does not support concatenated streams by
itself and will choke when trying to decompress after the stream end,
so this commit adds a workaround to simply initialize the decompressor
on-the-fly and tear it down again when and end-of-stream is returned.
The end-of-file condition is only set when there actually is no more
data to read. Otherwise, the decompressor will be re-initialized in
the next round.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Some xz compressed tarballs (e.g. from kernel.org) are not made up of
a single xz stream, but rather contain several, independendly
compressed streams. In that case, the xz decompressor hits
an LZMA_STREAM_END early on and reports EOF. If you are lucky, the tar
reader bails (premature end-of-file). If you are unlucky, it happens
exactely between two records and is interpeted as regular end-of-file.
As this seems to be a normal use case for xz, it has a flag to just
read across the seams and only report end-of-stream if the action
is set to finish.
This commit adds the flag to the initialization propperly sets the
lzma_action depending on whether the underlying stream hit EOF or not.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
On systems like Windows, the dynamic library and applications can
easily end up being linked against different runtime libraries, so
applications cannot be expected to be able to free() any malloc'd
pointer that the library returns.
This commit adds an sqfs_free function so the application can pass
pointers back to the library to call the correct free() implementation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This indicates that sync isn't possible on the underlying file
descriptor (e.g. a pipe), which currently causes sqfs2tar to err if
the output isn't written directly to a file.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This was already in the original block processor but got dropped by
accident when restructuring it.
The problem manifests itself when manually submitting fragment blocks.
They no longer get correct I/O queue tickets, clog up the queue and
the processor eventually throws an internal error.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If the path argument is "", we assume that referes to root and set
the *existing* target node to the root node and skip ahead across
the tree search. This leaves "name" uninitialized, which makes
coverity panic, because fs->root could be NULL, going down the wrong
path.
Obviously, this should never, *ever* happen and there is no reasonable
recovery strategy if it suddenly does, so simply add an assertion.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Only clean up the fragment if it hasn't been re-assigned to the
fragment block. The NULL check is definitely wrong, because we
no longer re-assign it as NULL.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This allows putting globbed files & directories into the filesystem
root, as well as explicitly setting attributes of the root directory
from the file lisiting.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Dequeuing won't work if we have a backlog of 1 or 2 and the blocks
are used for internal buffering. Take that into account, similar to
the sync code. Also bump the minimum backlog to 3, just to make
absolutely sure we cannot run into a dequeue loop trying to allocate
a block.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
It's rather simplistic and doesn't account for junction/reparse
points, which is the closest thing Windows has to symlinks, hard
links and mount points, but it's consistent with the unpacking code
that assumes Windows only has files and directories.
Using the 32 bit mingw toolchain, this seems to satisfy the unit
tests on wine.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
In the hash-table equals callback, if the hash and size match, do an
exact, byte-for-byte comparison of the fragment in question. The
fragment can either be in a fragment block that is in-flight (for which
we have the in-flight list), in the current, unfinished fragment block,
or it can be on disk.
In the later case, the fragment block is resolved through the fragment
table and read back from disk into a scratch buffer and decompressed.
After that, the fragment is checked for byte-for-byte equality with
the one we resolved through the hash table.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If we want full, byte-for byte, verification of fragments during
de-duplication we need to check back with the blocks already written
to disk, or with the ones that are in flight.
The previous, extremely hacky approach simply locked up the thread
pool and investigated the queues. For the new approach, we treat the
thread pool as completely opaque and don't try to touch it.
This commit modifies the block processor to keep duplicate copies of
each submitted fragment block around, that are cleaned up once the
block is dequeued and written to disk. So instead of touching the
thread pool, we can simply investigate the in-fligth-block list and
the current block, before resorting to reading back fragment blocks
from the file.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
When we already hold the mutex, try to pre-emtively dequeue items into
a "safe queue". When actually asked to dequeue, take blocks from there
first and avoid having to enter the critical section if possible.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Simply count the number of blocks we hand out (malloc'ed or recycled)
and decrease the counter when we put blocks back for recycling.
The sync() part becomes a little more complicated, because we can get
stuck with a backlog of 1 or 2 because we have a fragment or current
block buffer in use. We also need to accout for this when creating the
processor, because we need to be able to request at least 2 blocks
without stalling.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
A cleaner separation between common code, frontend code and backend
code is made.
The "is this byte blob zero" function is moved out to libutil (with
test case and everything) with a more optimized implementation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Hopeing that coverity can now tell the two appart.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Throw out the messy thread pool implementation and temporarily also
remove the exact fragment matching for simplicity.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The thread pool enforces ordering of items during dequeue similar
to the already existing implementation in libsqfs. The idea is to
eventually pull this functionality out of the block processor and
turn it into a cleaner, separately tested module.
The thread pool is implemented as an abstract interface, so we can
have multiple implementations around, including the serial fallback
implementation which we can then *always* test, irregardless of the
compile config and run through static analysis as well.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This has basically been copied over from Musl and slightly modifed.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit restructures the rbtree code to optionally use a pool
allocator for the nodes. The option is made depenend on the presence
of a pre-processor flag.
To the configure script is added an option to enable/disable the use
of custom allocators. It makes sense to still allow the malloc/free
based routes for better ASAN based instrumentation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Technically, this should *never* **ever** happen, because a SquashFS
file always starts with a super block, which isn't wrapped in a meta
data block, so a valid SquashFS file will never have a reason to read
from offset 0.
However, this does bite us when doing unit tests where the meta reader
and writer are used on an otherwise empty file. When trying to read
from offset 0, the caching code assumes that we already have that
block, since tha block_offset got initialized to 0.
This commit changes the initialization to set the current block
location to the maximum 64 bit integer, a location we are never
going to read from, since it will always be after the limit.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The intention is to get rid of all the ad-hoc array implementations
in the other components and cut down code size.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
By storing the blocks in a tree, the de-duplication can lookup
existing blocks in logartihmic instead of linear time.
The linked list is still maintained, because we need to iterate
over the blocks in creation order during serialization.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If we use the rb-tree in libsquashfs objects, we need to be able
top copy an entire tree as part of the object.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
|
|
Since the canonicalize_name function only fails if the path
contains ".." and the one we are constructing from the scanned
fstree (built using canonicalized names), it should NEVER fail.
However, coverity does get concerned, because we are checking the
return value elesewhere. So do what we do at other, similar locations
and add an assert().
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The base path is passed to the fstree_from_file function and in turn
to the individual callbacks.
The line parsing function is modified to allow '*' as mode, uid and gid
for specifically marked callbacks.
A glob callback is added that internally uses the fstree_from_dir scanning
functions in combination with a filter callback.
Directory scanning flags are parsed from the extra arguments before
interpreting it as a path fragment.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|