Age | Commit message | Author |
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
In the hash-table equals callback, if the hash and size match, do an
exact, byte-for-byte comparison of the fragment in question. The
fragment can either be in a fragment block that is in-flight (for which
we have the in-flight list), in the current, unfinished fragment block,
or it can be on disk.
In the latter case, the fragment block is resolved through the fragment
table and read back from disk into a scratch buffer and decompressed.
After that, the fragment is checked for byte-for-byte equality with
the one we resolved through the hash table.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
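
The comparison order described above can be sketched roughly as follows.
This is only an illustration: `frag_key_t` and `frag_equals` are
hypothetical names, not taken from the code base, and the step that
resolves the fragment data (in-flight list, current block or read-back
from disk) is assumed to have happened already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key type: hash, size and already resolved fragment data. */
typedef struct {
	uint32_t hash;
	uint32_t size;
	const unsigned char *data;
} frag_key_t;

/* Returns 1 if both fragments are identical, 0 otherwise. */
static int frag_equals(const frag_key_t *a, const frag_key_t *b)
{
	assert(a != NULL && b != NULL);

	/* cheap checks first */
	if (a->hash != b->hash || a->size != b->size)
		return 0;

	/* hash and size match: do the exact, byte-for-byte comparison */
	return memcmp(a->data, b->data, a->size) == 0;
}
```

Two fragments with the same hash and size but different contents are
reported as different, which is the case a hash-only check would miss.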
|
|
If we want full, byte-for-byte, verification of fragments during
de-duplication we need to check back with the blocks already written
to disk, or with the ones that are in flight.
The previous, extremely hacky approach simply locked up the thread
pool and investigated the queues. For the new approach, we treat the
thread pool as completely opaque and don't try to touch it.
This commit modifies the block processor to keep duplicate copies of
each submitted fragment block around, that are cleaned up once the
block is dequeued and written to disk. So instead of touching the
thread pool, we can simply investigate the in-flight-block list and
the current block, before resorting to reading back fragment blocks
from the file.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
When we already hold the mutex, try to pre-emptively dequeue items into
a "safe queue". When actually asked to dequeue, take blocks from there
first and avoid having to enter the critical section if possible.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Simply count the number of blocks we hand out (malloc'ed or recycled)
and decrease the counter when we put blocks back for recycling.
The sync() part becomes a little more complicated, because we can get
stuck with a backlog of 1 or 2 because we have a fragment or current
block buffer in use. We also need to account for this when creating the
processor, because we need to be able to request at least 2 blocks
without stalling.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
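
A minimal sketch of the counting scheme, under stated assumptions: the
type `proc_t` and the three functions are hypothetical, and clamping the
limit to 2 at creation time is just one possible way to guarantee that
two blocks can be requested without stalling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down processor state. */
typedef struct {
	size_t backlog;     /* blocks currently handed out */
	size_t max_backlog; /* limit before the caller has to wait */
} proc_t;

static void proc_init(proc_t *p, size_t max_backlog)
{
	assert(p != NULL);
	p->backlog = 0;
	/* Assumption: clamp so that 2 blocks can always be requested. */
	p->max_backlog = max_backlog < 2 ? 2 : max_backlog;
}

/* Returns 1 and counts the block if one may be handed out, else 0. */
static int proc_get_block(proc_t *p)
{
	if (p->backlog >= p->max_backlog)
		return 0;
	p->backlog += 1;
	return 1;
}

/* Called when a block is put back for recycling. */
static void proc_release_block(proc_t *p)
{
	assert(p->backlog > 0);
	p->backlog -= 1;
}
```

In this sketch it makes no difference whether a block was freshly
allocated or recycled; only the number of blocks in circulation counts.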
|
|
A cleaner separation between common code, frontend code and backend
code is made.
The "is this byte blob zero" function is moved out to libutil (with
test case and everything) with a more optimized implementation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
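
For illustration, one common way to speed up such a zero check is to
compare a machine word at a time. The function name `blob_is_zero` is
made up here and this is not claimed to be the libutil implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static bool blob_is_zero(const void *blob, size_t size)
{
	const unsigned char *ptr = blob;
	uint64_t word;

	assert(blob != NULL || size == 0);

	/* check 8 bytes at a time; memcpy avoids unaligned loads */
	while (size >= sizeof(word)) {
		memcpy(&word, ptr, sizeof(word));
		if (word != 0)
			return false;
		ptr += sizeof(word);
		size -= sizeof(word);
	}

	/* check the remaining tail byte by byte */
	while (size > 0) {
		if (*ptr != 0)
			return false;
		ptr += 1;
		size -= 1;
	}

	return true;
}
```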
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Hoping that Coverity can now tell the two apart.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Throw out the messy thread pool implementation and temporarily also
remove the exact fragment matching for simplicity.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The thread pool enforces ordering of items during dequeue similar
to the already existing implementation in libsqfs. The idea is to
eventually pull this functionality out of the block processor and
turn it into a cleaner, separately tested module.
The thread pool is implemented as an abstract interface, so we can
have multiple implementations around, including the serial fallback
implementation which we can then *always* test, regardless of the
compile config and run through static analysis as well.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
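
The idea of an abstract interface with a serial fallback could look
roughly like the sketch below. All names (`pool_t`, `serial_pool_t`,
`double_int` and so on) are hypothetical and the fixed-size ring buffer
is a simplification; the actual interface may differ.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*work_fn_t)(void *item);

/* Hypothetical abstract interface; implementations set the callbacks. */
typedef struct pool_t {
	int (*submit)(struct pool_t *pool, void *item);
	void *(*dequeue)(struct pool_t *pool);
} pool_t;

#define SERIAL_CAPACITY 16

/* Serial fallback: no threads, work is done when an item is dequeued. */
typedef struct {
	pool_t base; /* first member, so pool_t* can be cast back */
	work_fn_t fn;
	void *items[SERIAL_CAPACITY];
	size_t head, count;
} serial_pool_t;

static int serial_submit(pool_t *pool, void *item)
{
	serial_pool_t *sp = (serial_pool_t *)pool;

	if (sp->count == SERIAL_CAPACITY)
		return -1;
	sp->items[(sp->head + sp->count) % SERIAL_CAPACITY] = item;
	sp->count += 1;
	return 0;
}

static void *serial_dequeue(pool_t *pool)
{
	serial_pool_t *sp = (serial_pool_t *)pool;
	void *item;

	if (sp->count == 0)
		return NULL;
	item = sp->items[sp->head];
	sp->head = (sp->head + 1) % SERIAL_CAPACITY;
	sp->count -= 1;
	sp->fn(item); /* process in the calling thread */
	return item;  /* FIFO, so submission order is preserved */
}

static void serial_pool_init(serial_pool_t *sp, work_fn_t fn)
{
	assert(sp != NULL && fn != NULL);
	sp->base.submit = serial_submit;
	sp->base.dequeue = serial_dequeue;
	sp->fn = fn;
	sp->head = 0;
	sp->count = 0;
}

/* Example worker used for demonstration only. */
static void double_int(void *item)
{
	*(int *)item *= 2;
}
```

Callers only ever see `pool_t`, so a threaded implementation could be
swapped in without touching them, while the serial one stays testable
in every build configuration.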
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This has basically been copied over from Musl and slightly modified.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit restructures the rbtree code to optionally use a pool
allocator for the nodes. The option is made dependent on the presence
of a pre-processor flag.
An option is added to the configure script to enable/disable the use
of custom allocators. It makes sense to still allow the malloc/free
based routes for better ASAN based instrumentation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Technically, this should *never* **ever** happen, because a SquashFS
file always starts with a super block, which isn't wrapped in a meta
data block, so a valid SquashFS file will never have a reason to read
from offset 0.
However, this does bite us when doing unit tests where the meta reader
and writer are used on an otherwise empty file. When trying to read
from offset 0, the caching code assumes that we already have that
block, since the block_offset got initialized to 0.
This commit changes the initialization to set the current block
location to the maximum 64 bit integer, a location we are never
going to read from, since it will always be after the limit.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The intention is to get rid of all the ad-hoc array implementations
in the other components and cut down code size.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
By storing the blocks in a tree, the de-duplication can lookup
existing blocks in logarithmic instead of linear time.
The linked list is still maintained, because we need to iterate
over the blocks in creation order during serialization.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If we use the rb-tree in libsquashfs objects, we need to be able
to copy an entire tree as part of the object.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
|
|
Since the canonicalize_name function only fails if the path
contains ".." and the one we are constructing comes from the scanned
fstree (built using canonicalized names), it should NEVER fail.
However, coverity does get concerned, because we are checking the
return value elsewhere. So do what we do at other, similar locations
and add an assert().
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The base path is passed to the fstree_from_file function and in turn
to the individual callbacks.
The line parsing function is modified to allow '*' as mode, uid and gid
for specifically marked callbacks.
A glob callback is added that internally uses the fstree_from_dir scanning
functions in combination with a filter callback.
Directory scanning flags are parsed from the extra arguments before
interpreting it as a path fragment.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
So we can scan a sub-directory within the base directory without
having to do string operations first.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit mainly serves the static analysis tooling.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This function creates a block processor from a structure describing it.
A stub implementation for the old sqfs_block_processor_create is added
that simply sets up such a struct and forwards the call.
The current version of the description struct only contains the exact
same parameters and a size field at the beginning.
This approach is supposed to make extending the range of parameters
easier without breaking ABI compatibility.
Currently already planned are:
- Adding a sqfs_file_t pointer to double-check when deduplicating
fragments.
- When the scanning code reaches a usable state, add the ability
to pass scanned fragment data, so the block processor can be used
for appending to an existing image.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
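
The size-field pattern can be sketched as follows. The names
`proc_desc_t`, `proc_create_ex` and `proc_create` as well as the two
parameters are hypothetical stand-ins, and copying only as many bytes
as the caller declares is one common way to handle differing struct
versions, not necessarily what the library does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description struct; the size field comes first. */
typedef struct {
	size_t size; /* sizeof() of the struct as seen by the caller */
	size_t max_block_size;
	unsigned int num_workers;
} proc_desc_t;

/* Hypothetical, stripped-down processor object. */
typedef struct {
	size_t max_block_size;
	unsigned int num_workers;
} proc_t;

static int proc_create_ex(const proc_desc_t *desc, proc_t *out)
{
	proc_desc_t local;
	size_t count;

	assert(desc != NULL && out != NULL);

	if (desc->size < sizeof(desc->size))
		return -1;

	/* copy only what the caller provided; unknown fields stay zero */
	memset(&local, 0, sizeof(local));
	count = desc->size < sizeof(local) ? desc->size : sizeof(local);
	memcpy(&local, desc, count);

	out->max_block_size = local.max_block_size;
	out->num_workers = local.num_workers;
	return 0;
}

/* Old-style entry point: set up the struct and forward the call. */
static int proc_create(size_t max_block_size, unsigned int num_workers,
		       proc_t *out)
{
	proc_desc_t desc;

	memset(&desc, 0, sizeof(desc));
	desc.size = sizeof(desc);
	desc.max_block_size = max_block_size;
	desc.num_workers = num_workers;
	return proc_create_ex(&desc, out);
}
```

New fields can later be appended to the end of the struct; callers
compiled against the older layout pass a smaller size and get zeroed
defaults for everything they do not know about.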
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Keeping a list of fragments stored away in the current fragment block
and consolidating them in the thread pool takes them out of circulation.
If we have a lot of tiny fragments, this can lead to a situation where
the limit is reached, but we cannot do anything, because we are
waiting for a block to complete, but they are all attached to the
current fragment block and the queue is empty.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
It looks like the last commit missed a couple more occurrences
where '\' was treated incorrectly.
Fixes were still needed in sqfs_dir_reader_find_by_path and
sqfs_dir_reader_get_full_hierarchy.
This path is used in extras/browse.c.
|
|
All paths were canonicalized internally, which includes filtering
sequences of slashes and converting backslashes to slashes.
Furthermore, when unpacking files, filenames are sanity checked
and rejected if they contain forward OR backward slashes.
This is a problem on Unix-like systems, where files containing
backslashes are a legitimate use case (*cough* SystemD *cough*).
This patch removes the backslash conversion from the canonicalization
and modifies the sanity check to reject backslashes only on Windows.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
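
A sketch of a platform-dependent sanity check along these lines. The
function name `filename_is_sane` is hypothetical, and only the slash
handling described above is modelled; the real check may reject more
than that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool filename_is_sane(const char *name)
{
	assert(name != NULL);

	for (; *name != '\0'; ++name) {
		/* a forward slash is a path separator everywhere */
		if (*name == '/')
			return false;
#if defined(_WIN32)
		/* a backslash is only a path separator on Windows */
		if (*name == '\\')
			return false;
#endif
	}

	return true;
}
```

On a Unix-like system, a name such as `foo\bar` passes this check and
can be unpacked as a regular file name.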
|
|
- Instead of using the fstree root, let the caller specify it.
- Add a flag to prevent recursion into sub directories.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Instead of comparing (compressed, disk-size, checksum) tuples to find
block matches, do an exact, byte-for-byte comparison of the data
stored on disk to avoid the possibility of a spurious collision.
Since this is the desired behaviour, make it the default, optionally
overrideable through a flag.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|