Age | Commit message | Author |
|
In the hash-table equals callback, if the hash and size match, do an
exact, byte-for-byte comparison of the fragment in question. The
fragment can either be in a fragment block that is in-flight (for which
we have the in-flight list), in the current, unfinished fragment block,
or it can be on disk.
In the latter case, the fragment block is resolved through the fragment
table and read back from disk into a scratch buffer and decompressed.
After that, the fragment is checked for byte-for-byte equality with
the one we resolved through the hash table.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
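The comparison order described above can be sketched roughly as
follows. This is a minimal illustration, not the actual libsquashfs
code: frag_key_t and frag_equals are hypothetical names, and the
fragment bytes are assumed to be resolved into memory already (from the
in-flight list, the current block, or the scratch buffer).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hash table key, not the actual libsquashfs structure. */
typedef struct {
	uint32_t hash;       /* checksum of the fragment data */
	uint32_t size;       /* fragment size in bytes */
	const uint8_t *data; /* fragment bytes, already resolved into memory */
} frag_key_t;

/*
 * Equals callback: do the cheap checks first (hash and size), then the
 * exact comparison, so a hash collision cannot alias two different
 * fragments.
 */
static bool frag_equals(const frag_key_t *a, const frag_key_t *b)
{
	if (a->hash != b->hash || a->size != b->size)
		return false;

	return memcmp(a->data, b->data, a->size) == 0;
}
```

Two keys with the same hash and size but different bytes compare as
unequal, which is the case the hash-only check would get wrong.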
|
|
If we want full, byte-for-byte verification of fragments during
de-duplication we need to check back with the blocks already written
to disk, or with the ones that are in flight.
The previous, extremely hacky approach simply locked up the thread
pool and investigated the queues. For the new approach, we treat the
thread pool as completely opaque and don't try to touch it.
This commit modifies the block processor to keep a duplicate copy of
each submitted fragment block around, which is cleaned up once the
block is dequeued and written to disk. So instead of touching the
thread pool, we can simply investigate the in-flight block list and
the current block, before resorting to reading back fragment blocks
from the file.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
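The lookup order (in-flight list, then current block, then disk) could
look something like the sketch below. All names here (inflight_block_t,
find_block) are made up for illustration and the real data structures
may differ. A NULL result stands for "not in memory, read it back from
the file".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of a submitted fragment block (illustration only). */
typedef struct inflight_block_t {
	struct inflight_block_t *next;
	uint32_t index;      /* fragment block index */
	const uint8_t *data; /* uncompressed copy kept until written to disk */
	size_t size;
} inflight_block_t;

/*
 * Look for an uncompressed copy of fragment block `index`, first in the
 * in-flight list, then in the current, unfinished block. Returns NULL if
 * neither has it, in which case the caller has to read it back from disk.
 */
static const inflight_block_t *find_block(const inflight_block_t *inflight,
					  const inflight_block_t *current,
					  uint32_t index)
{
	const inflight_block_t *it;

	for (it = inflight; it != NULL; it = it->next) {
		if (it->index == index)
			return it;
	}

	if (current != NULL && current->index == index)
		return current;

	return NULL;
}
```

Because the list only holds copies owned by the block processor, the
lookup never has to take any lock that belongs to the thread pool.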
|
|
A cleaner separation between common code, frontend code and backend
code is made.
The "is this byte blob zero" function is moved out to libutil (with
test case and everything) with a more optimized implementation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
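One common way to speed up such a zero check is to test a machine word
at a time instead of single bytes. The sketch below shows that idea
only. blob_is_zero is a hypothetical name, and the implementation that
was actually moved to libutil may be written differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns true if all `size` bytes at `blob` are zero. */
static bool blob_is_zero(const void *blob, size_t size)
{
	const unsigned char *ptr = blob;
	uint64_t word;

	/* Test 8 bytes at a time; memcpy avoids unaligned pointer casts. */
	while (size >= sizeof(word)) {
		memcpy(&word, ptr, sizeof(word));
		if (word != 0)
			return false;
		ptr += sizeof(word);
		size -= sizeof(word);
	}

	/* Check the remaining tail bytes individually. */
	while (size-- > 0) {
		if (*ptr++ != 0)
			return false;
	}

	return true;
}
```

An empty blob counts as zero here, which is an assumption of this
sketch rather than something the commit message states.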
|
|
Throw out the messy thread pool implementation and temporarily also
remove the exact fragment matching for simplicity.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This function creates a block processor from a structure describing it.
A stub implementation for the old sqfs_block_processor_create is added
that simply sets up such a struct and forwards the call.
The current version of the description struct only contains the exact
same parameters and a size field at the beginning.
This approach is supposed to make extending the range of parameters
easier without breaking ABI compatibility.
Currently already planned are:
- Adding a sqfs_file_t pointer to double-check when deduplicating
fragments.
- When the scanning code reaches a usable state, add the ability
to pass scanned fragment data, so the block processor can be used
for appending to an existing image.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
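The size-prefixed struct pattern can be illustrated with the sketch
below. The names (proc_desc_t, proc_create_ex, proc_create) and the
fields are hypothetical stand-ins, not the real libsquashfs
declarations. The point is only that the callee copies as many bytes as
the caller's size field claims, so fields added later stay zeroed for
binaries built against an older header.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description struct; the leading size field versions it. */
typedef struct {
	uint32_t size;           /* sizeof() the struct as the caller knows it */
	uint32_t max_block_size;
	uint32_t num_workers;
	uint32_t max_backlog;    /* imagine this was added in a later release */
} proc_desc_t;

typedef struct {
	proc_desc_t desc;
} proc_t;

static proc_t *proc_create_ex(const proc_desc_t *desc)
{
	proc_t *proc;
	size_t n = desc->size;

	/* Reject structs too small to even hold the size field. */
	if (n < sizeof(desc->size))
		return NULL;

	/* A newer caller may know more fields than we do; ignore the excess. */
	if (n > sizeof(proc->desc))
		n = sizeof(proc->desc);

	proc = calloc(1, sizeof(*proc));
	if (proc == NULL)
		return NULL;

	/* Copy only what the caller provided; unknown fields stay zeroed. */
	memcpy(&proc->desc, desc, n);
	proc->desc.size = sizeof(proc->desc);
	return proc;
}

/* Old entry point, kept as a stub that fills in the struct and forwards. */
static proc_t *proc_create(uint32_t max_block_size, uint32_t num_workers)
{
	proc_desc_t desc;

	memset(&desc, 0, sizeof(desc));
	desc.size = sizeof(desc);
	desc.max_block_size = max_block_size;
	desc.num_workers = num_workers;
	return proc_create_ex(&desc);
}
```

With this layout, a zero value has to be a sensible default for every
field that gets appended later, which is a design constraint worth
keeping in mind when extending the struct.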
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Keeping a list of fragments stored away in the current fragment block
and consolidating them in the thread pool takes them out of circulation.
If we have a lot of tiny fragments, this can lead to a situation where
the limit is reached, but we cannot make progress: we are waiting for
a block to complete, yet all blocks are attached to the current
fragment block and the queue is empty.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Fragment deduplication really doesn't belong in the public API of
the fragment table.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This function allows submission of raw blocks to the block processor,
completely bypassing the file API.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit modifies the block processor to support associating a user
data pointer with data blocks that it forwards to the block writer,
which is modified to accept an optional user data pointer.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Instead of merging fragments into the fragment block inside the
process_completed_fragment function, store a linked list of fragments
in the fragment block and do the actual merging (several memcpy calls
totaling up to 1M of data in the worst case) in the worker thread
instead of the locked, serial path.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
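The split between the cheap serial step and the expensive worker step
might look like the sketch below. fragment_t, frag_block_t and both
functions are hypothetical names for illustration. Threading and
locking are left out entirely, so this only shows which work happens
where.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct fragment_t {
	struct fragment_t *next;
	const uint8_t *data;
	size_t size;
} fragment_t;

typedef struct {
	fragment_t *first, *last; /* fragments queued for this block */
	size_t total;             /* sum of all queued fragment sizes */
} frag_block_t;

/* Serial (locked) path: constant-time list append, no data is copied. */
static void frag_block_append(frag_block_t *blk, fragment_t *frag)
{
	frag->next = NULL;

	if (blk->last == NULL)
		blk->first = frag;
	else
		blk->last->next = frag;

	blk->last = frag;
	blk->total += frag->size;
}

/*
 * Worker thread: do the actual memcpy calls into the output buffer.
 * Returns the number of bytes written, or 0 if `out` is too small.
 */
static size_t frag_block_merge(const frag_block_t *blk, uint8_t *out,
			       size_t max)
{
	const fragment_t *it;
	size_t used = 0;

	if (blk->total > max)
		return 0;

	for (it = blk->first; it != NULL; it = it->next) {
		memcpy(out + used, it->data, it->size);
		used += it->size;
	}

	return used;
}
```

The serial path now costs a few pointer updates per fragment, no matter
how large the fragment is.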
|
|
Instead of freeing/allocating blocks all the time in the locked,
serial path, use a free list to "recycle" blocks. Once a block is
no longer used, throw it onto the free list. If a new block is
needed, try to get one from the free list before calling malloc.
After a few iterations, the block processor should stop allocating
new blocks and only re-use the ones it already has.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
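A free list of this kind can be sketched as below. block_t,
block_pool_t and the pool_* functions are hypothetical names, and the
sketch assumes that every block has the same fixed capacity, which is
what makes recycling without any size bookkeeping possible.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct block_t {
	struct block_t *next;
	size_t size;          /* number of payload bytes currently used */
	unsigned char data[]; /* payload, capacity fixed by the pool */
} block_t;

typedef struct {
	block_t *free_list;
	size_t max_block_size;
} block_pool_t;

/* Take a block from the free list; only call malloc if it is empty. */
static block_t *pool_get(block_pool_t *pool)
{
	block_t *blk = pool->free_list;

	if (blk != NULL) {
		pool->free_list = blk->next;
	} else {
		blk = malloc(sizeof(*blk) + pool->max_block_size);
		if (blk == NULL)
			return NULL;
	}

	blk->next = NULL;
	blk->size = 0;
	return blk;
}

/* A block that is no longer used goes back onto the free list. */
static void pool_put(block_pool_t *pool, block_t *blk)
{
	blk->next = pool->free_list;
	pool->free_list = blk;
}

/* Release everything that is still parked on the free list. */
static void pool_cleanup(block_pool_t *pool)
{
	while (pool->free_list != NULL) {
		block_t *blk = pool->free_list;

		pool->free_list = blk->next;
		free(blk);
	}
}
```

Getting a block right after putting one back returns the same pointer,
so in steady state no allocator calls happen in the serial path.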
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If the block processor allocates and dynamically resizes inodes on
the fly, we can add data indefinitely without knowing the size of
the file ahead of time.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
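Growing an inode on the fly boils down to a resizable list of block
sizes. The sketch below uses a hypothetical file_inode_t with a
separately allocated array and a doubling strategy. The real inode
layout and growth policy are not described in the message above, so
both are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical in-memory file inode (not the real libsquashfs layout). */
typedef struct {
	uint64_t file_size;    /* uncompressed size accumulated so far */
	size_t num_blocks;     /* entries used in block_sizes */
	size_t max_blocks;     /* entries allocated in block_sizes */
	uint32_t *block_sizes; /* on-disk size of each data block */
} file_inode_t;

/* Append one block; grows the list as needed. Returns 0 on success. */
static int inode_add_block(file_inode_t *inode, uint32_t on_disk_size,
			   uint32_t data_size)
{
	if (inode->num_blocks == inode->max_blocks) {
		size_t new_max = inode->max_blocks ? inode->max_blocks * 2 : 4;
		uint32_t *new_list;

		new_list = realloc(inode->block_sizes,
				   new_max * sizeof(*new_list));
		if (new_list == NULL)
			return -1; /* the old list is still valid */

		inode->block_sizes = new_list;
		inode->max_blocks = new_max;
	}

	inode->block_sizes[inode->num_blocks++] = on_disk_size;
	inode->file_size += data_size;
	return 0;
}
```

The caller never states a total file size up front. It just keeps
appending blocks until the input ends.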
|
|
This commit moves the allocation helpers and string table functions
out of libsquashfs back into a "libutil.a". The problem of libsquashfs
exporting stuff that it shouldn't is resolved by retaining the internal
attributes and directly adding the source to libsquashfs instead of
trying to somehow link against libutil.la.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Implement the io-queue based design as outlined in doc/parallelism.txt.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Benchmarking and profiling determined xxhash32 to be faster than the
zlib implementation of crc32. Profiling also determined that crc32
computation contributed significantly to the overall runtime.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit moves all of the fragment/block accounting in the block
processor over to the writing end of the pipeline. In order to do
this, the sparse blocks are allowed to bubble through the pipeline.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Again, the generic init/cleanup functions do way too many things that
are specific to the thread pool implementation. Thanks to the splitting
up of the block processor, they also have become quite trivial. This
commit moves those functions into their respective implementations,
allowing even further simplification.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Under the assumption that block processing is CPU bound and not I/O
bound, this is entirely useless.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Make every dynamically allocated, opaque data structure inherit from
a common sqfs_object_t structure with common entry points (e.g. destroy).
This removes tons of public API functions and replaces them with a
simple sqfs_destroy instead. If semantics of the (until now implicit)
object system need to be extended, it can be done much more
conveniently this way.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
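The inheritance scheme can be sketched in plain C by embedding a base
struct as the first member. The names below (object_t, object_destroy,
table_t) are hypothetical and chosen to avoid implying anything about
the exact libsquashfs declarations. The live_tables counter exists only
to make the effect observable in this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical base type; every opaque object embeds it as first member. */
typedef struct object_t {
	void (*destroy)(struct object_t *obj);
} object_t;

/* The single public cleanup entry point, safe to call with NULL. */
static void object_destroy(void *obj)
{
	if (obj != NULL)
		((object_t *)obj)->destroy((object_t *)obj);
}

/* An example "derived" type. */
typedef struct {
	object_t base; /* must come first so the pointer cast is valid */
	size_t count;
	int *entries;
} table_t;

static int live_tables = 0; /* bookkeeping for the example only */

static void table_destroy(object_t *obj)
{
	table_t *tbl = (table_t *)obj;

	free(tbl->entries);
	free(tbl);
	live_tables -= 1;
}

static table_t *table_create(size_t count)
{
	table_t *tbl = calloc(1, sizeof(*tbl));

	if (tbl == NULL)
		return NULL;

	tbl->entries = calloc(count, sizeof(tbl->entries[0]));
	if (tbl->entries == NULL) {
		free(tbl);
		return NULL;
	}

	tbl->base.destroy = table_destroy;
	tbl->count = count;
	live_tables += 1;
	return tbl;
}
```

Callers then only need object_destroy(tbl), regardless of the concrete
type, and further entry points can be added to the base struct later
without touching every type's public API.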
|
|
It was basically built around the block processor and exposed way too
many internals. Removing it from other places was mostly trivial. This
commit completely removes it from the public API and even most of the
libsquashfs internals.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit moves the entire block writing and deduplication of data
blocks over to a different data type named "block writer".
For simplicity, the interfaces of the block processor are left as is
and are turned into wrappers. Likewise, most of the code in the block
writer is just verbatim from the block processor, to be cleaned up in
subsequent commits.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|