aboutsummaryrefslogtreecommitdiff
path: root/lib/util
AgeCommit message (Collapse)Author
2023-04-30Move the pattern matching from gensquashfs to dir_tree_iterator_tDavid Oberhollenzer
A simple unit test is added that mainly checks for the behavior of recursing into a sub-tree and only matching the children at the end, but not reporting the parents that don't match. The behavior is inteded to immitate the `find` command. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-29Move dir entry remapping from gensquashfs to libutilDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-29Cleanup: libutil: split functionality of dir tree iterator nextDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-29libutil: Add an option to the dir_tree_iterator_t to add a path prefixDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-29Move type based filtering to libutil dir_tree_iterator_tDavid Oberhollenzer
Limited unit testing for the flags is added, particularly the abillity to recurse into sub-directories, but not report the parents as individual entries. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-29libutil: Add a stacked, recursive directory tree iteratorDavid Oberhollenzer
The concept is simple: Use the existing, platform dependent iterator to walk a directory. If a directory entry is encountered, recurse into it using the open_subdir handler, reconstruct the full path for any entries discovered using the directory stack. An additional function is added to skip a sub-hierarchy. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-21libutil: unix: simplify/unify directory iterator error handlingDavid Oberhollenzer
Instead of printing out error messages, return an errro ID and match the behavior with the Windows implementation. Also, don't check first if the struct stat says it is a link, the readlinkat() system call will fail if it isn't. Avoid confusing the deputy. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-21Fix: libutil: type check bug in unix directory iterator read_linkDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-21libutil: Add a method to the directory iterator to open a sub directoryDavid Oberhollenzer
This is also the reason we need to lug around the original directory path on Windows. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-21libutil: win32: clenaup dir iterator initialization, retain the pathDavid Oberhollenzer
Instead of dropping the path immediately, store it inside the structure. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-21libutil: simplify win32 directory iterator state handlingDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-17Add unit test for directory iteratorDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-17Implement a version of the directory iterator for UnixDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-04-12Split out Windows directory iteration code to a dir_iterator_t typeDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-01-31Reintegrate test code with library codeDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-01-31Move library source into src sub-directoryDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-11-18Add a single, central base64 decoderDavid Oberhollenzer
Similar to the hex blob decoder, we need this once for tar and once for the filemap xattr parser. Simply add a single, central implementation to libutil, with a simple unit test, and then use it in both libtar and gensquashfs. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-11-18Add a single, central hex blob decoderDavid Oberhollenzer
Since we need it twice (once for tar, once for the filemap xattr parser), add a single, central implementation to libutil, add a unit test for that implementation and then use it in both libtar and gensquashfs. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-11-04Fix: update mempool accounting when freeing an objectDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-09-20block writer: move block comaprison to utility functionDavid Oberhollenzer
Slightly modify the byte-for-byte comparison function to compare an arbitrary range in a file and move it to libutil. Instead of calling it for each block in the block writer, simply let it check an entire range in the block writer and compute the range position/size of the reference ahead, before looking for potential matches. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-07-08Cleanup: move source date epoch code back to libutilDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-07-08Cleanup: move filename_sane & canonicalize_path functions to libutilDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-07-08Cleanup: move mkdir_p from libcommon to libutilDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-07-08Cleanup: move test.h to libutilDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-07-08Cleanup: move libutil headers to sub directoryDavid Oberhollenzer
Move all the libutil stuff from the toplevel include/ to a util/ sub directory and fix up the includes that make use of them. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-03-11More defensive programming in mem_pool_allocateDavid Oberhollenzer
Abort and retry in situations that should logically _never_ _ever_ happen. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-03-10Fix warning if __SIZEOF_INT128__ is not definedDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25libutil: cleanup alignment trickery in mempoolDavid Oberhollenzer
- Store the return value of the page allocation directly into the pool variable instead of an intermediate unsigned char pointer. - Make the blob[] array the same type as the bitmap, this saves us manual alignment trickery. - Cleanup the pointer arithmetic, let the compiler do the sizeof() multiplication. - Use uintptr_t for the manual alignment of the data pointer, so we don't run into signdness problems there. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-24Port the pool allocator to WindowsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22Threadpool: pre-emtively dequeue items after enqueingDavid Oberhollenzer
When we already hold the mutex, try to pre-emtively dequeue items into a "safe queue". When actually asked to dequeue, take blocks from there first and avoid having to enter the critical section if possible. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-22Cleanup the block processor file structureDavid Oberhollenzer
A cleaner separation between common code, frontend code and backend code is made. The "is this byte blob zero" function is moved out to libutil (with test case and everything) with a more optimized implementation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21Rename thread pool serial implementation data structureDavid Oberhollenzer
Hopeing that coverity can now tell the two appart. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21Add a thread pool implementation to libutilDavid Oberhollenzer
The thread pool enforces ordering of items during dequeue similar to the already existing implementation in libsqfs. The idea is to eventually pull this functionality out of the block processor and turn it into a cleaner, separately tested module. The thread pool is implemented as an abstract interface, so we can have multiple implementations around, including the serial fallback implementation which we can then *always* test, irregardless of the compile config and run through static analysis as well. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-21Force 64 bit alignment of blocks managed by the pool allocatorDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07Optionally use a pool allocator for rb-tree nodesDavid Oberhollenzer
This commit restructures the rbtree code to optionally use a pool allocator for the nodes. The option is made depenend on the presence of a pre-processor flag. To the configure script is added an option to enable/disable the use of custom allocators. It makes sense to still allow the malloc/free based routes for better ASAN based instrumentation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07Implement a custom memory pool allocatorDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-07Rewrite the str_table to internally use the more opimized hash_tableDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Add a generic implementation of a dynamic array to libutilDavid Oberhollenzer
The intention is to get rid of all the ad-hoc array implementations in the other components and cut down code size. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Add a context pointer to the rbtree key comparisonDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-03-06Add a copy function to the rb-tree implementationDavid Oberhollenzer
If we use the rb-tree in libsquashfs objects, we need to be able top copy an entire tree as part of the object. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-01-19Add a user pointer to the hash table implementationDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-09Cleanup: mark sqfs_xattr_writer_flush writer argument as constDavid Oberhollenzer
It does not make any changes to the writer itself, so mark it as const. This also requires some similar changes to the string table. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-21hash table: switch to sqfs_* types, mark functions as hiddenDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-19Cleanup: move hash table header to include directoryDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-27Enable uint128_t pathMatt Turner
I forgot to enable this when I copied it over from Mesa. Mesa's meson configuration system checks that a C program using the uint128_t type compiles, but I think this is likely unnecessary. Simply check the macro that clang and gcc define. This cuts the .text size of hash_table.o by 160 bytes or about 4% on my system. Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-04-27Add hash table code to libutil.aDavid Oberhollenzer
Not only does this build the hashtable into libutil.a, it also makes sure the headers end up in the distribution tarball. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-22Import and use Mesa's hash tableMatt Turner
With `perf record`/`perf report` I saw that 30% of the time was spent in `sqfs_frag_table_find_tail_end` with tar2sqfs for a tarball containing the Gentoo ebuild repository (many thousands of small files). The reason was the bucketing hash table in frag_table.c: too many elements in too few buckets meant lots of walking over the linked lists. This patch replaces that hash table with the hash table implementation from Mesa. Its implementation is more complex (is is an open-addressing, linear-reprobing) hash table, but it is much better suited for the task. On my 4c/8t Skylake, the time to run tar2sqfs drops from 7.5s to less than 3s. CPU usage increases from ~207% to ~356%, presumably indicating an increase in available parallelism due to the removal of the hash table as a bottleneck. The `perf report` profile with this patch shows that the time spent in `sqfs_frag_table_find_tail_end` has dropped from ~30% to 0.01%. Output from ministat: x before + after N Min Max Median Avg Stddev x 20 7.476 7.685 7.5725 7.5615 0.051254268 + 20 2.79 2.901 2.846 2.84475 0.03543842 Difference at 95.0% confidence -4.71675 +/- 0.0282015 -62.3785% +/- 0.241477% (Student's t, pooled s = 0.0440618) I imported only the bits of the hash table implementation that were needed for frag_table.c. Among the changes I made after importing are - removed usage of ralloc, Mesa's recursive memory allocator - Replaced ralloc -> malloc ralloc_free -> free rzalloc_array -> calloc - Removed mem_ctx parameters - Added free()s to the appropriate places (valgrind confirms there are no leaks) - removed _mesa_-prefix from function names Fixes: #40 Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-03-18Restore workaround for unaligned reads in xxhashDavid Oberhollenzer
The code was originally used inside the block processor, where 32 bit aligned data could be guaranteed. If it is available in libutil, I cannot possibly guarantee for alignment in future use elsewhere. Even for the block processor it was rather risky "remember this detail very well" buisness. This commit restores the unaligned read treatment of the original. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-18Cleanup: Move xxhash32 code to libutilDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-03-04Add a deep copy function for the str_table_t helperDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>