aboutsummaryrefslogtreecommitdiff
path: root/lib/sqfs/data_writer
AgeCommit message (Collapse)Author
2020-01-26Cleanup: Move fragment deduplication code over to fragment tableDavid Oberhollenzer
This removes further clutter from the data writer. Any future efforts on making fragment by hash lookup faster can focus on that area only and don't clutter the block processor. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-01-24Cleanup: use fragment table primitive in data writerDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-01-24Cleanup: remove single use helper functions from data writerDavid Oberhollenzer
This commit moves the single use helper functions that are called from worker thread context into the worker thread function. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-01-19Cleanup: remove the payload pointers from sqfs_inode_generic_tDavid Oberhollenzer
There are 3 types of extra payload: - Directory index - File block sizes - Symlink target This commit removes the type specific pointers and modifies the code to use the payload area directly. To simplify the file block case and mitigate alignment issues, the type of the extra field is changed to sqfs_u32. For symlink target, the extra field can simply be cast to a character pointer (it had to be cast anyway for most uses). For block sizes, probably the most common usecase, it can be used as is. For directory indices, there is a helper function anyway. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-13Merge windows and pthread thread pool implementationsDavid Oberhollenzer
Since they are both structured the same way using condition variables, they are only a few defines away from removing code duplication. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-13Cleanup data writerDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-12Don't kick off the threads until the queue is fullDavid Oberhollenzer
The idea is as follows: Initially let the user submit blocks until the queue is filled, then kick of the threads. Every thread will end up getting a block without any waits until they completely deplete the queue. Assuming the threads take longer to process the data than it takes the main thread to do I/O and submit new blocks, the queue should stay mostly filled with minimal wait times. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-12Fix thread pool queue accountingDavid Oberhollenzer
- ONLY manipulate the back log counter in the main thread. - Fix the order of operations when submitting blocks. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-08Replace SetEvent synchronization with condition variablesDavid Oberhollenzer
It's cleaner, more stable and works pretty much the same way as the pthread version. The downside is that the minimum target for the library is now Windows Vista, or Server 2008. But both are over a decade old anyway, so this shouldn't be an issue. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-08Add native Windows port of the multi-threaded data writerDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-05Minor data writer cleanupDavid Oberhollenzer
Move "do block" function over to the rest of the block related code and internalizie the pthread worker structure. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-05Fix pthread data writer interfering with signal handlingDavid Oberhollenzer
If a signal is sent to a process, the POSIX spec says that ANY thread could be picked ARBITRARILY to handle the signal. In our case this could be one of the internal worker threads. The problem here is that an unsuspecting user of the library might suddenly have their signal handler run in parallel to their main thread and run into weird concurrency issues, because they didn't expect that to happen. In fact, the libsquashfs API tries to be transparent about whether or not it uses a thread pool internally and does everything other than number crunching (e.g. I/O that may happen through user supplied callbacks) in the same thread as the one that submitts the blocks. Unfortunately, pthread doesn't have a way to set a signal mask for the new thread and setting it inside the thread is racy (i.e. a signal might be delivered before the worker thread sets the mask). The only portable and non-racy way to do this, is to disable all signals in the calling thread that sets up the data writer, create the threads (which will inherit the mask) and then resetore the previous signal mask, hoping for the best. The downside to this is that signals may be lost in that short time. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-11-25Cleanup: move overflow safe alloc code into libsquashfsDavid Oberhollenzer
There were only a hand full of instances outside libsquashfs that used the alloc code. In most cases, the thing allocated hat its size derived from something already in memory anyway, so it is safe to assume its size fits into a size_t. At the same time, the opencoded Windows path conversion functions are all unified into a single helper function. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-10-07Cleanup: move libutil related headers to "util" sub directoryDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-10-07Fix spelling of "align"David Oberhollenzer
Before the misspelled version has a chance to become stable API. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-10-07Improve ABI backward & forward compatibillityDavid Oberhollenzer
- Padd the compressor config union - 128 bytes aught to be enough for everyone, i.e. future compressors. - Insist that the padding space is initialized to 0. If a field gets added to an existing compressor, it can test for 0 as a sentinel value. - Add a size field to the hook structure, aka "the Microsoft way". - The explanation is in the comment. - Don't make the Microsoft mistake of checking for >=, insist on *exact* size match. Future users will need a fallback if their hooks are rejected. But at least they will be rejected instead of silently not being used. - Add an unsupported flag check to the dir tree reader. - Add a basic abi unit test that, for now, checks the size of the compressor config struct fields. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-29Cleanup: rename "compress.h" to "compressor.h"David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-29Bring back the don't fragment flagDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-29Make the data writer padding hookableDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-29Fix data writer: remember where we stored the paddingDavid Oberhollenzer
If we don't store the padding location in the deduplication list, the block deduplication might try to deduplicate across two files with padding in between, which can't be done. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-28Fix missing mutex unlick in data writer error pathDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-27Cleanup: replace fixed with data types with typedefsDavid Oberhollenzer
This is a fully automated search and replace, i.e. I ran this: git grep -l uint8_t | xargs sed -i 's/uint8_t/sqfs_u8/g' git grep -l uint16_t | xargs sed -i 's/uint16_t/sqfs_u16/g' git grep -l uint32_t | xargs sed -i 's/uint32_t/sqfs_u32/g' git grep -l uint64_t | xargs sed -i 's/uint64_t/sqfs_u64/g' git grep -l int8_t | xargs sed -i 's/int8_t/sqfs_s8/g' git grep -l int16_t | xargs sed -i 's/int16_t/sqfs_s16/g' git grep -l int32_t | xargs sed -i 's/int32_t/sqfs_s32/g' git grep -l int64_t | xargs sed -i 's/int64_t/sqfs_s64/g' and than added the appropriate definitions to sqfs/predef.h The whole point being better compatibillity with platforms that may not have an stdint.h with the propper definitions. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-27Cleanup: merge data.h into block.hDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-26Replace the data writer enqueue with "append buffer to current file"David Oberhollenzer
This commit turns the file interface into an actual, generic file interface and does away with having to move around blocks outside the data writer. Instead the data writer takes over full control and responsibility of dividing the input data up into blocks propperly. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-26Add file API stub to sqfs data writerDavid Oberhollenzer
Basically move the state tracking from the old data writer over to the new one. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-25Add the ability to hook into the data writer block writingDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-25Move sqfs_block_t to its own headerDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-25Rename block processor to sqfs_data_writer_tDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>