summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-01-19libsqfs: block processor: backport exact fragment matchingDavid Oberhollenzer
This commit is an amalgamation of the commits on master that implement exact matching of fragment blocks during deduplication. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-01-19libsqfs: implement exact matching in the default block writer.David Oberhollenzer
Instead of comparing (compresed, disk-size, checksum) tuples to find block matches, do an exact, byte-for-byte comparison of the data stored on disk to avoid the possibility of a spurious colision. Since this is the desired behaviour, make it the default, optionally overrideable through a flag. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-01-19Fix: Move fragment consolidation back to block processor serial partDavid Oberhollenzer
Keeping a list of fragments stored away in the current fragment block and consolidating them in the thread pool takes them out of circulation. If we have a lot of tiny fragments, this can lead to a situation where all the limit is reached, but we cannot do anything, because we are waiting for a block to complete, but they are all attached to the current fragment block and the queue is empty. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-01-15Fix more normalization of slashes in filenames.Scott Moser
It looks like the last commit missed a couple more occurences where '\' was treated incorrectly. Fixes were still needed in sqfs_dir_reader_find_by_path and sqfs_dir_reader_get_full_hierarchy. This path is used in extras/browse.c.
2020-12-29Fix normalization of slashes in filenamesDavid Oberhollenzer
All paths were canonicalized internally, which includes filtering sequences of slashes and converting backslashes to slashes. Furthermore, when unpacking files, filenames are sanity checked and rejected if they contain forward OR backward slashes. This is a problem on Unix-like systems, where files containing backslashes are a legitimate use case (*cough* SystemD *cough*). This patch removes the backslash conversion from the canonicalization and modifies the sanity check to reject backslashes only on Windows. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-12-29Fix: libsquashfs: xattr_writer: return NULL if calloc failsDavid Oberhollenzer
Instead of dereferencing the NULL pointer and crashing. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-12-29Minor "late night typing" fixes in documentationDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-11-05Patch level release 1.0.3v1.0.3David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-11-02Backport changes to builtin copy of zlibDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-11-02Backport changes to package scriptsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-11-02Backport changes to the test setupDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-11-02Backport changes to the benchmark writeupDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-10-31Add a test case for tar2sqfs root becomes file & link filteringDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-10-31Fix: tar2sqfs: if --root-becomes is used, also retarget linksDavid Oberhollenzer
In addition to skipping non-prefixed files and stripping the prefix off of entries we accept, the targets of links also have to be altered, since they can be absolute paths with the root prefix attached. This can affect symbolic links as well. Altough they are allowed to point into nowhere, across filesystem boundaries, they may also be absolute paths refering to existing locations in the filesystem, so no distinction is made by default. However, the later may be intended (e.g. only a subdirectory is packed into SquashFS and then mounted at that location), so a command line switch is added to disable symlink retargetting. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-29Fix: rdsquashfs: describe: don't escape the prefixed file input pathDavid Oberhollenzer
If rdsquashfs describes a filesystem, is configured to use a prefix-path and a file contains a space, the resulting input path has the prefix printed as is and the rest of the string in quotation marks. gensquashfs simply takes the entire rest of the line as is as its input path and no longer finds the file. Fix this by omitting the escapes and quotation marks for the input path. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-24Fix: tar2sqfs: when skipping non-prefixed files, also skip contentsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-24Fix: tar2sqfs: don't touch the path after determining it is brokenDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-03Patch level release 1.0.2v1.0.2David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-03Fix integer bounds checking in GNU tar sparse format 1.0 parserDavid Oberhollenzer
- Make sure the file actually has that many records before trying to read one and fail if not. - Use the helper macros for size_t overflow checking instead of assuming size_t == uint64_t. - Impose a "reasonable" upper bound on the number of data segments and insist that there is at least one entry. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-03Remove tar test cases for file flags (not applicable)David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-03Cleanup: reduce tar test cases to a few generic C filesDavid Oberhollenzer
This commit removes the existing tar test cases that simply call the generic test case function with several different paths with generic test case source files that are parameneterized via the pro-processor. For each tar archive, a separate test case is generated. On the one hand, this reduces the test source code to practically nothing. On the other hand, a test binary is generated for every distinct test case, instead of one per group and we get more detailed insights if something goes wrong. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-03Cleanup copy & paste mess in the tar parser test casesDavid Oberhollenzer
Many of the patterns tested are very repetetive. This commit moves the two common test cases out into helper functions and uses them for the test cases. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-02Fix nonexistant gnu tar sparse format 1.0 supportDavid Oberhollenzer
Contrary to previous claims, support for the GNU tar sparse format 1.0 was missing entirely (the newest of their 3 different sparse mapping formats). This oversight wasn't caught, because the unit test was compiling the wrong source file and tar2sqfs had no problem processing the test file because it is still a valid POSIX-ish tar archive (but the sparse part was missing and the mapping embedded in the file). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-29Add package scripts to the release tarballDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-26Fix rdsquashfs unpack under Windows if a directory existsDavid Oberhollenzer
Behave the same way as the POSIX port and do not treat that as an error. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-26Fix tree node path generation for detached sub treesDavid Oberhollenzer
The function sqfs_tree_node_get_path is used in several places within rdsquashfs to produce a path for a tree node, either when describing the file system, or when unpacking it. Unpacking can be done on sub-trees as well as the entire tree, in which case the root of the sub-tree has its parent pointer removed, so the full path terminates at the new root. This works with directories, since they receive special case handling anyway, but fails if the sub-tree to unpack is only a single file because the sqfs_tree_node_get_path function assumes that we are at the tree root and returns "/" as a path, which gets normalized to "". This commit adds a workaround to the function to simply use the nodes name (if available) in that case instead. The describe case in rdsquashfs is unaffacted, since it always starts at the root. Likewise, the sqfs2tar case should also be unaffacted, since it already employs special case handling for the [sub] tree root node. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-20Add spec file to build RPM packages.Sébastien Gross
Signed-off-by: Sébastien Gross <seb•ɑƬ•chezwam•ɖɵʈ•org>
2020-08-16Add a libtar test case for a completely filled link target fieldDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-16Fix libtar treatment of link targets that fill the header fieldDavid Oberhollenzer
The tar header has a 100 byte field for symlink and hard link targets. If the target is longer than 100 bytes, an extension header has to be used. However, it is perfectly valid to fill all 100 bytes to the brim without adding a null terminator. In case of a symlink, this can result in garbage link targets, while for hard links it results in an immediate error since the target cannot be resolved later on. This commit attempts to fix the problem by replacing the strdup of the link target with an strndup that copies at most the size of the target header field. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-12Add a "--stat" option to rdsquashfsDavid Oberhollenzer
This commit adds a --stat option to rdsquashfs that dumps a lot of information about and inode that tunred out to be usefull in debugging. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-12Add a clarifying note about documentation & build system copyrightDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-12Fix block processor single block with don't fragment flag bugDavid Oberhollenzer
This commit fixes a bug where the block processor state machine would not add the "last block" flag if there is only one not entirely filled block and the "don't fragment" flag is set. If the flag isn't set, the inode start block position is not updated and points to the beginning of the image instead. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-04Cleanup: move zlib/lz4 code from lib/sqfs/comp/ to lib/David Oberhollenzer
The source code of a modified liblz4 and zlib are included with the option to compile them into libsquashfs if they are not available on the system. So far, the source code was included directly in the compressor sub directory within libsqsuashfs. This commit moves the libraries out into the lib directory. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-04Add configuration for GitHub CodeQL static analysis toolsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-03Patch level release 1.0.1v1.0.1David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-03format.txt Fix magic text and remove stray tabsAnatoli Babenia
2020-07-29Fix: xattr reader: read the header after seaking to an OOL valueDavid Oberhollenzer
If an xattr value is stored OOL, the value actually holds an 8 byte reference to another, previously stored value. This reference points to the header that we need to read to know the actual size of the value before reading it, not the value itself, so after reading the reference and seeking to it, the xattr reader needs to read the actual header. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-29Document the file name limit imposed by the kernel implementationDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-20Fix block bounds checking in libsquashfs data readerDavid Oberhollenzer
Instead of doing the fragile size comparison in both loops, simply bail from the function if offset is out of bounds, clamp the size to the available range of the file and abail if it is zero. As a result, a lot of checks can be removed and the function will not return data beyond EOF. This problem occoured with files that have a short last block instead of a fragment. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-14.travis.yml: Install readline libraryMatt Turner
extras/browse is conditionally built if readline is available. Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-06-14extras: Pass flags argument to fix buildMatt Turner
Missed in commit 259a98985b4f (Add flags to functions that might logically be expanded in the future) Signed-off-by: Matt Turner <mattst88@gmail.com>
2020-06-13Fix: don't include alloca.h on systems that don't provide this headerv1.0.0David Oberhollenzer
This commit fixes a build issue on BSD based systems, where alloca is defined in stdlib.h and there is no such thing as "alloca.h". Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-13Update changelogDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-13Bump the so version number for libsquashfsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-13Minor documentation updateDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-12Fix the misspelled corpora test configure optionDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-12Add an explicit defition for the libsquashfs so versionDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-11Add flags to functions that might logically be expanded in the futureDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-09Cleanup: mark sqfs_xattr_writer_flush writer argument as constDavid Oberhollenzer
It does not make any changes to the writer itself, so mark it as const. This also requires some similar changes to the string table. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-06-09Cleanup: remove refcount adjusting in sqfs_xattr_writer_endDavid Oberhollenzer
After finding a match, reducing the reference count of the matched elements and increasing them afterwards leaves the reference count identical, because they refere to the same entries. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>