path: root/lib/fstree
Age | Commit message | Author
2020-12-29 | Fix normalization of slashes in filenames | David Oberhollenzer
All paths were canonicalized internally, which includes filtering sequences of slashes and converting backslashes to slashes. Furthermore, when unpacking files, filenames are sanity checked and rejected if they contain forward OR backward slashes. This is a problem on Unix-like systems, where files containing backslashes are a legitimate use case (*cough* SystemD *cough*). This patch removes the backslash conversion from the canonicalization and modifies the sanity check to reject backslashes only on Windows. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
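The behaviour described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from libfstree: the names `collapse_slashes` and `name_is_sane`, and the `reject_backslash` parameter standing in for a Windows build check, are assumptions made for the example. The real canonicalization may also handle things this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: collapse runs of '/' in place and drop leading and
 * trailing slashes. Backslashes are left untouched, since on Unix-like
 * systems they are ordinary filename characters.
 */
void collapse_slashes(char *path)
{
	char *src = path, *dst = path;

	assert(path != NULL);

	while (*src == '/')
		++src;

	while (*src != '\0') {
		if (*src == '/') {
			while (*src == '/')
				++src;
			if (*src == '\0')
				break;	/* trailing slashes are dropped */
			*dst++ = '/';
		} else {
			*dst++ = *src++;
		}
	}

	*dst = '\0';
}

/*
 * Hypothetical check for a single path component when unpacking.
 * reject_backslash would be true on Windows and false elsewhere.
 */
bool name_is_sane(const char *name, bool reject_backslash)
{
	if (name[0] == '\0' || strcmp(name, ".") == 0 ||
	    strcmp(name, "..") == 0)
		return false;

	for (; *name != '\0'; ++name) {
		if (*name == '/')
			return false;
		if (*name == '\\' && reject_backslash)
			return false;
	}

	return true;
}
```

With `reject_backslash` set to false, a systemd-style unit name containing a backslash escape passes the check, while the same name is rejected when the flag is true.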
2019-12-23 | Simplify hard link handling | David Oberhollenzer
- For now, enforce that hard links don't point to directories.
- Instead of doing the swapping trickery, just reorder the flat list and hand out new inode numbers.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-23 | Minor cleanup in inode allocation | David Oberhollenzer
- Remove unnecessary counter argument, we already have the total count.
- Remove the return status, there is no failure branch.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-23 | Bring back the flat list of inodes in libfstree | David Oberhollenzer
It makes further processing simpler and doesn't leak the abstraction into upper layers. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-22 | Add hard link support to gensquashfs and tar2sqfs | David Oberhollenzer
In libtar, set a special flag if the header is actually a hard link. In tar2sqfs, create a hard link node and skip the rest for hard links. Also refuse to set the root attributes from a hard link; it may refer to a node that we have missed earlier, and there is nothing else that we can do here. In fstree_from_file, add a "link" command for adding hard links. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-22 | Add basic support for handling and serializing hard links | David Oberhollenzer
In libfstree, add a function to add a hard link to the fstree. The hard link stores the target in the data.target field, canonicalizes the target and sets a sentinel mode. A second function is used to resolve a link, i.e. replacing it with a direct pointer, setting another sentinel mode and increasing the target's link count. The post process function tries to resolve unresolved hard links and only allocates inode numbers for nodes that aren't hard links. If the target node of a hard link does not have an inode number yet, the two need to be swapped, since this is also the order in which they are serialized. The serialization function in libcommon simply has to skip hard link nodes and, when writing directory entries, use the inode num/ref of the target node. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
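The resolve step described above might look something like the following sketch. It is heavily simplified and all names are hypothetical: an enum stands in for the sentinel modes, a flat array lookup by name stands in for the tree lookup, and two separate fields stand in for whatever storage the real node uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the sentinel modes mentioned in the message. */
typedef enum {
	KIND_FILE,
	KIND_LINK_UNRESOLVED,
	KIND_LINK_RESOLVED
} kind_t;

typedef struct node {
	const char *name;
	kind_t kind;
	unsigned int link_count;
	const char *target_path;	/* meaningful while unresolved */
	struct node *target_node;	/* meaningful once resolved */
} node_t;

/*
 * Replace the target path of an unresolved link with a direct pointer,
 * mark the link as resolved and bump the link count of the target.
 * Returns 0 on success, -1 if the link is not unresolved or no target
 * with that name exists.
 */
int resolve_link(node_t *link, node_t *nodes, size_t count)
{
	size_t i;

	assert(link != NULL && nodes != NULL);

	if (link->kind != KIND_LINK_UNRESOLVED)
		return -1;

	for (i = 0; i < count; ++i) {
		if (nodes[i].kind != KIND_FILE)
			continue;
		if (strcmp(nodes[i].name, link->target_path) != 0)
			continue;

		link->target_node = &nodes[i];
		link->target_path = NULL;
		link->kind = KIND_LINK_RESOLVED;
		nodes[i].link_count += 1;
		return 0;
	}

	return -1;
}
```

A serializer following the message would then skip nodes whose kind is one of the link kinds and write directory entries using the inode number of `target_node`.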
2019-12-19 | Split the libfstree add_by_path tree traversal function out | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-18 | Add an explicit link count to the fstree nodes | David Oberhollenzer
Gets initialized to 2 for directories, 1 for all other types. The count of the parent node is automatically incremented. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
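As a rough illustration of the rule stated in this message, a node constructor could initialize the count like this. The struct layout and the name `node_init` are made up for the example, and it assumes that the parent's count is incremented for every child, which is how the message reads but is not confirmed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down tree node. */
typedef struct node {
	struct node *parent;
	bool is_dir;
	unsigned int link_count;
} node_t;

/*
 * Initialize a node: directories start with a link count of 2,
 * everything else with 1. The parent's count is bumped as a side effect.
 */
void node_init(node_t *n, node_t *parent, bool is_dir)
{
	assert(n != NULL);

	n->parent = parent;
	n->is_dir = is_dir;
	n->link_count = is_dir ? 2 : 1;

	if (parent != NULL)
		parent->link_count += 1;
}
```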
2019-12-18 | Rename fstree "slink_target" to "target" | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-18 | Move is_filename_sane to libfstree, add test cases | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-18 | Cleanup: internalize some fstree functions | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-18 | Cleanup: merge the fstree post processing functions | David Oberhollenzer
Instead of having 3 different functions for sorting the tree, numbering the nodes and generating a file list, that all have to be used in the right order, this commit merges them into a single "fstree_post_process" function. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-16 | Remove fstree inode table | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-11-25 | Cleanup: remove what is left of libutil | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-11-25 | Cleanup: move overflow safe alloc code into libsquashfs | David Oberhollenzer
There were only a handful of instances outside libsquashfs that used the alloc code. In most cases, the thing allocated had its size derived from something already in memory anyway, so it is safe to assume its size fits into a size_t. At the same time, the open-coded Windows path conversion functions are all unified into a single helper function. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-11-24 | Cleanup: move canonicalize_name back to libfstree.a | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-10-28 | Add macro for printf format specifier for size_t | David Oberhollenzer
The MSVC runtime is a weirdo C89 platform with some cherry-picked features from C99 (which do not include the "%zu" format specifier). This commit adds a macro with a size-dependent format specifier to be used instead. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
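One way such a macro could be built is sketched below. The macro name `EXAMPLE_PRI_SZ` is invented for this example and the selection logic is an assumption: it picks a specifier by comparing `SIZE_MAX` against the limits of the standard unsigned types, and falls back to the MSVC-specific `I64u` prefix on Windows. It relies on `size_t` having the same width as the chosen type.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical macro: pick a printf length modifier based on the
 * width of size_t instead of relying on "%zu". */
#if SIZE_MAX <= UINT_MAX
#define EXAMPLE_PRI_SZ "u"
#elif SIZE_MAX == ULONG_MAX
#define EXAMPLE_PRI_SZ "lu"
#elif defined(_WIN32)
#define EXAMPLE_PRI_SZ "I64u"
#else
#error "cannot determine a printf format specifier for size_t"
#endif

/* Format a size_t into buf using the macro, string-concatenated into
 * the format string the same way the PRIu32 family is used. */
int format_size(char *buf, size_t buflen, size_t value)
{
	assert(buf != NULL && buflen > 0);

	return snprintf(buf, buflen, "%" EXAMPLE_PRI_SZ, value);
}
```

Callers then write `printf("%" EXAMPLE_PRI_SZ " bytes\n", n)` instead of `printf("%zu bytes\n", n)`.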
2019-10-07 | Cleanup: move libutil related headers to "util" sub directory | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-29 | Fix inode numbering: always start with 1, use 0 as parent for the root | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-29 | Cleanup: fstree no longer has any use for the block size | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-29 | Cleanup: fstree_from_file does not need to change working directory | David Oberhollenzer
The file_info_t no longer stores the size or other such information, so there is no need to do a stat on the input file. This also means that gensquashfs no longer needs to change the working directory when using the function. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-28 | Replace fstree/sqfshelper xattr code with sqfs_xattr_writer_t | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-28 | Move fstree selinux code to gensquashfs | David Oberhollenzer
Same rationale as for the dir-scanner code: It's actually the only user and it is going to get a lot more closely integrated with libsquashfs. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-28 | Move fstree_from_dir to gensquashfs code | David Oberhollenzer
It's actually the only user and the dir-scanner xattr code is going to get a lot more closely integrated with libsquashfs. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-27 | Add a header for platform compatibility fluff | David Oberhollenzer
- We don't have "endian.h" everywhere. On some BSDs it's in sys and on some BSDs the macros have different names.
- We definitely don't have sysmacros.h on non-Unix-like systems.
- Likewise for sys/types.h, sys/stat.h and their contents.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-27 | Cleanup: replace fixed width data types with typedefs | David Oberhollenzer
This is a fully automated search and replace, i.e. I ran this:

    git grep -l uint8_t | xargs sed -i 's/uint8_t/sqfs_u8/g'
    git grep -l uint16_t | xargs sed -i 's/uint16_t/sqfs_u16/g'
    git grep -l uint32_t | xargs sed -i 's/uint32_t/sqfs_u32/g'
    git grep -l uint64_t | xargs sed -i 's/uint64_t/sqfs_u64/g'
    git grep -l int8_t | xargs sed -i 's/int8_t/sqfs_s8/g'
    git grep -l int16_t | xargs sed -i 's/int16_t/sqfs_s16/g'
    git grep -l int32_t | xargs sed -i 's/int32_t/sqfs_s32/g'
    git grep -l int64_t | xargs sed -i 's/int64_t/sqfs_s64/g'

and then added the appropriate definitions to sqfs/predef.h

The whole point being better compatibility with platforms that may not have an stdint.h with the proper definitions. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-27 | Cleanup: remove most of the payload pointer magic from libfstree | David Oberhollenzer
Now that dir_info_t and file_info_t are reasonably small, use them in tree_node_t directly instead of doing pointer arithmetic magic on the payload area. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-25 | Remove no-longer-used cruft from libfstree | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-20 | Large round of dead code removal | David Oberhollenzer
Remove all the library functions that no longer have any users. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-20 | Move canonicalize_name back to libutil | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-20 | Move "optimize unpack order" from fstree to rdsquashfs | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-20 | Remove parallel unpacking | David Oberhollenzer
Parallel unpacking didn't really improve the speed that much. Actually sorting the files for optimized unpack order improved speed much more than the parallel unpacker. Furthermore, the fork based parallel unpacker was actually pretty messy to begin with. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-14 | Remove fstree file flags | David Oberhollenzer
As a side effect, this requires the data writer to keep track of statistics. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-14 | Move data deduplication from fstree code to data writer | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-11 | Cleanup Automake file for libraries | David Oberhollenzer
Split the single file up into several small ones and use a variable for the public headers instead of duplicating them. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-01 | Move some application specific stuff out of libutil | David Oberhollenzer
This commit does the following:
- canonicalize_name is moved to libfstree
- source_date_epoch is only used inside libfstree, so it's also moved over and can later be completely internalized
- print_version is moved over to sqfshelper. Mainly so it doesn't end up in libsquashfs.so for no sane reason.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-26 | Tune the paranoia down a bit | David Oberhollenzer
size_t is guaranteed to be large enough to measure the size of things in memory, so when doing exactly that (e.g. strlen(a) + strlen(b)), checking for overflow is pointless since both objects are already in memory. If the addition would overflow, the two strings would occupy more memory than addressable. (Possible exception being some kind of Harvard-style architecture with the two strings being in different kinds of memory of course.) Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-25 | Size accounting + alloc() overflow checking, round #3 | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-23 | Size accounting + alloc() overflow checking, round #2 | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-23 | Some simple search/replace cases for allocation | David Oberhollenzer
This commit exchanges some malloc(x + y * z) patterns that can be found with a simple git grep and are obvious for the new wrappers. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
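For readers unfamiliar with the pattern, a wrapper replacing `malloc(x + y * z)` might be sketched as below. The function name `example_alloc_flex` and its exact behaviour (zero-initialized memory, `EOVERFLOW` on failure) are assumptions for illustration and are not taken from the project's wrappers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical overflow-checked replacement for
 * malloc(base_size + item_size * nmemb).
 * Returns NULL and sets errno to EOVERFLOW if the total size does not
 * fit into a size_t.
 */
void *example_alloc_flex(size_t base_size, size_t item_size, size_t nmemb)
{
	size_t size;

	/* item_size * nmemb must not wrap around */
	if (item_size != 0 && nmemb > SIZE_MAX / item_size) {
		errno = EOVERFLOW;
		return NULL;
	}

	size = item_size * nmemb;

	/* base_size + size must not wrap around either */
	if (size > SIZE_MAX - base_size) {
		errno = EOVERFLOW;
		return NULL;
	}

	/* invariant established by the two checks above */
	assert(base_size + size >= base_size);

	return calloc(1, base_size + size);
}
```

A typical call site would be a struct with a trailing flexible array, e.g. `example_alloc_flex(sizeof(*hdr), sizeof(hdr->items[0]), count)`.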
2019-08-19 | Fix file list generation: break any pre-existing connection | David Oberhollenzer
If the linked list pointer was already used before, break up the connection so we don't risk running into a loop or something when regenerating the list. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-19 | Fix memory leak in dir-scan error code path | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-18 | cleanup: internalize deduplication list in data_writer | David Oberhollenzer
This change removes the need for passing a list of files around for deduplication. Also the deduplication code no longer needs to worry about order, since the file being deduplicated is only added after deduplication is done. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-11 | Add gensquashfs option to read xattrs from input files | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-11 | Add --one-file-system option to gensquashfs | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-11 | Replace fstree_from_dir boolean with flag field | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-04 | Improve file unpacking order | David Oberhollenzer
This commit moves the file unpacking order & job scheduling to a libfstree function. The ordering is improved by making sure fragment blocks are not extracted more than once and files with data blocks are extracted in order. This way, serial unpacking of a 2GiB Debian live image could be reduced from ~5 minutes on my test machine to ~3.5 minutes, whereas parallel unpacking stays roughly the same (~3 minutes for -j 4). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-04 | Fix functions with side effect being used inside asserts | David Oberhollenzer
If -DNDEBUG is set, the entire thing is omitted from the output. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
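The class of bug fixed here can be illustrated with a small, made-up example. The functions `push` and `push_checked` are not from the code base; they only show why the call has to be moved out of the `assert` expression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function with a side effect: appends to a fixed-size stack. */
int push(int *stack, size_t *count, size_t capacity, int value)
{
	if (*count >= capacity)
		return -1;

	stack[(*count)++] = value;
	return 0;
}

void push_checked(int *stack, size_t *count, size_t capacity, int value)
{
	/*
	 * Broken variant:
	 *
	 *     assert(push(stack, count, capacity, value) == 0);
	 *
	 * With -DNDEBUG the whole expression disappears, so nothing
	 * would ever be pushed.
	 */
	int ret = push(stack, count, capacity, value);

	assert(ret == 0);
	(void)ret;	/* silences the unused warning when NDEBUG is set */
}
```

With the call hoisted out, the side effect happens in every build and only the check itself is compiled away under `NDEBUG`.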
2019-08-03 | cleanup: remove left over atime/ctime code | David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-02 | Implement support for SOURCE_DATE_EPOCH environment variable | David Oberhollenzer
reproducible-builds.org suggests the use of an environment variable as a source for time stamps: https://reproducible-builds.org/specs/source-date-epoch/ This commit adds support for setting the default mtime from the variable, if it is set and only defaulting to 0 if not. The timestamp given by the command line switch takes precedence. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
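The precedence described in this message (command line switch first, then the environment variable, then 0) could be sketched like this. Function names are hypothetical, and the 32-bit range check and the choice to silently fall back to 0 on malformed input are assumptions of this example rather than documented behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical parser: accept a plain decimal number that fits into
 * 32 bits, return 0 for anything else (unset, empty, garbage, too large).
 */
uint32_t parse_source_date_epoch(const char *str)
{
	unsigned long long value;
	char *end;

	/* reject NULL, empty strings, signs and leading whitespace */
	if (str == NULL || str[0] < '0' || str[0] > '9')
		return 0;

	errno = 0;
	value = strtoull(str, &end, 10);

	if (errno != 0 || *end != '\0' || value > UINT32_MAX)
		return 0;

	/* the narrowing cast below is lossless at this point */
	assert(value <= UINT32_MAX);
	return (uint32_t)value;
}

/* A value given on the command line wins over the environment. */
uint32_t choose_default_mtime(bool have_cli_value, uint32_t cli_value)
{
	if (have_cli_value)
		return cli_value;

	return parse_source_date_epoch(getenv("SOURCE_DATE_EPOCH"));
}
```

Keeping the string parsing separate from the `getenv` call makes the parser easy to exercise without touching the process environment.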