summaryrefslogtreecommitdiff
path: root/lib/fstree
AgeCommit message (Collapse)Author
2019-09-29Cleanup: fstree_from_file does not need to change working directoryDavid Oberhollenzer
The file_info_t no longer stores the size or other such information, so there is no need to do a stat on the input file. This also means that gensquashfs no longer needs to change the working directory when using the function. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-28Replace fstree/sqfshelper xattr code with sqfs_xattr_writer_tDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-28Move fstree selinux code to gensquashfsDavid Oberhollenzer
Same rational as for the dir-scanner code: It's actually the only user and it is going to get a lot closer integerated with libsquashfs. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-28Move fstree_from_dir to gensquashfs codeDavid Oberhollenzer
It's actually the only user and the dir-scanner xattr code is going to get a lot closer integerated with libsquashfs. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-27Add a header for platform compatibillity fluffDavid Oberhollenzer
- We don't have "endian.h" everywhere. On some BSDs its in sys and on some BSDs the macros have different names. - We definitely don't have sysmacros.h on non-Unix-like systems. - Likewise for sys/types.h, sys/stat.h and their contents. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-27Cleanup: replace fixed with data types with typedefsDavid Oberhollenzer
This is a fully automated search and replace, i.e. I ran this: git grep -l uint8_t | xargs sed -i 's/uint8_t/sqfs_u8/g' git grep -l uint16_t | xargs sed -i 's/uint16_t/sqfs_u16/g' git grep -l uint32_t | xargs sed -i 's/uint32_t/sqfs_u32/g' git grep -l uint64_t | xargs sed -i 's/uint64_t/sqfs_u64/g' git grep -l int8_t | xargs sed -i 's/int8_t/sqfs_s8/g' git grep -l int16_t | xargs sed -i 's/int16_t/sqfs_s16/g' git grep -l int32_t | xargs sed -i 's/int32_t/sqfs_s32/g' git grep -l int64_t | xargs sed -i 's/int64_t/sqfs_s64/g' and than added the appropriate definitions to sqfs/predef.h The whole point being better compatibillity with platforms that may not have an stdint.h with the propper definitions. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-27Cleanup: remove most of the payload pointer magic from libfstreeDavid Oberhollenzer
Now that dir_info_t and file_info_t have reasonably small, use them in tree_node_t directly instead of doing pointer arithmetic magic on the payload area. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-25Remove no-longer-used cruft from libfstreeDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-20Large round of dead code removalDavid Oberhollenzer
Remove all the library functions that no longer have any users. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-20Move canonicalize_name back to libutilDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-20Move "optimize unpack order" to from fstree to rdsquashfsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-20Remove parallel unpackingDavid Oberhollenzer
Parallel unpacking didn't really improve the speed that much. Actually sorting the files for optimized unpack order improved speed much more than the parallel unpacker. Furthermore, the fork based parallel unpacker was actually pretty messy to begin with. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-14Remove fstree file flagsDavid Oberhollenzer
As a side effect, this requires the data writer to keep track of statistics. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-14Move data deduplication from fstree code to data writerDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-11Cleanup Automake file for librariesDavid Oberhollenzer
Split the signel file up into several small ones and use a variable for the public headers instead of duplicating them. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-09-01Move some application specific stuff out of libutilDavid Oberhollenzer
This commit does the following: - canonicalize_name is moved to libfstree - source_date_epoch is only used inside libfstree, so it's also moved over and can later be completely internalized - print_version is moved over to sqfshelper. Mainly so it doesn't end up in libsquashfs.so for no sane reason. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-26Tune the paranoia down a bitDavid Oberhollenzer
size_t is guaranteed to be large enough to measure the size of things in memory, so when doing exactely that (e.g. strlen(a) + strlen(b)), checking for overflow is pointless since both objects are already in memory. If the addition would overflow, the two strings would occupy more memory than addressable. (Possible exception being some kind of harward style architecture with the two strings being in different kinds of memory of course.) Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-25Size accounting + alloc() overflow checking, round #3David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-23Size accounting + alloc() overflow checking, round #2David Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-23Some simple search/replace cases for allocationDavid Oberhollenzer
This commit exchanges some malloc(x + y * z) patterns that can be found with a simple git grep and are obvious for the new wrappers. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-19Fix file list generation: break any pre-existing connectionDavid Oberhollenzer
If the linked list pointer was already used before, break up the connection so we don't risk running into a loop or something when regenerating the list. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-19Fix memory leak in dir-scan error code pathDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-18cleanup: internalize deduplication list in data_writerDavid Oberhollenzer
This change removes the need for passing a list of files around for deduplication. Also the deduplication code no longer needs to worry about order, since the file being deduplicated is only added after deduplication is done. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-11Add gensquashfs option to read xattrs from input filesDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-11Add --one-file-system option to gensquashfsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-11Replace fstree_from_dir boolean with flag fieldDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-04Improve file unpacking orderDavid Oberhollenzer
This commit moves the file unpacking order & job scheduling to a libfstree function. The ordering is improved by making sure fragment blocks are not extracted more than once and files with data blocks are extracted in order. This way, serial unpacking of a 2GiB Debian live image could be reduced from ~5' on my test machine to ~3.5', whereas parallel unpacking stays roughly the same (~3' for -j 4). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-04Fix functions with side effect being used inside assertsDavid Oberhollenzer
If -DNDEBUG is set, the entire thing is omitted from the output. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-03cleanup: remove left over atime/ctime codeDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-08-02Implement support for SOURCE_DATE_EPOCH environment variableDavid Oberhollenzer
reproducible-builds.org suggests the use of an environment variable as a source for time stamps: https://reproducible-builds.org/specs/source-date-epoch/ This commit adds support for setting the default mtime from the variable, if it is set and only defaulting to 0 if not. The timestamp given by the command line switch takes precedence. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-30Add propper copyright headers to all source filesDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-29Cleanup: move deduplication code from data writer to fstreeDavid Oberhollenzer
Since it is actually completely independend of libsqfs and only works on file_info_t lists, it can be safely moved over to libfstree and the data writer becomes less cluttered as a result. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-29Simplify fstree sortingZachary Dremann
For merging, the use of a pointer to a pointer can simplify linked list operations For sorting, find the half-way point of the list in a single iteration over the list Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28Fix missing initialization of file fragment fieldsDavid Oberhollenzer
Despite having a flag for that now, they still need to be initialized because they are written straight to disk. Fixes: d4d1854aaed867d28ebfc97afb3518254ab6fd4b Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28Add general purpose flags field to file_info_tDavid Oberhollenzer
Simplifies some task if we can just add a flag that a file has a framgent or that it has already been detected as a duplicate. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-28Add fragment and block checksum fields to file_info_tDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-25Make sure symlink in fstree_mknode is always set when creating a symlinkDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-25Generate linear file list in fstreeDavid Oberhollenzer
Instead of doing DFS on the fly in gensquashfs, churn out a linked list of all files in an archive. Future improvements in packing strategies can go into this file. This can also be usefull for other purposes in the future, such as file deduplication or as a work queue for the unpacker. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-24libfstree: fix signed/unsigned comparisonsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-24cleanup: remove atime/ctime processing codeDavid Oberhollenzer
This commit removes all the code for parsing and processing atime/ctime and values and related test code. Caring about those is kind of pointless because squashfs can only store mtime in inodes. The only relevant place is when generating a struct stat from a squashfs inode or an fstree node. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-24Enable largefile supportMatt Turner
Requires that config.h be included before other headers, since the macro _FILE_OFFSET_BITS changes the definitions of things like 'struct stat'. I chose to simply include it at the top of every C file and at immediately after the double-inclusion guards of every header. Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-22Add a way to optionally keep the original time stampsDavid Oberhollenzer
First of all, this commit adds a mod_time field to a tree node. When creating the tree node, the field is set from the struct stat. When scanning a directory, the time stamps from the input are used if set. Second, the libsqfs code that reads inodes is modified to store the mod_time from the inode in the fstree node and to write the tree node into a generated inode. Finally, tar2sqfs is modified to optionally keep the timestamps from the tar archive instead of setting defaults. gensquashfs is similarly modified to keep the input timestamps if specified. The result is as follows: - sqfs2tar will always carry the timestamps from the squashfs over to the tar ball. - tar2sqfs will set defaults, unless explicitly asked to preserve the mtime from the tar ball. - gensquashfs can optionally preserve the mtime from the input hierarchy it processes if only --pack-dir is specified. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-21Cleanup xattr handlingDavid Oberhollenzer
- Store them in a struct instead of a hacky uint64_t with magic shifts - Split up key/value pair write function to write_key and write_value - Move the size accounting into those functions respectively Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-21Keep track of xattr key & value references AFTER deduplicationDavid Oberhollenzer
This commit adds a reference count functionality to the string table implementation and uses this functionality in the fstree code to count how often each key and value is referenced by the deduplicated Xattr blocks. This is needed to support deduplication through out-of-band storage of xattrs. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-17fstree: add support for spaces in filenamesDavid Oberhollenzer
This commit adds a mechanism to fstree_from_file to support filenames with spaces in them by quoting the entire string. Quote marks can still be used inside file names by escaping them with a backslash. Back slashes (if that is your thing) can also be escaped. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-12fstree: mknode: initialize fragment data, add extra blocksize slotDavid Oberhollenzer
This commit makes sure that regular file tree nodes have one more slot than necessary to support files that don't have fragments. Also, it initializes the fragment data, which should help to deduplicate some code ahead. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-10Make coverity happyDavid Oberhollenzer
The string is actually always null-terminated, since the calloc above adds one more byte than what we tell readlink about. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-07Fix regression in fstree_from_file device node formatDavid Oberhollenzer
The format is documented as "<c|b> major minor" but the parser was accidentally changed to require a colon in between. Fixes: 9864ea5b2045f4bd72633152d71dd1c7f8b0b7f9 Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-03cleanup: move tree node from path function to libfstree.aDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-07-01Remove never used overflow error message in fstree_from_fileDavid Oberhollenzer
Bug found using scan-build. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>