Age | Commit message | Author |
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
- Split block reading code out from "dump_blocks" into precache_data_block,
similar to precache_fragment_block
- Merge the code paths for fragment/data block reading and uncompression
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit creates a new data structure called 'sqfs_reader_t' that
takes care of all the repetitive tasks like opening the file, reading
the super block, creating the compressor, deserializing an fstree and
creating a data reader.
This in turn makes it possible to remove all the duplicate code from
rdsquashfs and sqfs2tar.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit moves the file unpacking order & job scheduling to a libfstree
function. The ordering is improved by making sure fragment blocks are not
extracted more than once and files with data blocks are extracted in order.
This way, serial unpacking of a 2GiB Debian live image could be reduced
from ~5' on my test machine to ~3.5', whereas parallel unpacking stays
roughly the same (~3' for -j 4).
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
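A rough sketch of one ordering that would satisfy the two goals named above
(fragment-only files grouped by fragment block, files with data blocks in
on-disk order). The struct, its fields and the function names are hypothetical
stand-ins, not the actual libfstree code:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a file entry. */
typedef struct {
	uint64_t start_offset; /* on-disk location of the first data block */
	uint32_t frag_index;   /* index of the fragment block, if any */
	int has_blocks;        /* non-zero if the file has full data blocks */
} unpack_entry_t;

static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

static int compare_entries(const void *a, const void *b)
{
	const unpack_entry_t *lhs = a, *rhs = b;
	int lhs_blocks = lhs->has_blocks != 0;
	int rhs_blocks = rhs->has_blocks != 0;

	/* Fragment-only files are sorted before files with data blocks. */
	if (lhs_blocks != rhs_blocks)
		return lhs_blocks - rhs_blocks;

	/* Group fragment-only files by fragment block, so that each
	   fragment block only has to be loaded once. */
	if (!lhs_blocks)
		return cmp_u64(lhs->frag_index, rhs->frag_index);

	/* Files with data blocks are visited in on-disk order. */
	return cmp_u64(lhs->start_offset, rhs->start_offset);
}

void sort_for_unpacking(unpack_entry_t *list, size_t count)
{
	qsort(list, count, sizeof(list[0]), compare_entries);
}
```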
|
|
reproducible-builds.org suggests the use of an environment variable
as a source for time stamps:
https://reproducible-builds.org/specs/source-date-epoch/
This commit adds support for setting the default mtime from the variable
if it is set, falling back to 0 only if it is not. A timestamp given by the
command line switch takes precedence.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
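The precedence described above could be sketched roughly as follows. The
function name and the choice of a 32 bit result are assumptions made for
illustration; only the SOURCE_DATE_EPOCH variable name comes from the spec:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: pick the default mtime.
 * Precedence: command line value > SOURCE_DATE_EPOCH > 0.
 * 'cli_value' is NULL if the command line switch was not given.
 */
uint32_t pick_default_mtime(const char *cli_value)
{
	const char *str = cli_value;
	unsigned long long value;
	char *end;

	if (str == NULL)
		str = getenv("SOURCE_DATE_EPOCH");

	if (str == NULL || *str == '\0')
		return 0;

	errno = 0;
	value = strtoull(str, &end, 10);

	/* Fall back to 0 on trailing garbage or out-of-range values. */
	if (errno != 0 || *end != '\0' || value > UINT32_MAX)
		return 0;

	return (uint32_t)value;
}
```

Falling back to 0 on a malformed value is a design choice of this sketch; a
real tool may prefer to print a warning or abort instead.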
|
|
This commit patches the tar writer to generate a PAX header with SCHILY
xattr key/value pairs if requested.
The Schily format is used for two reasons:
- It is simple
- It is apparently more widely supported than the libarchive format
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
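For illustration, a PAX extended header record has the form
"<length> <key>=<value>\n", where the decimal length counts the whole record
including the length field itself. A minimal sketch for a SCHILY.xattr record
follows. The function names are hypothetical, and the value is treated as a
NUL-terminated string for simplicity, although real xattr values may be binary:

```c
#include <stdio.h>
#include <string.h>

/* Number of decimal digits needed to print n. */
static size_t num_digits(size_t n)
{
	size_t digits = 1;

	while (n >= 10) {
		n /= 10;
		++digits;
	}
	return digits;
}

/*
 * Total length of "<len> SCHILY.xattr.<key>=<value>\n".
 * The length field counts itself, so iterate until it is stable.
 */
size_t pax_xattr_record_len(const char *key, size_t value_len)
{
	/* space + prefix + key + '=' + value + newline */
	size_t base = 1 + strlen("SCHILY.xattr.") + strlen(key) +
		      1 + value_len + 1;
	size_t len = base + num_digits(base);

	while (base + num_digits(len) != len)
		len = base + num_digits(len);

	return len;
}

/* Format one record into 'buf'; returns what snprintf returns. */
int pax_format_xattr(char *buf, size_t bufsz,
		     const char *key, const char *value)
{
	size_t len = pax_xattr_record_len(key, strlen(value));

	return snprintf(buf, bufsz, "%zu SCHILY.xattr.%s=%s\n",
			len, key, value);
}
```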
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Since it is actually completely independent of libsqfs and only works
on file_info_t lists, it can be safely moved over to libfstree and
the data writer becomes less cluttered as a result.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This simplifies some tasks, since we can just add a flag indicating that a
file has a fragment or that it has already been detected as a duplicate.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The strategy is simple:
- The data writer functions that write data/fragment blocks get
access to the list of files.
- When writing a fragment, we look for an already written file that has
a fragment with the same size and checksum.
- If we find one, we throw away the fragment and reuse the existing one.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
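The lookup step of this strategy could look roughly like the sketch below.
The struct and its field names are hypothetical; note that matching on size
and checksum alone, as described, cannot rule out checksum collisions:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified file list entry. */
typedef struct file_entry {
	struct file_entry *next;
	int has_fragment;
	uint32_t frag_size;
	uint32_t frag_chksum;
	uint32_t frag_index;  /* fragment block the tail end lives in */
	uint32_t frag_offset; /* byte offset inside that block */
} file_entry_t;

/*
 * Search the files written before 'current' for one whose fragment
 * has the same size and checksum. Returns NULL if there is none.
 */
const file_entry_t *find_duplicate_fragment(const file_entry_t *list,
					    const file_entry_t *current)
{
	const file_entry_t *it;

	for (it = list; it != NULL && it != current; it = it->next) {
		if (!it->has_fragment)
			continue;

		if (it->frag_size == current->frag_size &&
		    it->frag_chksum == current->frag_chksum)
			return it;
	}
	return NULL;
}
```

On a match, the caller would copy frag_index and frag_offset from the returned
entry and discard the new fragment data.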
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
After the table read unification, there wasn't much left of the fragment
reader and the remains could easily be moved over to the data reader.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Instead of doing DFS on the fly in gensquashfs, churn out a linked list
of all files in an archive.
Future improvements in packing strategies can go into this file.
This can also be useful for other purposes in the future, such as file
deduplication or as a work queue for the unpacker.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
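As a sketch of the idea, a depth-first walk can thread all regular files onto
a singly linked list. The node layout and function names below are hypothetical
and much simpler than a real tree node:

```c
#include <stddef.h>

/* Hypothetical, simplified tree node. */
typedef struct node {
	struct node *next_sibling;
	struct node *first_child; /* NULL for anything but directories */
	int is_regular_file;
	struct node *next_file;   /* link in the flattened file list */
} node_t;

/*
 * Append all regular files found in 'n', its siblings and their
 * children to the list. 'tail' points at the link to fill in next;
 * the updated tail pointer is returned.
 */
static node_t **collect_files(node_t *n, node_t **tail)
{
	for (; n != NULL; n = n->next_sibling) {
		if (n->is_regular_file) {
			n->next_file = NULL;
			*tail = n;
			tail = &n->next_file;
		} else {
			tail = collect_files(n->first_child, tail);
		}
	}
	return tail;
}

/* Returns the head of the file list in depth-first order. */
node_t *make_file_list(node_t *root)
{
	node_t *head = NULL;

	collect_files(root, &head);
	return head;
}
```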
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit attempts to make the generic table writer more readable.
A few changes are made, including heap allocation of the block list.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
struct stat uses time_t to store time values. On some 32 bit systems,
this may be a 32 bit integer.
This patch adds a broken-out 64 bit time value to tar_header_decoded_t
and makes sure to clamp the value to +/- (2^32 - 1) if required when
writing it back to a struct stat.
Reported-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
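The clamping step might be sketched as follows. The function name is
hypothetical, and the sketch assumes that a narrow time_t is a signed 32 bit
integer, so it uses INT32_MIN and INT32_MAX as bounds:

```c
#include <stdint.h>
#include <time.h>

/*
 * Convert a 64 bit time stamp to time_t. If time_t is narrower than
 * 64 bits (assumed here to be a signed 32 bit integer), clamp the
 * value instead of letting it wrap around.
 */
time_t clamp_to_time_t(int64_t value)
{
	if (sizeof(time_t) < sizeof(int64_t)) {
		if (value > INT32_MAX)
			return (time_t)INT32_MAX;
		if (value < INT32_MIN)
			return (time_t)INT32_MIN;
	}
	return (time_t)value;
}
```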
|
|
Requires that config.h be included before other headers, since the macro
_FILE_OFFSET_BITS changes the definitions of things like 'struct stat'.
I chose to simply include it at the top of every C file and
immediately after the double-inclusion guards of every header.
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
First of all, this commit adds a mod_time field to a tree node. When
creating the tree node, the field is set from the struct stat. When
scanning a directory, the time stamps from the input are used if set.
Second, the libsqfs code that reads inodes is modified to store the
mod_time from the inode in the fstree node and to write the tree node
into a generated inode.
Finally, tar2sqfs is modified to optionally keep the timestamps from
the tar archive instead of setting defaults. gensquashfs is similarly
modified to keep the input timestamps if specified.
The result is as follows:
- sqfs2tar will always carry the timestamps from the squashfs over
to the tar ball.
- tar2sqfs will set defaults, unless explicitly asked to preserve
the mtime from the tar ball.
- gensquashfs can optionally preserve the mtime from the input
hierarchy it processes if only --pack-dir is specified.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
- Store them in a struct instead of a hacky uint64_t with magic shifts
- Split up key/value pair write function to write_key and write_value
- Move the size accounting into those functions respectively
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit adds a reference count functionality to the string table
implementation and uses this functionality in the fstree code to
count how often each key and value is referenced by the deduplicated
Xattr blocks. This is needed to support deduplication through
out-of-band storage of xattrs.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If read_retry fails to read the expected amount of data (EOF or otherwise),
it is almost always an error.
This commit renames read_retry to read_data and moves error handling
into the function, making a lot of error handling code redundant.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
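A minimal sketch of such a wrapper around read(2) is shown below. The exact
signature and the error messages are assumptions; only the name read_data and
the idea of handling errors inside the function come from the text above:

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Read exactly 'size' bytes from 'fd' or fail. On failure, a message
 * prefixed with 'errstr' is printed and -1 is returned.
 */
int read_data(const char *errstr, int fd, void *data, size_t size)
{
	char *ptr = data;
	ssize_t ret;

	while (size > 0) {
		ret = read(fd, ptr, size);

		if (ret < 0) {
			if (errno == EINTR)
				continue; /* interrupted, simply retry */
			perror(errstr);
			return -1;
		}

		if (ret == 0) {
			/* EOF before we got everything we asked for */
			fprintf(stderr, "%s: short read\n", errstr);
			return -1;
		}

		ptr += ret;
		size -= (size_t)ret;
	}
	return 0;
}
```

Callers can then reduce their error handling to a single check of the return
value.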
|
|
If write_retry fails to write everything, it is *always* an error.
This commit renames write_retry to write_data and moves error handling
into the function, making a lot of error handling code redundant.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
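The write side could be sketched along these lines. The signature and messages
are assumptions; only the name write_data and the built-in error handling come
from the text above:

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write all 'size' bytes to 'fd' or fail. On failure, a message
 * prefixed with 'errstr' is printed and -1 is returned.
 */
int write_data(const char *errstr, int fd, const void *data, size_t size)
{
	const char *ptr = data;
	ssize_t ret;

	while (size > 0) {
		ret = write(fd, ptr, size);

		if (ret < 0) {
			if (errno == EINTR)
				continue; /* interrupted, simply retry */
			perror(errstr);
			return -1;
		}

		if (ret == 0) {
			/* no progress at all is treated as an error */
			fprintf(stderr, "%s: write truncated\n", errstr);
			return -1;
		}

		ptr += ret;
		size -= (size_t)ret;
	}
	return 0;
}
```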
|
|
Digging around in kernel internals and mksquashfs reveals that it is
actually a buffer offset into the raw directory buffer. The error
hasn't been noticed until now because of the bug fixed by a5428e0.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The added flags allow controlling the following on a per file level:
- forcing a file to be written uncompressed
- forcing a file to not have a fragment, i.e. the last truncated block
actually being written as a block
- padding a file to be aligned to the device block size
The flags are not yet exposed to anything user controllable (such as
command line flags).
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
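To illustrate, per-file flags of this kind and the two decisions they
influence might look like the sketch below. The flag names, values and helper
functions are hypothetical and not taken from the actual code:

```c
#include <stdint.h>

/* Hypothetical per-file flags. */
enum {
	FILE_FLAG_DONT_COMPRESS = 0x01, /* store blocks uncompressed */
	FILE_FLAG_DONT_FRAGMENT = 0x02, /* write tail end as a block */
	FILE_FLAG_ALIGN_DEVBLK = 0x04,  /* pad to device block size */
};

/* Padding bytes needed so 'offset' becomes a multiple of 'devblksz'. */
uint64_t padding_for_alignment(uint64_t offset, uint64_t devblksz)
{
	uint64_t rem = offset % devblksz;

	return rem == 0 ? 0 : devblksz - rem;
}

/*
 * Non-zero if the last, truncated block of a file should be stored
 * in a fragment rather than as a regular data block.
 */
int tail_goes_to_fragment(uint64_t file_size, uint32_t block_size,
			  int flags)
{
	if (flags & FILE_FLAG_DONT_FRAGMENT)
		return 0;

	return (file_size % block_size) != 0;
}
```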
|
|
Instead of writing meta data blocks directly to disk, the writer can
now alternatively keep the blocks in memory until explicitly told to
write to disk.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The directory listing stores a signed difference of the inode number.
Actually treating it as signed saves emitting extra headers if hard
links or file deduplication are finally implemented.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
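As a sketch of the check involved: to my understanding of the SquashFS format,
each directory entry stores its inode number as a signed 16 bit difference to
a base value in the preceding header, so a new header is only needed when that
difference does not fit. The function name is hypothetical:

```c
#include <stdint.h>

/*
 * Non-zero if 'inode_num' can be expressed as a signed 16 bit
 * difference relative to the header's 'base_inode'.
 */
int inode_fits_in_header(uint32_t base_inode, uint32_t inode_num)
{
	int64_t diff = (int64_t)inode_num - (int64_t)base_inode;

	return diff >= INT16_MIN && diff <= INT16_MAX;
}
```

With an unsigned interpretation, any entry with an inode number below the base
would force a new header. The signed one accepts both directions.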
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Some experiments seem to indicate that the various GNU extensions are
more widely supported than their POSIX equivalents[1]. Possibly because
they are easier to implement and possibly because of the widespread
use of GNU tar.
This commit replaces the PAX writer in the write_tar_header implementation
with a GNU extension based writer.
The writer is also cleaned up by removing all global state. The record
counter is moved outside into the tar2sqfs program and passed in as
function argument.
[1] https://dev.gentoo.org/~mgorny/articles/portability-of-tar-features.html
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit attempts to split some of the monolithic tar parsing code up
into multiple functions in separate files. Also, some code duplication
(like reading a record into memory which was implemented twice) is
removed.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit broadly does the following things:
- Rename and move the sparse mapping structure to libutil
- Add a function to the data writer for writing condensed versions
of sparse files, given the mapping.
- This shares code with the already existing function for regular
files. The shared code is moved to a common helper function.
- Add support to tar2sqfs for repacking sparse files.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit adds support for packing sparse files into squashfs images
as follows:
- In the data writer: simply detect zero blocks and write a zero to the
block size field and don't emit any data. Record the number of bytes
saved this way. For fragments, set the fragment offset to invalid.
- In the inode writer: write out the number of bytes saved for sparse
files. If there should be a fragment but there is none, append a block
count of 0.
- In the data reader: if the block size is 0, read nothing from disk and
emit an empty block. Do the same if the fragment is missing.
- In the inode reader: restore the number of bytes saved for sparse files.
The sparse files can be packed and unpacked, but the unpacking will not
create sparse files for now.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
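The data writer side of this could be sketched as below. The function names
are hypothetical, and compression is left out, so a non-zero block is simply
reported with its uncompressed size:

```c
#include <stddef.h>
#include <stdint.h>

/* Non-zero if the block consists entirely of zero bytes. */
int is_zero_block(const unsigned char *data, size_t size)
{
	size_t i;

	for (i = 0; i < size; ++i) {
		if (data[i] != 0)
			return 0;
	}
	return 1;
}

/*
 * Returns the value for the block size field. For an all-zero block
 * this is 0, nothing is written and the bytes saved are added to
 * '*sparse_bytes'. Otherwise the (here: uncompressed) size is
 * returned and the caller writes the data.
 */
uint32_t stored_block_size(const unsigned char *data, size_t size,
			   uint64_t *sparse_bytes)
{
	if (is_zero_block(data, size)) {
		*sparse_bytes += size;
		return 0;
	}
	return (uint32_t)size;
}
```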
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Instead of decomposing a default string in gensquashfs option processing,
move that to fstree_init instead and pass the option string directly to
fstree_init.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit removes handling of compressor names from gensquashfs. Instead,
functions are added to libcompress to obtain the name from an ID, the ID from a name
and to print out defaults.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
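A minimal sketch of such lookup functions, using the compressor IDs of the
SquashFS on-disk format. The function names are hypothetical and may differ
from what libcompress actually exports:

```c
#include <stddef.h>
#include <string.h>

/* Compressor IDs as used in the SquashFS super block. */
static const char *compressor_names[] = {
	[1] = "gzip",
	[2] = "lzma",
	[3] = "lzo",
	[4] = "xz",
	[5] = "lz4",
	[6] = "zstd",
};

#define NUM_NAMES (sizeof(compressor_names) / sizeof(compressor_names[0]))

/* Returns NULL if the ID is unknown. */
const char *compressor_name_from_id(int id)
{
	if (id < 1 || (size_t)id >= NUM_NAMES)
		return NULL;

	return compressor_names[id];
}

/* Returns -1 if the name is unknown. */
int compressor_id_from_name(const char *name)
{
	size_t i;

	for (i = 1; i < NUM_NAMES; ++i) {
		if (strcmp(compressor_names[i], name) == 0)
			return (int)i;
	}
	return -1;
}
```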
|