Age | Commit message | Author |
|
This commit adds a function to libsquashfs.so for writing generic inodes
to a meta writer and another function to libsqfshelper.a that turns a
tree node to an inode. That way, the tree serialization code can be
expressed in terms of those functions and the bulk of the independent code
can be moved over to libsquashfs.so.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit adds a directory writer to libsquashfs that wraps a meta
data writer and provides a higher-level interface for writing directory
entries. Under the hood it enforces the rules that squashfs insists
upon.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
As an opaque struct it has a chance to change its layout in the future
without breaking ABI compatibility.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit moves stuff like printing help text, command line option
processing and enumerating available compressors on stdout out of
the generic compressor code.
The option string is replaced with a structure that directly exposes
the tweakable parameters for all compressors. A function for parsing
the command line arguments into this structure is added in sqfshelper.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Move declarations for stuff that is defined in libsquashfs.so into the
public headers and declarations for stuff that isn't, out of there.
Also move the meta reader/writer helper functions to their respective
headers.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The idea is to make libsquashfs.a independent of libfstree.a, so it becomes
a general purpose squashfs manipulation library. All the high level glue code
for libfstree.a and utilities that are overly specific to the tools are moved
to a separate library.
This commit makes the first step by moving the stuff with dependencies on
libfstree to a separate library.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The SquashFS kernel implementation insists that a directory header is
followed by no more than an upper bound of entries, way less than what
the field itself actually supports.
This commit makes sure that the meta_reader_read_dir_header function
also enforces that same limit.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
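A minimal sketch of what such a limit check might look like. It assumes that
the on-disk count field stores the number of entries minus one and that the
limit is 256 entries (the value of SQUASHFS_DIR_COUNT in the Linux kernel
sources). The helper and macro names are hypothetical and are not taken from
the actual meta reader code.

```c
#include <stdint.h>

/* Assumed limit: entries per directory header that the kernel accepts. */
#define DIR_HEADER_MAX_ENTRIES 256

/*
 * Hypothetical helper: validate the raw count field of a directory header.
 * The field is assumed to store (number of entries - 1).
 * Returns 0 if the header is acceptable, -1 if it must be rejected.
 */
static int check_dir_header_count(uint32_t raw_count)
{
	/* Compare before adding 1, so 0xFFFFFFFF cannot wrap around to 0. */
	if (raw_count >= DIR_HEADER_MAX_ENTRIES)
		return -1;

	return 0;
}
```

Comparing the raw value instead of raw_count + 1 keeps the check safe for the
largest value the 32 bit field can hold.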
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
In all cases where metadata blocks are read, we can roughly (in some
cases even precisely) say in what range those metadata blocks will be,
so it makes sense to throw an error if an attempt is made to wander
outside this range.
Furthermore, when reading from an uncompressed block, it is more reasonable
to check against the actual block bounds than to pad it with 0 bytes.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
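The two checks described above could be sketched roughly as follows. Both
function names, the half-open range convention and the return codes are
assumptions made for illustration; the real meta reader may look different.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: reject a metadata block position that lies outside
 * the half-open range [lower, upper) in which the blocks are expected.
 */
static int check_block_range(uint64_t pos, uint64_t lower, uint64_t upper)
{
	return (pos >= lower && pos < upper) ? 0 : -1;
}

/*
 * Hypothetical helper: copy size bytes from an uncompressed block, starting
 * at offset. Fails instead of zero-padding when the request does not fit.
 */
static int read_from_block(const uint8_t *block, size_t block_size,
			   size_t offset, void *dst, size_t size)
{
	/* Written so that offset + size can never overflow. */
	if (offset > block_size || size > block_size - offset)
		return -1;

	memcpy(dst, block + offset, size);
	return 0;
}
```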
|
|
This commit replaces some malloc(x + y * z) patterns, which can be found
with a simple git grep and are obvious candidates, with the new wrappers.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
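For illustration, an overflow-checked replacement for the malloc(x + y * z)
pattern might look like the sketch below. The name alloc_flex_checked, the
zero initialization and the use of EOVERFLOW are assumptions; the commit
message does not say what the actual wrappers look like.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: allocate base_size + item_size * nmemb bytes,
 * zero-initialized. Returns NULL and sets errno to EOVERFLOW if the size
 * computation would overflow size_t.
 */
static void *alloc_flex_checked(size_t base_size, size_t item_size,
				size_t nmemb)
{
	size_t total;

	/* Check the multiplication first. */
	if (item_size != 0 && nmemb > SIZE_MAX / item_size) {
		errno = EOVERFLOW;
		return NULL;
	}

	total = item_size * nmemb;

	/* Then check the addition. */
	if (total > SIZE_MAX - base_size) {
		errno = EOVERFLOW;
		return NULL;
	}

	return calloc(1, base_size + total);
}
```

A caller would use it as alloc_flex_checked(sizeof(*obj), sizeof(obj->items[0]), count)
instead of computing the sum inline.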
|
|
The tree deserializer does a recursive depth-first search to populate
the directory tree, moving back and forth between the directory listing
containing the inode references and the inode table pointing to the
list of child inodes. It is completely unaware of hard links and creates
duplicate nodes instead.
It is possible to create a malicious SquashFS image that contains a
directory that contains as child a reference to its own inode. This
can also be done transitively (i.e. directory contains its own parent
or grandparent), leading to infinite recursion (actually finite, since
it terminates once all stack memory is exhausted).
This commit adds a simple check to see if a node has the same inode
number as any of its would-be parents. If it does, the node is discarded
and a warning message is emitted.
Other cases with arbitrary layers of indirection could be constructed
as well (e.g. dir 'a' contains hard link to 'b' and 'b' one back to 'a'),
but since the sub hierarchies are always expanded, this check should catch
those too.
Reported-by: Zachary Dremann <dremann@gmail.com>
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
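The ancestor check described above can be sketched as follows. The node type
is a simplified, hypothetical stand-in and not the actual tree node layout;
only the idea of walking up the parent chain is taken from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical tree node: just a parent link and inode number. */
typedef struct node_t {
	struct node_t *parent;
	uint32_t inode_num;
} node_t;

/*
 * Return true if inode_num is already used by the would-be parent or by
 * any of its ancestors, i.e. adding the child would create a loop.
 */
static bool would_create_loop(const node_t *parent, uint32_t inode_num)
{
	while (parent != NULL) {
		if (parent->inode_num == inode_num)
			return true;

		parent = parent->parent;
	}

	return false;
}
```

The deserializer would call this before creating a child node and, if it
returns true, discard the entry and print a warning instead of recursing.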
|
|
An inode can be of extended type for reasons other than having extended
attributes and simply set the xattr ID to 0xFFFFFFFF to indicate that
it doesn't have extended attributes.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
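A small sketch of the resulting test. The macro and function names are made
up for this example; only the sentinel value 0xFFFFFFFF comes from the commit
message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sentinel xattr ID meaning "no extended attributes" (name is made up). */
#define XATTR_ID_NONE 0xFFFFFFFFu

/*
 * Hypothetical helper: an inode only has extended attributes if it is of
 * an extended type AND its xattr ID is not the sentinel value.
 */
static bool inode_has_xattrs(bool is_extended_type, uint32_t xattr_id)
{
	return is_extended_type && xattr_id != XATTR_ID_NONE;
}
```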
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
It is optimized to the maximum and if we already use zlib anyway,
why not use zlib crc32? This also makes zlib a hard dependency which
also means the whole "do we have a compressor" sanity check in the
build system can be removed.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This change removes the need for passing a list of files around for
deduplication. Also the deduplication code no longer needs to worry
about order, since the file being deduplicated is only added after
deduplication is done.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Only pad it if the *extracted* size is less than block size. Doing it
with the compressed size results in garbled blocks. Especially because
most of them are less than block size when compressed.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
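A rough sketch of the corrected padding logic, assuming a buffer of
block_size bytes whose first unpacked_size bytes hold the extracted data.
The function name and parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: zero-fill the tail of a block buffer. The decision
 * is based on the size AFTER extraction, never on the compressed size.
 */
static void pad_block(uint8_t *buf, size_t unpacked_size, size_t block_size)
{
	if (unpacked_size < block_size)
		memset(buf + unpacked_size, 0, block_size - unpacked_size);
}
```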
|
|
- Split block reading code out from "dump_blocks" into precache_data_block,
similar to precache_fragment_block
- Merge the code paths for fragment/data block reading and uncompression
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit creates a new data structure called 'sqfs_reader_t' that
takes care of all the repetitive tasks like opening the file, reading
the super block, creating the compressor, deserializing an fstree and
creating a data reader.
This in turn makes it possible to remove all the duplicate code from
rdsquashfs and sqfs2tar.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If we failed to create the root node, we don't need to clean up the
fstree_t which would attempt to recursively clean up the root node.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
We need to get the position _before_ writing the header, otherwise the
reader has no way to know the length of the value.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Data blocks need to be deduplicated before attempting to write a fragment.
In the current attempt if the data blocks are found to be duplicates but
the fragment isn't, the flushed fragments are purged as well, possibly
damaging other files.
Also, when the deduplication happens, the HAS_FRAGMENT flag needs to be
set, otherwise the deduplication code thinks that there is one more block
than there actually is.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Since it is actually completely independent of libsqfs and only works
on file_info_t lists, it can be safely moved over to libfstree and
the data writer becomes less cluttered as a result.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
A file is a complete duplicate if:
- It has no blocks, only a single fragment and that is a duplicate
- It has blocks but no fragment and the blocks are duplicate
- It has blocks and a fragment and both are duplicate
The previous version only counted the last one.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
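The three cases listed above can be written down as a single predicate. This
is only a sketch: the function name and the way the per-file state is passed
in are assumptions, not the actual data writer code.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical predicate: decide whether a file is a complete duplicate.
 *  - no blocks, only a fragment: the fragment must be a duplicate
 *  - blocks but no fragment:     the blocks must be duplicates
 *  - blocks and a fragment:      both must be duplicates
 * A file with neither blocks nor a fragment is not counted as a duplicate.
 */
static bool is_complete_duplicate(size_t block_count, bool has_fragment,
				  bool blocks_dup, bool fragment_dup)
{
	if (block_count == 0)
		return has_fragment && fragment_dup;

	if (!has_fragment)
		return blocks_dup;

	return blocks_dup && fragment_dup;
}
```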
|
|
If an entire file is eliminated, we need to reset the "used_bytes" counter,
otherwise, ALL the table positions are way off.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
We didn't allocate the ID table, so we don't need to free() it when
reading from disk fails.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Cut & paste mishap after merging with fragment reader: If there are
no fragments, data_reader_create should return the data reader, not 0!
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Simplifies some tasks if we can just add a flag that a file has a fragment
or that it has already been detected as a duplicate.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The strategy is as follows:
- At the beginning of every file, remember the current position
- Once a file is done, scan the list of existing files for the following:
  - Look for an existing file that has a block with the same size and
    checksum as the first non-sparse block of the current file
  - After that, every block in the current file has to match in size and
    checksum the ones in the file that we found, from that point onward
  - Sparse blocks in either file are skipped
- If we found a match, we update the current file to point to the first
  matching block and rewind the squashfs image to remove the newly written
  data
This strategy should in theory be able to find an existing file where the
on-disk data *contains* the on-disk data of the current file.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
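The matching step of this strategy could be sketched as below for a single
existing file. The blk_t type, the convention that a size of 0 marks a sparse
block, and the function names are assumptions for illustration. Rewinding the
image and updating the file's start position are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-block record; size 0 is taken to mean "sparse". */
typedef struct {
	uint32_t size;
	uint32_t checksum;
} blk_t;

/* Advance i past any sparse blocks. */
static size_t skip_sparse(const blk_t *b, size_t n, size_t i)
{
	while (i < n && b[i].size == 0)
		++i;

	return i;
}

/*
 * Return the index in old[] where the on-disk blocks of cur[] start,
 * or -1 if old[] does not contain them.
 */
static long find_block_match(const blk_t *old, size_t old_n,
			     const blk_t *cur, size_t cur_n)
{
	size_t first = skip_sparse(cur, cur_n, 0);
	size_t start, i, j;

	if (first == cur_n)
		return -1;	/* current file has nothing on disk */

	for (start = 0; start < old_n; ++start) {
		if (old[start].size == 0)
			continue;	/* a match cannot start on a sparse block */

		i = start;
		j = first;

		for (;;) {
			i = skip_sparse(old, old_n, i);
			j = skip_sparse(cur, cur_n, j);

			if (j == cur_n)
				return (long)start;	/* every block matched */
			if (i == old_n)
				break;			/* old file ran out */
			if (old[i].size != cur[j].size ||
			    old[i].checksum != cur[j].checksum)
				break;

			++i;
			++j;
		}
	}

	return -1;
}
```

Since only sizes and checksums are compared, a match found this way is as
reliable as the checksum itself.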
|
|
The strategy is simple:
- The data writer functions that write data/fragment blocks get
access to the list of files.
- When writing a fragment, we look for an already written file that has
a fragment with the same size and checksum.
- If we find one, we throw away the fragment and reuse the existing one.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
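The lookup step could be sketched like this. The file_t type and its fields
are a simplified, hypothetical stand-in for the real file list; only the
"same size and checksum" criterion is taken from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical file list entry. */
typedef struct file_t {
	struct file_t *next;
	bool has_fragment;
	uint32_t frag_size;
	uint32_t frag_checksum;
	uint32_t frag_index;	/* fragment block the data was stored in */
	uint32_t frag_offset;	/* offset inside that fragment block */
} file_t;

/*
 * Scan the list of already written files for one whose fragment has the
 * same size and checksum. Returns the entry, or NULL if there is none.
 */
static const file_t *find_fragment_dup(const file_t *list, uint32_t size,
				       uint32_t checksum)
{
	for (; list != NULL; list = list->next) {
		if (list->has_fragment && list->frag_size == size &&
		    list->frag_checksum == checksum)
			return list;
	}

	return NULL;
}
```

On a hit, the new file would copy frag_index and frag_offset from the entry
that was found instead of appending its own fragment data.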
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|