Age | Commit message (Collapse) | Author |
|
When processing files > 4G, using "%o" truncates the result and the
tarball is not readable. This should have been discovered when
auto-patching the printf format specifiers, but a cast was added
instead and the issue was overlooked.
This commit replaces the down-cast and printf format specifiers.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit rewrites the libtar write paths to use libfstream insead of
a FILE pointer. Also, the libcommon file extraction function is remodeled
to use libfstream.
In accordance, rdsquashfs, sqfs2tar and sqfsdiff have some minor
adjustments made to work with the ported libtar and libcommon.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
- Make sure the file actually has that many records before trying
to read one and fail if not.
- Use the helper macros for size_t overflow checking instead of
assuming size_t == uint64_t.
- Impose a "reasonable" upper bound on the number of data segments
and insist that there is at least one entry.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Contrary to previous claims, support for the GNU tar sparse format 1.0
was missing entirely (the newest of their 3 different sparse mapping
formats). This oversight wasn't caught, because the unit test was
compiling the wrong source file and tar2sqfs had no problem processing
the test file because it is still a valid POSIX-ish tar archive (but
the sparse part was missing and the mapping embedded in the file).
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
The tar header has a 100 byte field for symlink and hard link targets.
If the target is longer than 100 bytes, an extension header has to be
used.
However, it is perfectly valid to fill all 100 bytes to the brim
without adding a null terminator. In case of a symlink, this can
result in garbage link targets, while for hard links it results in
an immediate error since the target cannot be resolved later on.
This commit attempts to fix the problem by replacing the strdup of
the link target with an strndup that copies at most the size of the
target header field.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
- Move the xattr extraction and repacking to xattr.c
- Don't on-the-fly delete the tar xattr list, use the function
from libtar.a
- Split minor tasks into static helper functions
- creating a libtar xattr struct from libsqfs xattr data
- finding a hard link entry from current path and inode number
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit attempts to fix the following two problems:
- The number of digits computation returning an off-by-one result
if the number is 10, or the resulting digit string starts
with "10". This results in one-too-many padding bytes, corrupting
the rest of the archive since the headers now don't start at
multiples of 512 anymore.
- Adding the line length prefix affects the line length (duh). If it
grows far enough to require more digits, the result is a similar
problem. This is a converging series that we need to compute the
limit of.
Unit tests for this still need to be added. Or maybe I can convince a
bored undergrad student to provide an induction proof.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Tar archives can contain set two kinds of PAX headers:
- local headers that modify the attributes of the next file
- global headers that set defaults for all files
The later is used "... not widely used", according to tar(5)
and has been deliberately not implemented.
Some programs (e.g. git-archive) *do* generate them (in the case
of git, it stores the commit hash).
This commit adds a code path that skips a PAX global header entirely
and resumes tar parsing, instead of erroneusly reporting it as an
entry.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
In libtar, the sizeof time_t checked when trying to store a time value.
It is pointless using the preprocessor here, as we can simply do an
if (sizeof(time_t) < ...) check and the compiler will take care
optimizing away one or the other branch.
After changing the libtar check and the corresponding unit tests, the
sizeof check can be removed from configure.ac, along with other unused
sizeof checks.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Simply moving the pax header decoding to a separate file and splitting
out the common helper functions should be a good start.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit tries to untangle the logic of parsing and sanitizing the
pax header length field and the associated bounds checks.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
In libtar, set a special flag if the header is actually a hard link.
In tar2sqfs, create a hard link node and skip the rest for hard links.
Also refues to set the root attributes from a hard link, it may refere
to a node that we have missed earlier, there is nothing else that we
can do here.
In fstree_from_file, add a "link" command for adding hard links.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Extract the filename correctly.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Actually parse the length field and insist it matches the line length,
then use the length field to advance to the next line.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
In most cases, including unistd.h and fcntl.h was a left over anyway.
In the cases where it was not, move it to compat.h.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Instead, use stdio FILE pointers. On POSIX systems, use fileno to get
the file descriptor and hopefully create sparase files.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Its the only user. The other code doesn't touch raw file
descriptors anymore.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
It's only ever used for padding tarballs.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
- We don't have "endian.h" everywhere. On some BSDs its in sys and
on some BSDs the macros have different names.
- We definitely don't have sysmacros.h on non-Unix-like systems.
- Likewise for sys/types.h, sys/stat.h and their contents.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This is a fully automated search and replace, i.e. I ran this:
git grep -l uint8_t | xargs sed -i 's/uint8_t/sqfs_u8/g'
git grep -l uint16_t | xargs sed -i 's/uint16_t/sqfs_u16/g'
git grep -l uint32_t | xargs sed -i 's/uint32_t/sqfs_u32/g'
git grep -l uint64_t | xargs sed -i 's/uint64_t/sqfs_u64/g'
git grep -l int8_t | xargs sed -i 's/int8_t/sqfs_s8/g'
git grep -l int16_t | xargs sed -i 's/int16_t/sqfs_s16/g'
git grep -l int32_t | xargs sed -i 's/int32_t/sqfs_s32/g'
git grep -l int64_t | xargs sed -i 's/int64_t/sqfs_s64/g'
and than added the appropriate definitions to sqfs/predef.h
The whole point being better compatibillity with platforms that may
not have an stdint.h with the propper definitions.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This only exists for tar2sqfs. Move the sparse file map to libtar
and add the ability to do this into the stind sqfs_file_t abstraction,
so it acts like a normal file but internally stitches the data
together from the sparse implementation.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Split the signel file up into several small ones and use a variable
for the public headers instead of duplicating them.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If an extension header is rejected because its too big, the error path
would print the size as size_t, altough it is an uint64_t. On 64 bit
systems, this works because size_t is a 64 bit unsigned integer, on
32 bit systems, not so much.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit patches the tar writer to generate a PAX header with SCHILY
xattr key/value pairs if requested.
The Schily format is used for two reasons:
- It is simple
- It is apparently more widely supported than the libarchive format
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Since they are read directly into memory, blindly allocating the size
from the tar ball is probably a bad idea.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
struct stat uses time_t to store time values. On some 32 bit systems,
this may be a 32 bit integer.
This patch adds a broken-out 64 bit time value to tar_header_decoded_t
and makes sure to clamp the value to +/- (2^32 - 1) if required when
writing it back to a struct stat.
Reported-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit removes all the code for parsing and processing atime/ctime
and values and related test code.
Caring about those is kind of pointless because squashfs can only store
mtime in inodes. The only relevant place is when generating a struct
stat from a squashfs inode or an fstree node.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Requires that config.h be included before other headers, since the macro
_FILE_OFFSET_BITS changes the definitions of things like 'struct stat'.
I chose to simply include it at the top of every C file and at
immediately after the double-inclusion guards of every header.
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Until now, the tar checksum verification simply copied the header,
recomputed the checksum and compared it byte-for-byte with the
original.
However, not all implementations store the checksum the same way. For
instance, git-archive generates tar balls that use the same format as
for other octal numbers.
This patch makes the checksum verification more lenient by parsing the
checksum from the header and comparing it with the computed value
instead of copying the entire block and insisting on byte-for-byte
equivalence.
The result is better interoperabillity with existing tools and perhaps
slightly faster processing since the block doesn't have to be copied.
Reported-by: Matt Turner <mattst88@gmail.com>
Suggested-by: René Scharfe <l.s.r@web.de>
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If read_retry fails to read the expected amount of data (EOF or otherwise),
it is almost always an error.
This commit renames read_retry to read_data and moves error handling
into the function, making a lot of error handling code redundant.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If write_retry fails to write everything, it is *always* an error.
This commit renames write_retry to write_data and moves error handling
into the function, making a lot of error handling code redundant.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
Some experiments seem to indicate that the various GNU extensions are
more widely supported than their POSIX equivalents[1]. Possibly because
they are easier to implement and possibly because of the wide spread
use of GNU tar.
This commit replaces the PAX writer in the write_tar_header implementation
with a GNU extension based writer.
The writer is also cleaned up by removing all global state. The record
counter is moved outside into the tar2sqfs program and passed in as
function argument.
[1] https://dev.gentoo.org/~mgorny/articles/portability-of-tar-features.html
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
If the tar header name is exactely 100 bytes long, it does not have a
trailing null byte. Using strlen/strcpy on it effectively concatenates
the following fields contents to it.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|
|
This commit attempts to split some of the monolitic tar parsing code up
into multiple functions in seperate files. Also, some code duplication
(like reading a record into memory which was implemented twice) is
removed.
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
|