aboutsummaryrefslogtreecommitdiff
path: root/lib/tar
AgeCommit message (Collapse)Author
2023-06-11libio: move istream buffer logic into interall callbacksDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-06-11libio: remove precache from istream_advance_bufferDavid Oberhollenzer
Since the user has to call istream_get_buffered_data afterwards anyway, we can do the precache lazily. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-06-10libio: add desired read size to istream_get_buffered_dataDavid Oberhollenzer
This properly maps to all of our use cases and makes istream_precache obsolete. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-06-09libtar: overhaul buffer management in istream_t wrapperDavid Oberhollenzer
As soon as we no longer have any data to read, unlock/drop the parent iterator_t object. Also, make sure we get the buffer count right, not all data might have been consumed yet when precache is called. Remove the precache/read loop in the non-sparse case, we have already established that there is data available. If it is insufficient, the user will simply call precache again once it's used up, which istream_get_buffered_data forwards to a precache call in the underlying stream. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-06-09libio: remove eof flag from istream_t interfaceDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-06-09libio: eliminate direct access of the interal bufferDavid Oberhollenzer
Instead, go through helper functions, which in a next step can be moved inside the implementation. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-06-08Move tar compressor auto wrapping code from tar2sqfs into libtarDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-06-05Move dir_entry_xattr_t from libio to libsquashfsDavid Oberhollenzer
The structure and functions are renamed to sqfs_xattr_* instead, an additional helper is added to accept an encoded xattr. Documentation and unit test are added as well. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-06-05libio: remove buffer_offset from istream_tDavid Oberhollenzer
Instead, make the buffer const, let the user adjust the pointer and size. The offset can then be inferred in precache. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-06-04libio: Move istream_t precache logic into backend implementationDavid Oberhollenzer
The end goal is to remove direct buffer access from the istream_t interfaces and make that opaque. For the tar implementation, this already safes us needless buffer copying, as we essentially allow the user to read-through from the underlying stream. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-06-01libio: split dir_entry_t from dir_iterator_t, add create helperDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-05-22libtar: add a dir_iterator_t implementation for tar filesDavid Oberhollenzer
The existing istream_t wrapper is mered into this one as well, we can open the files via the iterators open_file_ro function. Unit tests and tar2sqfs are modified accordingly. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-05-16libtar: replace tar_xattr_t with dir_entry_xattr_t structDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-02-21libtar: generate entire xattr header in a single bufferDavid Oberhollenzer
Having it all in one buffer allows us the re-use the "generat GNU record" function. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-02-20Remove ostream_printfDavid Oberhollenzer
By cobbling together the xattr lines manually in libtar, the need (and thus the function itself) are removed. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-02-12libtar: Add a test for the tar writing codeDavid Oberhollenzer
Generate a simple tarball and compare it with a reference. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-02-08libtar: remove need for skip_padding functionDavid Oberhollenzer
In the istream implementation, automatically skip the padding when we reach end-of-file. Also skip file AND padding when we destroy the object. Replace the remaining instances with a simple istream_skip instead and remove the wrapper from libtar. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-02-08libtar: Add an istream_t implementationDavid Oberhollenzer
The tar_istream_t reads the data from a tar file, having been given the header, and synthesizes zero bytes for sparse regions. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-02-04libtar: internalize the declaration of read_octalDavid Oberhollenzer
Use read_number in the places that remain. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-02-04libtar: simplfy parsing of old GNU sparse formatDavid Oberhollenzer
There was some code duplication for extracting the sparse entry fields from the start record and the subsequent extended record. This commit introduces a data structure for both and unifies the parsing code paths. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-02-04libtar: some minor cleanupsDavid Oberhollenzer
- Use is_memory_zero from libutil - Move checksum update function to tar writer code - Move checksum verify function to tar reader code - Only export the function to compute the checksum Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-01-31Reintegrate test code with library codeDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-01-31Move library source into src sub-directoryDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2023-01-19libtar: simplify padd_file functionDavid Oberhollenzer
We have an "append_sparse" function in libio, with a fallback, so use that. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-11-18Add a single, central base64 decoderDavid Oberhollenzer
Similar to the hex blob decoder, we need this once for tar and once for the filemap xattr parser. Simply add a single, central implementation to libutil, with a simple unit test, and then use it in both libtar and gensquashfs. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-11-18Add a single, central hex blob decoderDavid Oberhollenzer
Since we need it twice (once for tar, once for the filemap xattr parser), add a single, central implementation to libutil, add a unit test for that implementation and then use it in both libtar and gensquashfs. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-07-08Cleanup: split libtar header, move to sub directoryDavid Oberhollenzer
Some of the on-disk format internals are moved to a separate header and some of the stuff from internal.h is moved to that format header. C++ guards are added in addtion. Everything PAX related is moved to pax_header.c, some internal functions are marked as static. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-03-30Cleanup: remove struct stat from libtarDavid Oberhollenzer
The idea was originally to use struct stat in the libfstree code, so we can simply hose data read from a directory into the fstree_t. The struct was then also used with libtar, for simpler interoperation, but it turned out to introduce a lot of platform quirks and causes more trouble than it's worth. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-03-30Cleanup: table driven pax header parsingDavid Oberhollenzer
Instead of having a long if-else-if chain, replace the PAX header field parsing with a table driven approach. Altough it is more code, it is hopefully more readable, maintainable, extensible and it dedupliates some of the value parsing code. The GNU.sparse parsing is left as is, because it requires maintaining state. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-03-30Cleanup: pax header parsingDavid Oberhollenzer
Split the key/value pairs right in the header and terminate the key name. This way, some of the magic numbers can be eliminated. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2022-03-10Cleanup libtar mkxattr, explicitly null-terminate stringsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-07-09Fix printf format specifiers used for generating tarballsDavid Oberhollenzer
When processing files > 4G, using "%o" truncates the result and the tarball is not readable. This should have been discovered when auto-patching the printf format specifiers, but a cast was added instead and the issue was overlooked. This commit replaces the down-cast and printf format specifiers. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25Add default cases for every switch blockDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2021-06-25Remove casual un-const casting in various placesDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-16Remodel libtar/tar2sqfs to read data from an istream_tDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-16Remodel file extraction tools to use libfstreamDavid Oberhollenzer
This commit rewrites the libtar write paths to use libfstream insead of a FILE pointer. Also, the libcommon file extraction function is remodeled to use libfstream. In accordance, rdsquashfs, sqfs2tar and sqfsdiff have some minor adjustments made to work with the ported libtar and libcommon. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-03Fix integer bounds checking in GNU tar sparse format 1.0 parserDavid Oberhollenzer
- Make sure the file actually has that many records before trying to read one and fail if not. - Use the helper macros for size_t overflow checking instead of assuming size_t == uint64_t. - Impose a "reasonable" upper bound on the number of data segments and insist that there is at least one entry. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-09-02Fix nonexistant gnu tar sparse format 1.0 supportDavid Oberhollenzer
Contrary to previous claims, support for the GNU tar sparse format 1.0 was missing entirely (the newest of their 3 different sparse mapping formats). This oversight wasn't caught, because the unit test was compiling the wrong source file and tar2sqfs had no problem processing the test file because it is still a valid POSIX-ish tar archive (but the sparse part was missing and the mapping embedded in the file). Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-08-16Fix libtar treatment of link targets that fill the header fieldDavid Oberhollenzer
The tar header has a 100 byte field for symlink and hard link targets. If the target is longer than 100 bytes, an extension header has to be used. However, it is perfectly valid to fill all 100 bytes to the brim without adding a null terminator. In case of a symlink, this can result in garbage link targets, while for hard links it results in an immediate error since the target cannot be resolved later on. This commit attempts to fix the problem by replacing the strdup of the link target with an strndup that copies at most the size of the target header field. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-30Cleanup: sqfs2tar: break up and simplify the repacking codeDavid Oberhollenzer
- Move the xattr extraction and repacking to xattr.c - Don't on-the-fly delete the tar xattr list, use the function from libtar.a - Split minor tasks into static helper functions - creating a libtar xattr struct from libsqfs xattr data - finding a hard link entry from current path and inode number Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-05-18libtar: fix size computation of PAX line lengthDavid Oberhollenzer
This commit attempts to fix the following two problems: - The number of digits computation returning an off-by-one result if the number is 10, or the resulting digit string starts with "10". This results in one-too-many padding bytes, corrupting the rest of the archive since the headers now don't start at multiples of 512 anymore. - Adding the line length prefix affects the line length (duh). If it grows far enough to require more digits, the result is a similar problem. This is a converging series that we need to compute the limit of. Unit tests for this still need to be added. Or maybe I can convince a bored undergrad student to provide an induction proof. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-22Skip PAX global headersDavid Oberhollenzer
Tar archives can contain set two kinds of PAX headers: - local headers that modify the attributes of the next file - global headers that set defaults for all files The later is used "... not widely used", according to tar(5) and has been deliberately not implemented. Some programs (e.g. git-archive) *do* generate them (in the case of git, it stores the commit hash). This commit adds a code path that skips a PAX global header entirely and resumes tar parsing, instead of erroneusly reporting it as an entry. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-17Remove some configure time sizeof checksDavid Oberhollenzer
In libtar, the sizeof time_t checked when trying to store a time value. It is pointless using the preprocessor here, as we can simply do an if (sizeof(time_t) < ...) check and the compiler will take care optimizing away one or the other branch. After changing the libtar check and the corresponding unit tests, the sizeof check can be removed from configure.ac, along with other unused sizeof checks. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-04-17Cleanup: split read_header.c in libtar.aDavid Oberhollenzer
Simply moving the pax header decoding to a separate file and splitting out the common helper functions should be a good start. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2020-02-28Cleanup pax header parser a littleDavid Oberhollenzer
This commit tries to untangle the logic of parsing and sanitizing the pax header length field and the associated bounds checks. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-23Add libtar.a function to create hardlink recordsDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-22Add hard link support to gensquashfs and tar2sqfsDavid Oberhollenzer
In libtar, set a special flag if the header is actually a hard link. In tar2sqfs, create a hard link node and skip the rest for hard links. Also refues to set the root attributes from a hard link, it may refere to a node that we have missed earlier, there is nothing else that we can do here. In fstree_from_file, add a "link" command for adding hard links. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-15Fix tar GNU sparse header parsingDavid Oberhollenzer
Extract the filename correctly. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-13Better support for reading/writing non-ASCII xattr values from/to tarDavid Oberhollenzer
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
2019-12-13Make the PAX header parser more strictDavid Oberhollenzer
Actually parse the length field and insist it matches the line length, then use the length field to advance to the next line. Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>