From b33efe3e0ef9d9d475f16719f5f28ce8e6a85f3e Mon Sep 17 00:00:00 2001
From: David Oberhollenzer
Date: Fri, 7 Jun 2019 13:28:18 +0200
Subject: Update README file

Remove aggressive remarks and give credit where credit is due. Generally
express things in a more positive way.

Signed-off-by: David Oberhollenzer
---
 README | 131 ++++++++++++++++++++++++++++++++++-------------------------------
 1 file changed, 69 insertions(+), 62 deletions(-)

diff --git a/README b/README
index fd68598..ff08801 100644
--- a/README
+++ b/README
@@ -2,73 +2,38 @@
  About
  *****

-This directory contains the source code of a collection of tools for working
-with SquashFS file systems.
+SquashFS is a highly compressed, read only file system often used as a root fs
+on embedded devices, live systems or simply as a compressed archive format.

-The `gensquashfs` program takes as input a file listing similar to the
-program `gen_init_cpio` in the Linux kernel source tree and produces a
-SquashFS image.
+Think of it as a .tar.gz that you can mount (or XZ, LZO, LZ4, ZSTD).

-The input list precisely specifies the directory structure, all permission bits
-and all UIDs & GIDs of who owns what.
+The file system itself and the user space tooling were originally developed by
+Phillip Lougher with third party contributions that have accumulated over time.

-The tool doesn't care if the directories, symlinks, device special files,
-etc... actually exist when packing a SquashFS image. The only thing that
-really has to exist are the input _files_ that can be placed arbitrarily in
-the file system by specifying input and target locations.
+Unfortunately, the original user space tooling does not support a lot of
+standard use cases, the source code of the tools is in a pretty deteriorated
+state and apparently no longer maintained.

-An SELinux labeling file can be specified to add SELinux tags.
+This package contains the source code of a complete rewrite of the user space
+tools that attempt to address many of the problems of the old tools:

-All directory entries are sorted by name and processed sequentially. All time
-stamps in the SquashFS image are set to a command line specified value (or 0
-by default). Thus the entire process should be deterministic, i.e. same input
-produces byte-for-byte the same output.
+ - Reproducible SquashFS images, i.e. deterministic packing without
+   any local time stamps.
+ - Linux `gen_init_cpio` like file listing for micro managing the
+   file system contents, permissions, and ownership without having to replicate
+   the file system (and especially permissions) locally.
+ - Support for SELinux contexts file (see selabel_file(5)) to generate
+   SELinux labels.
+ - Structured and (hopefully) more readable source code that should be better
+   maintainable in the long run.

-The `rdsquashfs` program can read a SquashFS image and produce file listings,
-extract individual files or entire sub trees of the image to a desired
-location.
+The tools in this package have different names, so they can be installed
+together with the existing tools:

-
- Why not use the official squashfs-tools?
- ****************************************
-
-The mksquashfs utility is semi-broken and generally a PITA to work with.
-
-For the typical use case of SquashFS (i.e. as rootfs for a live distro or an
-embedded system), it should be blindingly obvious that I might want to micro
-manage what goes into the file system, that UIDs/GIDs of the host system are
-garbage inside the image and that setting the desired permissions (e.g. suid)
-or SELinux labels on the input is completely out of the question. Also, it
-would be really cool if the whole thing was reproducible.
-
-All of this seems to have been an afterthought with mksquashfs. Some of it can
-be achieved with exclusion options, "pseudo files" or "filters" (a completely
-undocumented feature that not even `--help` tells you about).
-
-My main gripes with mksquashfs were the following:
-
- - I need to precisely replicate the entire filesystem for packing, even tough
-   the only thing actually needed are in theory the regular files.
- - Files in the input FS but not in the pseudo file are still packed
-   but with garbage UID/GID from the host system.
- - When I want files that are not owned by root, the root inode will get
-   garbage UID/GID from the host system and there is no way to change this.
- - mtime is read from the input file system and there is no way to override it.
- - Data is packed by a thread pool, i.e. in a non-deterministic way.
- - Extended attributes are read from the input file system, i.e. the only way
-   to get SELinux labels into the SquashFS filesystem is to set them on the
-   input data.
-
-That's at least what I can think of right now from the top of my head.
-
-It would be preferable to fix mksquashfs itself, but the source code is a
-horrid dumpster fire. It turned out to be easier to understand the structure
-of SquashFS by reading the available documentation plus kernel source and
-implementing the desired feature set from scratch.
-
-Furthermore, upstream seems to be unmaintained at the moment and the mailing
-list appears to be about as dead as SourceForge that hosts it.
+ - `gensquashfs` can be used to produce SquashFS images from `gen_init_cpio`
+   like file listings or simply pack an input directory.
+ - `rdsquashfs` can be used to inspect and unpack SquashFS images.


  Limitations
@@ -76,8 +41,7 @@ list appears to be about as dead as SourceForge that hosts it.

 At the moment, the following things still require some work:

- - documentation
- - testing
+ - more testing
   - extended attributes
     - currently limited to SELinux labeling only
     - rdsquashfs ignores them entirely
@@ -85,6 +49,49 @@ At the moment, the following things still require some work:
       storage but this is currently not used yet.
   - empty directories cannot have xattrs. The way I understand it, this is a
     design flaw in SquashFS. I hope I'm missing something here.
- - sparse files (not implemented yet)
+ - sparse files (not implemented yet; lots of zeros compress well anyway :P)
   - hard links (not implemented yet; do we even want this?)
+ - File deduplication (not implemented; do we even need this?)
   - NFS export tables (not implemented yet)
+
+
+ Future plans
+ ************
+
+In addition to the above, the following things would be really nice to
+have eventually:
+
+ - A tool for merging multiple images into one
+ - A tool for splitting an image
+ - A diff tool
+   - Diff of the directory tree of two images
+   - Diff of the file meta data in two images
+   - File level diffs
+   - Combinations of the above in a still human readable form
+ - [IN PROGRESS] A *complete* specification of the on-disk format and all the
+   arbitrary checks enforced by the kernel.
+ - Patching kernel and user space to support SquashFS on top of UBI
+ - Patching kernel and user space to support ACLs
+
+
+ Copyright & License
+ *******************
+
+The source code in this package has been written by me, David Oberhollenzer,
+in 2019 and is released under the terms and conditions of the GNU General
+Public License version 3 or later.
+
+To the best of my knowledge, no code has been copied over from the original
+SquashFS tools. The kernel documentation, the kernel headers and this web site
+have been used as main sources for understanding SquashFS:
+
+  https://dr-emann.github.io/squashfs/
+
+Some additional information (such as xattr implementation) has been gathered
+from various mailing lists and other web sources.
+
+Compressor implementations are primarily based on the documentation of the
+compression libraries.
+
+The existing unsquashfs tool and kernel implementation were used for trial and
+error testing during development.
--
cgit v1.2.3