Diffstat (limited to 'README')
-rw-r--r-- README 131
1 files changed, 69 insertions, 62 deletions
diff --git a/README b/README
index fd68598..ff08801 100644
--- a/README
+++ b/README
@@ -2,73 +2,38 @@
About
*****
-This directory contains the source code of a collection of tools for working
-with SquashFS file systems.
+SquashFS is a highly compressed, read only file system often used as a root fs
+on embedded devices, live systems or simply as a compressed archive format.
-The `gensquashfs` program takes as input a file listing similar to the
-program `gen_init_cpio` in the Linux kernel source tree and produces a
-SquashFS image.
+Think of it as a .tar.gz that you can mount (or XZ, LZO, LZ4, ZSTD).
-The input list precisely specifies the directory structure, all permission bits
-and all UIDs & GIDs of who owns what.
+The file system itself and the user space tooling were originally developed by
+Phillip Lougher with third party contributions that have accumulated over time.
-The tool doesn't care if the directories, symlinks, device special files,
-etc... actually exist when packing a SquashFS image. The only thing that
-really has to exist are the input _files_ that can be placed arbitrarily in
-the file system by specifying input and target locations.
+Unfortunately, the original user space tooling does not support a lot of
+standard use cases, the source code of the tools is in a pretty deteriorated
+state and apparently no longer maintained.
-An SELinux labeling file can be specified to add SELinux tags.
+This package contains the source code of a complete rewrite of the user space
+tools that attempt to address many of the problems of the old tools:
-All directory entries are sorted by name and processed sequentially. All time
-stamps in the SquashFS image are set to a command line specified value (or 0
-by default). Thus the entire process should be deterministic, i.e. same input
-produces byte-for-byte the same output.
+ - Reproducible SquashFS images, i.e. deterministic packing without
+ any local time stamps.
+ - Linux `gen_init_cpio` like file listing for micro managing the
+ file system contents, permissions, and ownership without having to replicate
+ the file system (and especially permissions) locally.
+ - Support for SELinux contexts file (see selabel_file(5)) to generate
+ SELinux labels.
+ - Structured and (hopefully) more readable source code that should be better
+ maintainable in the long run.
-The `rdsquashfs` program can read a SquashFS image and produce file listings,
-extract individual files or entire sub trees of the image to a desired
-location.
+The tools in this package have different names, so they can be installed
+together with the existing tools:
-
- Why not use the official squashfs-tools?
- ****************************************
-
-The mksquashfs utility is semi-broken and generally a PITA to work with.
-
-For the typical use case of SquashFS (i.e. as rootfs for a live distro or an
-embedded system), it should be blindingly obvious that I might want to micro
-manage what goes into the file system, that UIDs/GIDs of the host system are
-garbage inside the image and that setting the desired permissions (e.g. suid)
-or SELinux labels on the input is completely out of the question. Also, it
-would be really cool if the whole thing was reproducible.
-
-All of this seems to have been an afterthought with mksquashfs. Some of it can
-be achieved with exclusion options, "pseudo files" or "filters" (a completely
-undocumented feature that not even `--help` tells you about).
-
-My main gripes with mksquashfs were the following:
-
- - I need to precisely replicate the entire filesystem for packing, even tough
- the only thing actually needed are in theory the regular files.
- - Files in the input FS but not in the pseudo file are still packed
- but with garbage UID/GID from the host system.
- - When I want files that are not owned by root, the root inode will get
- garbage UID/GID from the host system and there is no way to change this.
- - mtime is read from the input file system and there is no way to override it.
- - Data is packed by a thread pool, i.e. in a non-deterministic way.
- - Extended attributes are read from the input file system, i.e. the only way
- to get SELinux labels into the SquashFS filesystem is to set them on the
- input data.
-
-That's at least what I can think of right now from the top of my head.
-
-It would be preferable to fix mksquashfs itself, but the source code is a
-horrid dumpster fire. It turned out to be easier to understand the structure
-of SquashFS by reading the available documentation plus kernel source and
-implementing the desired feature set from scratch.
-
-Furthermore, upstream seems to be unmaintained at the moment and the mailing
-list appears to be about as dead as SourceForge that hosts it.
+ - `gensquashfs` can be used to produce SquashFS images from `gen_init_cpio`
+ like file listings or simply pack an input directory.
+ - `rdsquashfs` can be used to inspect and unpack SquashFS images.
Limitations
@@ -76,8 +41,7 @@ list appears to be about as dead as SourceForge that hosts it.
At the moment, the following things still require some work:
- - documentation
- - testing
+ - more testing
- extended attributes
- currently limited to SELinux labeling only
- rdsquashfs ignores them entirely
@@ -85,6 +49,49 @@ At the moment, the following things still require some work:
storage but this is currently not used yet.
- empty directories cannot have xattrs. The way I understand it, this is a
design flaw in SquashFS. I hope I'm missing something here.
- - sparse files (not implemented yet)
+ - sparse files (not implemented yet; lots of zeros compress well anyway :P)
- hard links (not implemented yet; do we even want this?)
+ - File deduplication (not implemented; do we even need this?)
- NFS export tables (not implemented yet)
+
+
+ Future plans
+ ************
+
+In addition to the above, the following things would be really nice to
+have eventually:
+
+ - A tool for merging multiple images into one
+ - A tool for splitting an image
+ - A diff tool
+ - Diff of the directory tree of two images
+ - Diff of the file meta data in two images
+ - File level diffs
+ - Combinations of the above in a still human readable form
+ - [IN PROGRESS] A *complete* specification of the on-disk format and all the
+ arbitrary checks enforced by the kernel.
+ - Patching kernel and user space to support SquashFS on top of UBI
+ - Patching kernel and user space to support ACLs
+
+
+ Copyright & License
+ *******************
+
+The source code in this package has been written by me, David Oberhollenzer,
+in 2019 and is released under the terms and conditions of the GNU General
+Public License version 3 or later.
+
+To the best of my knowledge, no code has been copied over from the original
+SquashFS tools. The kernel documentation, the kernel headers and this web site
+have been used as main sources for understanding SquashFS:
+
+ https://dr-emann.github.io/squashfs/
+
+Some additional information (such as xattr implementation) has been gathered
+from various mailing lists and other web sources.
+
+Compressor implementations are primarily based on the documentation of the
+compression libraries.
+
+The existing unsquashfs tool and kernel implementation were used for trial and
+error testing during development.
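
To illustrate the `gen_init_cpio` like file listing and the deterministic, sorted-by-name processing described in the README text of this diff, here is a minimal Python sketch. It assumes the line layout of the kernel's `gen_init_cpio` (`dir`, `file`, `slink`); the README only says the `gensquashfs` format is similar, so the actual syntax may differ. `Entry`, `parse_listing` and the sample paths are hypothetical names made up for this example and are not part of the package.

```python
from typing import List, NamedTuple, Optional

# Assumed line layouts (taken from the kernel's gen_init_cpio, not from
# gensquashfs itself):
#   dir   <name> <mode> <uid> <gid>
#   file  <name> <location> <mode> <uid> <gid>
#   slink <name> <target> <mode> <uid> <gid>


class Entry(NamedTuple):
    kind: str
    name: str
    mode: int
    uid: int
    gid: int
    extra: Optional[str]  # input location (file) or link target (slink)


def parse_listing(text: str) -> List[Entry]:
    """Parse a listing and return its entries sorted by name."""
    entries = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        kind = fields[0]
        if kind == "dir" and len(fields) == 5:
            name, mode, uid, gid = fields[1:]
            extra = None
        elif kind in ("file", "slink") and len(fields) == 6:
            name, extra, mode, uid, gid = fields[1:]
        else:
            raise ValueError(f"line {lineno}: cannot parse {raw!r}")
        # Mode is written in octal, UID and GID in decimal.
        entries.append(Entry(kind, name, int(mode, 8), int(uid), int(gid), extra))
    # Sorting by name makes the result independent of the input order.
    return sorted(entries, key=lambda e: e.name)


LISTING = """\
# type name [location|target] mode uid gid
dir /usr 0755 0 0
file /usr/bin/tool build/tool 4755 0 0
dir /usr/bin 0755 0 0
slink /bin usr/bin 0777 0 0
"""

if __name__ == "__main__":
    for entry in parse_listing(LISTING):
        print(entry.kind, entry.name, oct(entry.mode), entry.uid, entry.gid)
```

Permissions and ownership come only from the listing, so nothing has to be replicated on the host file system, and the sort step means two listings with the same lines in a different order yield the same entry sequence. The sketch does not handle names containing whitespace.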