Diffstat (limited to 'doc')
-rw-r--r--  doc/benchmark.ods    bin  53962 -> 53760 bytes
-rw-r--r--  doc/parallelism.txt  58
2 files changed, 29 insertions, 29 deletions
diff --git a/doc/benchmark.ods b/doc/benchmark.ods
index 6ea8871..62ee480 100644
--- a/doc/benchmark.ods
+++ b/doc/benchmark.ods
Binary files differ
diff --git a/doc/parallelism.txt b/doc/parallelism.txt
index 3c3afb1..315a631 100644
--- a/doc/parallelism.txt
+++ b/doc/parallelism.txt
@@ -150,7 +150,7 @@
   - As uncompressed tarball:           ~6.5GiB (7,008,118,272)
   - As LZ4 compressed SquashFS image:  ~3.1GiB (3,381,751,808)
   - As LZO compressed SquashFS image:  ~2.5GiB (2,732,015,616)
-  - As zstd compressed SquashFS image: ~2.4GiB (2,536,910,848)
+  - As zstd compressed SquashFS image: ~2.1GiB (2,295,017,472)
   - As gzip compressed SquashFS image: ~2.3GiB (2,471,276,544)
   - As lzma compressed SquashFS image: ~2.0GiB (2,102,169,600)
   - As XZ compressed SquashFS image:   ~2.0GiB (2,098,466,816)
@@ -164,7 +164,7 @@

   AMD Ryzen 7 3700X
   32GiB DDR4 RAM
-  Fedora 31 with Linux 5.5.17
+  Fedora 31


  2.4) Results
@@ -172,23 +172,23 @@
  The raw timing results are as follows:

  Jobs    XZ          lzma        gzip        LZO         LZ4      zstd
- serial  17m39.613s  16m10.710s   9m56.606s  13m22.337s  12.159s  28.493s
-      1  17m38.050s  15m49.753s   9m46.948s  13m06.705s  11.908s  28.926s
-      2   9m26.712s   8m24.706s   5m08.152s   6m53.872s   7.395s  16.381s
-      3   6m29.733s   5m47.422s   3m33.235s   4m44.407s   6.069s  11.949s
-      4   5m02.993s   4m30.361s   2m43.447s   3m39.825s   5.864s   9.917s
-      5   4m07.959s   3m40.860s   2m13.454s   2m59.395s   5.749s   8.803s
-      6   3m30.514s   3m07.816s   1m53.641s   2m32.461s   5.926s   8.359s
-      7   3m04.009s   2m43.765s   1m39.742s   2m12.536s   6.281s   8.264s
-      8   2m45.050s   2m26.996s   1m28.776s   1m58.253s   6.395s   7.844s
-      9   2m34.993s   2m18.868s   1m21.668s   1m50.461s   6.890s   7.915s
-     10   2m27.399s   2m11.214s   1m15.461s   1m44.060s   7.225s   8.157s
-     11   2m20.068s   2m04.592s   1m10.286s   1m37.749s   7.557s   8.448s
-     12   2m13.131s   1m58.710s   1m05.957s   1m32.596s   8.127s   8.652s
-     13   2m07.472s   1m53.481s   1m02.041s   1m27.982s   8.704s   9.210s
-     14   2m02.365s   1m48.773s   1m00.337s   1m24.444s   9.494s  10.547s
-     15   1m58.298s   1m45.079s     58.348s   1m21.445s  10.192s  11.427s
-     16   1m55.940s   1m42.176s     56.615s   1m19.030s  10.964s  12.889s
+ serial  17m39.613s  16m10.710s   9m56.606s  13m22.337s  12.159s  9m33.600s
+      1  17m38.050s  15m49.753s   9m46.948s  13m06.705s  11.908s  9m23.445s
+      2   9m26.712s   8m24.706s   5m08.152s   6m53.872s   7.395s  5m 1.734s
+      3   6m29.733s   5m47.422s   3m33.235s   4m44.407s   6.069s  3m30.708s
+      4   5m02.993s   4m30.361s   2m43.447s   3m39.825s   5.864s  2m44.418s
+      5   4m07.959s   3m40.860s   2m13.454s   2m59.395s   5.749s  2m16.745s
+      6   3m30.514s   3m07.816s   1m53.641s   2m32.461s   5.926s  1m57.607s
+      7   3m04.009s   2m43.765s   1m39.742s   2m12.536s   6.281s  1m43.734s
+      8   2m45.050s   2m26.996s   1m28.776s   1m58.253s   6.395s  1m34.500s
+      9   2m34.993s   2m18.868s   1m21.668s   1m50.461s   6.890s  1m29.820s
+     10   2m27.399s   2m11.214s   1m15.461s   1m44.060s   7.225s  1m26.176s
+     11   2m20.068s   2m04.592s   1m10.286s   1m37.749s   7.557s  1m22.566s
+     12   2m13.131s   1m58.710s   1m05.957s   1m32.596s   8.127s  1m18.883s
+     13   2m07.472s   1m53.481s   1m02.041s   1m27.982s   8.704s  1m16.218s
+     14   2m02.365s   1m48.773s   1m00.337s   1m24.444s   9.494s  1m14.175s
+     15   1m58.298s   1m45.079s     58.348s   1m21.445s  10.192s  1m12.134s
+     16   1m55.940s   1m42.176s     56.615s   1m19.030s  10.964s  1m11.049s

  The file "benchmark.ods" contains those values, values derived from this and
  charts depicting the results.
@@ -196,15 +196,15 @@

  2.5) Discussion

- Most obviously, the results indicate that LZ4 and zstd compression are clearly
- I/O bound and not CPU bound. They don't benefit from parallelization beyond
- 2-4 worker threads and even that benefit is marginal with efficiency
+ Most obviously, the results indicate that LZ4, unlike the other compressors,
+ is clearly I/O bound and not CPU bound and doesn't benefit from parallelization
+ beyond 2-4 worker threads and even that benefit is marginal with efficiency
  plummetting immediately.


- The other compressors (XZ, lzma, gzip, lzo) are clearly CPU bound. Speedup
- increases linearly until about 8 cores, but with a slope < 1, as evident by
- efficiency linearly decreasing and reaching 80% for 8 cores.
+ The other compressors are clearly CPU bound. Speedup increases linearly until
+ about 8 cores, but with a slope < 1, as evident by efficiency linearly
+ decreasing and reaching 80% for 8 cores.

  A reason for this sub-linear scaling may be the choke point introduced by the
  creation of fragment blocks, that *requires* a synchronization. To test this
@@ -230,10 +230,10 @@
  As a side effect, this benchmark also produces some insights into the
  compression ratio and throughput of the supported compressors. Indicating that
  for the Debian live image, XZ clearly provides the highest data density, while
- LZ4 is clearly the fastest compressor available, directly followed by zstd
- which has a much better compression ratio than LZ4, comparable to the gzip
- compressor, while being almost 50 times faster. The throughput of the zstd
- compressor is truly impressive, considering the compression ratio it achieves.
+ LZ4 is clearly the fastest compressor available.
+
+ The throughput of the zstd compressor is comparable to gzip, while the
+ resulting compression ratio is closer to LZMA.

  Repeating the benchmark without tail-end-packing and with fragments completely
  disabled would also show the effectiveness of tail-end-packing and fragment
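
The speedup and efficiency figures mentioned in the discussion hunk can be recomputed from the raw timings in the table. The following is a minimal sketch, not the method used to produce "benchmark.ods": it assumes speedup is measured against the "serial" row (the spreadsheet may use the 1-job row instead), and the helper names `parse_time`, `speedup` and `efficiency` are hypothetical.

```python
import re


def parse_time(text):
    """Convert a duration such as '2m45.050s', '5m 1.734s' or '12.159s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?\s*(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def speedup(serial, parallel):
    """Speedup relative to the serial run (assumed baseline)."""
    return parse_time(serial) / parse_time(parallel)


def efficiency(serial, parallel, jobs):
    """Speedup divided by the number of worker threads."""
    return speedup(serial, parallel) / jobs


if __name__ == "__main__":
    # Values copied from the "serial" and "8" rows of the timing table.
    rows = {
        "XZ": ("17m39.613s", "2m45.050s"),
        "LZ4": ("12.159s", "6.395s"),
        "zstd (new)": ("9m33.600s", "1m34.500s"),
    }
    for name, (serial, eight) in rows.items():
        s = speedup(serial, eight)
        e = efficiency(serial, eight, 8)
        print(f"{name}: speedup {s:.2f}, efficiency {e:.0%}")
```

With the serial baseline, the XZ row at 8 jobs works out to roughly 80% efficiency, which matches the figure quoted in the discussion text, while LZ4 falls far below that.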
