Playing around with ZFS

Using a bunch of USB drives for some testing

Written on 12/15/2012 by Patrick Bregman

Just for fun I constructed a ZFS filesystem out of four 8GByte USB sticks. This was also to get a feel for the system and know what I could expect. And, primarily, I wanted to toy around with the different compression algorithms, because I could not find a good review of them. I had already created the ZFS pool a few days ago, and wanted to work with it this morning. But for some reason all the ZFS commands could say was that they couldn't find any pools. Shit. Nice start of the day...

Because of this I decided to make a new pool, since there was no real data on it anyway. This time I went for a slightly different setup. Instead of a RAID-Z pool of 3 drives with 1 drive as cache, I went for a RAID10 setup. Simply put, I have two sets of two USB drives that mirror each other, and on top of that is a striping set, dividing the data between the mirrors. This almost doubled the write performance: instead of a very low 5MByte/s I'm now doing roughly 8MByte/s. Still, I don't have the feeling that anything is cached in RAM like it should be. But that's for another time.
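For reference, a RAID10-style layout like that is just two mirror vdevs in one pool; ZFS stripes across them automatically. This is a sketch, not my exact commands — the pool name and the /dev/sdX device names are placeholders:

```shell
# Two mirrored pairs in one pool; ZFS stripes writes across the mirrors.
# "usbtank" and the device names are placeholders for the four USB sticks.
zpool create usbtank \
    mirror /dev/sdb /dev/sdc \
    mirror /dev/sdd /dev/sde

# Show the resulting vdev layout
zpool status usbtank
```

This needs root and real (empty!) devices, so don't paste it blindly.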

For now, let's focus on the compression algorithms. ZFS gives us a few options:

  • ZLE - A very simple, very fast compression method. All it does is compress runs of zeroes
  • GZIP - Good old gzip, you probably know it from the gzip command on your Unix/Linux/Mac OS X box, with the compression level set to 6 (gzip -6)
  • GZIP-9 - As above, but with the compression level set to 9 (gzip -9)
  • LZJB - The default for ZFS; not much information is available, only that it derives from LZRW1...
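The post doesn't show how the volumes were set up, but one straightforward way is a separate dataset per algorithm, so every test file can be written to each setting. The pool and dataset names here are placeholders:

```shell
# One dataset per compression setting, all on the same (placeholder) pool.
# The compression property only affects data written after it is set.
zfs create -o compression=zle    usbtank/zle
zfs create -o compression=gzip   usbtank/gzip
zfs create -o compression=gzip-9 usbtank/gzip9
zfs create -o compression=lzjb   usbtank/lzjb
```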

I'm not a compression expert, so I'm not going to talk about how all the different compression systems work. This is primarily because my knowledge of C/C++ is next to nothing, and all that code is written in C.

To test the different compression methods, I tried a few different things. Again, a little list:

  • A lot of bytes in the format 0x00 0xFF 0x00 0xFF, see GitHub for the source
  • Extract the kernel source code 5 times into the ZFS volume
  • Write 512MByte of zero bytes to a file

To begin, I tried writing roughly 128MB of 0x00 0xFF bytes with my zfs_test program, which you can find on GitHub. This finished within a second on most compression settings, except for ZLE. That is expected, as ZLE only compresses runs of zeroes and there are none in this file. The results:
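The real generator is the zfs_test program on GitHub; a rough shell stand-in that builds the same alternating 0x00 0xFF pattern (scaled down to 1 MiB here for speed) would be:

```shell
# Rough stand-in for zfs_test: start from a 2-byte seed of 0x00 0xFF
# (octal escapes \000 \377) and double the file 19 times.
# 2 bytes * 2^19 = 1 MiB; the original test used roughly 128 MB.
printf '\000\377' > pattern.bin
for i in $(seq 1 19); do
    cat pattern.bin pattern.bin > tmp.bin && mv tmp.bin pattern.bin
done
wc -c pattern.bin   # 1048576 bytes
```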

  • ZLE: No compression, filesize is 128MB or a ratio of 1:1 (1 byte on disk represents one byte of the input)
  • GZIP: 244.58x, filesize is 531kB or a ratio of 1:244.58 (1 byte on disk represents 244.58 bytes of the input)
  • GZIP-9: 244.58x, filesize is 531kB or a ratio of 1:244.58 (1 byte on disk represents 244.58 bytes of the input)
  • LZJB: 28.30x, filesize is 4627kB or a ratio of 1:28.30 (1 byte on disk represents 28.30 bytes of the input)
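Numbers like these can be read straight off ZFS, which tracks the achieved ratio per dataset (again, pool and dataset names are placeholders from my setup):

```shell
# Ratio as ZFS reports it, plus apparent vs. actually-used sizes.
zfs get compressratio usbtank/gzip
ls -lh /usbtank/gzip    # apparent file sizes
du -h /usbtank/gzip     # space actually used on disk
```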

Not bad, gzip seems to do pretty well! LZJB is still compressing it quite a lot, but then again it is a very compressible file. Let's move on and see what the different compression algorithms manage to do with the kernel source. For this, I downloaded the 3.6.9 kernel from The Linux Kernel Archives, which was the latest version when I tried this before. Anyway, I downloaded the .tar.xz archive, which weighs in at only 66MByte. When I extracted this to the different compressed ZFS volumes, this is what I got:

  • ZLE: 1.04x compression, filesize is 460MB or a ratio of 1:1.04
  • GZIP: 3.53x compression, filesize is 140MB or a ratio of 1:3.53
  • GZIP-9: 3.54x compression, filesize is 140MB or a ratio of 1:3.54
  • LZJB: 2.00x compression, filesize is 242MB or a ratio of 1:2.00

During the extraction of the archives I couldn't see any process requiring more CPU time than the xz process, so none of the compression methods really used a lot of CPU time on my Intel Xeon E3-1265L v2. They might show some CPU usage on a lower-specced CPU like the Pentium and Celeron CPUs some people like to use for ZFS builds. Moving on to the last test: I deleted all the kernel files and wrote a 512MByte file of zeroes to each volume.

  • ZLE: 1.00x compression, filesize is 512 bytes.
  • GZIP: 1.00x compression, filesize is 512 bytes.
  • GZIP-9: 1.00x compression, filesize is 512 bytes.
  • LZJB: 1.00x compression, filesize is 512 bytes.

Wait, what's this? 1.00x compression of a 512MB file makes a 512 byte file? What magic is this?! Well, to be honest, that's what I wanted to know. I wrote all the files with dd if=/dev/zero of=test.bin bs=8M count=64, which gave me 4 files of 537MB. And this is also not the way you make sparse files, as far as I know. So apparently this is a little quirk in the ZFS code. If I calculate the ratio I get a compression ratio of 1:1048576, or 1:1M if you prefer that.

(UPDATE: Apparently this is normal behavior. ZFS makes files which consist solely of 0x00 bytes (zero bytes) sparse by default. So the data doesn't even hit the compression algorithms in this case. Then it all makes sense again.)
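You can see the same apparent-size versus allocated-size split with any sparse file, even outside ZFS. Here dd seeks 16 MiB into a fresh file without writing a single data block:

```shell
# Create a sparse file: seek 16 MiB into a new file, write zero blocks.
dd if=/dev/zero of=sparse.bin bs=1M seek=16 count=0 2>/dev/null

ls -l sparse.bin    # apparent size: 16777216 bytes
du -k sparse.bin    # allocated size: (nearly) zero blocks
```

The 512 bytes ZFS reported for my "512MB" files is the same effect: metadata only, no data blocks.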

While doing this test I noticed something funny. The write speed between the different compression algorithms varied quite a lot. Normal gzip wrote the slowest of them all at "just" 371MByte/s, while LZJB managed to be the fastest at 1.1GByte/s. gzip-9 did a very welcome 977MByte/s, while ZLE managed only 577MByte/s.
