
ZFS is the best thing since sliced bread. It is both a file system and a volume manager combined into complete awesomeness. It has features that will blow any competition out of the water; here are just a few:

  • 128-bit file system, so it can store up to 256 quadrillion ZB (a ZB is a billion TB).
  • Checksums are stored with the metadata, allowing ZFS to scrub volumes and fix silent data corruption.
  • Copy on Write, meaning that when data is changed it is not overwritten — it is always written to a new block. Think EBS snapshots.
  • Pooled Data Storage: ZFS takes the available storage drives and pools them together as a single resource, allowing efficient use of the available capacity.
  • Compression.
  • Inline block-level deduplication – this one is particularly magnificent (see the property sketch after this list).
  • ZFS send/receive for replicating data between systems.
  • NFS, CIFS and iSCSI sharing of volumes directly out of ZFS.
  • SSD Hybrid Storage Pools, allowing ZFS to use SSDs as an L2ARC (read cache) and a ZIL (write cache), which is what we will do here.
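
Most of these features are just dataset properties or one-liner commands. As a quick taste, here is a hedged sketch (assuming a pool called vol1, like the one we create below; other-host and tank/vol1-copy are placeholders) of turning on compression and deduplication and replicating a snapshot to another box:

$ sudo zfs set compression=lz4 vol1
$ sudo zfs set dedup=on vol1
$ sudo zfs snapshot vol1@backup1
$ sudo zfs send vol1@backup1 | ssh other-host zfs receive tank/vol1-copy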

I could spend the rest of 2014 writing about ZFS and I still wouldn't do it justice. Thankfully there are people much smarter than me who have done that already, so I'll point you in their direction.

Installation and Configuration

Thanks to OpenZFS and ZFS on Linux, setting up ZFS on Ubuntu is really easy:

$ sudo add-apt-repository ppa:zfs-native/stable
$ sudo apt-get update
$ sudo apt-get install ubuntu-zfs
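
Before creating a pool it is worth a quick sanity check that the kernel module built and loaded (not part of the original walkthrough, just a sketch):

$ lsmod | grep zfs
$ sudo zpool status
no pools available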

Next we create a brand new zpool:

$ zpool create vol1 xvdd
$ zpool status vol1
  pool: vol1
 state: ONLINE
  scan: none requested
config:

  NAME        STATE     READ WRITE CKSUM
  vol1        ONLINE       0     0     0
    xvdd      ONLINE       0     0     0

errors: No known data errors
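
Creating the pool also creates and mounts a dataset of the same name, which is the /vol1 directory we will point fio at later. A quick way to confirm the mountpoint (a sketch; the sizes reported will obviously differ):

$ zfs list vol1
$ zfs get mountpoint vol1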

Next we add the ephemeral SSD on /dev/xvdb as a ZIL (ZFS Intent Log), which acts as our write cache:

$ zpool add vol1 log xvdb
$ zpool status vol1
  pool: vol1
 state: ONLINE
  scan: none requested
config:

  NAME        STATE     READ WRITE CKSUM
  vol1        ONLINE       0     0     0
    xvdd      ONLINE       0     0     0
  logs
    xvdb      ONLINE       0     0     0

errors: No known data errors
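
If you want to confirm the log device is actually being used, zpool iostat can break activity out per vdev (a monitoring sketch, not part of the original post). Keep in mind that only synchronous writes go through the ZIL, so a purely asynchronous workload won't show much on the log device:

$ zpool iostat -v vol1 5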

And now we finally add the second ephemeral SSD, /dev/xvdc, as the L2ARC, which is basically the read cache:

$ zpool add vol1 cache xvdc
$ zpool status vol1
  pool: vol1
 state: ONLINE
  scan: none requested
config:

  NAME        STATE     READ WRITE CKSUM
  vol1        ONLINE       0     0     0
    xvdd      ONLINE       0     0     0
  logs
    xvdb      ONLINE       0     0     0
  cache
    xvdc      ONLINE       0     0     0

errors: No known data errors
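
ZFS on Linux exposes the ARC and L2ARC counters under /proc, so you can watch the read cache warm up as tests run (a hedged sketch; counter names can vary between releases):

$ grep -E '^l2_(hits|misses|size)' /proc/spl/kstat/zfs/arcstats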

Test Results

Now we are ready to test our new setup using the directory /vol1 that ZFS created and mounted for us automatically. /vol1 is backed by both ephemeral SSDs, one caching reads and the other caching writes.

Please note that ZFS doesn't support non-buffered I/O (O_DIRECT), so we had to alter the fio test command with --direct=0 to work around that:

$ fio --filename=/vol1/test --size=1G --name=test --direct=0 --rw=randwrite --refill_buffers --ioengine=libaio --bs=16k --iodepth=32 --numjobs=32 --time_based --runtime=10 --group_reporting

By doing this, the results will be skewed, since all the tests will be buffered.
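
For completeness, a random-read variant of the same command, run against the file written above, is the one that would lean on the L2ARC (a hedged sketch along the same lines, not one of the original test runs; /vol1/test is just the file left behind by the write test):

$ fio --filename=/vol1/test --size=1G --name=test --direct=0 --rw=randread --ioengine=libaio --bs=16k --iodepth=32 --numjobs=32 --time_based --runtime=10 --group_reporting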

Analysis

  • The results are difficult to compare to previous tests due to ZFS not supporting non-buffered I/O.
  • It is very clear, however, that the addition of the L2ARC and ZIL has improved performance significantly.
  • When using a ZIL, as with all write caching techniques, there is a risk of losing in-transit data that has not yet been committed to EBS (see the sync-property sketch below).
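
How exposed you are depends on how much of the workload is synchronous. The sync dataset property is the main knob here: sync=always forces every write through the intent log, while sync=disabled skips it entirely, trading durability for throughput (a hedged sketch; the tests above left it at the default):

$ zfs get sync vol1
$ sudo zfs set sync=always vol1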
