Introduction

In this article, we will be looking at how to setup LVM caching using a fast collection of SSDs in front of a slower, larger HDD backed storage within a standard Proxmox environment. We will also look at the performance benefits of this setup.

LVM Volumes

Using this forum post, I was able to create the LVM caching layer using the default LVM volumes created from a Proxmox installation.

The following are the steps required to add the LVM cache to the data volume:

pvcreate /dev/sdb
vgextend pve /dev/sdb

lvcreate -L 360G -n CacheDataLV pve /dev/sdb
lvcreate -L 5G -n CacheMetaLV pve /dev/sdb

lvconvert --type cache-pool --poolmetadata pve/CacheMetaLV pve/CacheDataLV

lvconvert --type cache --cachepool pve/CacheDataLV --cachemode writeback pve/data

/dev/sdb is the block device which contains the SSD cache. Two volumes are created on the block device, one for the data and one for the metadata. Additionally, we use the writeback cache mode which will offer better performance. But if the cache device fails, data may be lost as it is not guaranteed that the dirty data has been written back to the main data storage. We then modify the default Proxmox pve/data volume to use our new cache volume.

One piece of information to be aware of is that the size of the pve/data volume is not extended. The SSDs are merely used as a read and write cache for the media behind it.

Finally, this is a non-destructive operation that can be performed on a live system. All data already on the pve/data volume will be preserved after completion.

Performance Benefits

To see first hand the benefits of the LVM cache, we will be using the FIO utility to gather metrics for IOPS and bandwidth. I ran FIO using the Ubuntu 18.04 LXC container before and after using the following hardware:

  • IBM X3550 M4
  • 2 x Intel Xeon E5-2680
  • 128GB Memory
  • ServeRAID M5110 (1GB Memory Cache)
  • 4 x IBM SystemX 300GB 10k SAS HDDs (RAID 10)
  • 4 x HGST Ultrastar SSD400S.B 200GB SAS SSDs (RAID 10)

Testing Methodology

The following is the FIO profile used.

[global]
rw=<MODE>
ioengine=libaio
iodepth=64
size=2g
direct=1
buffered=0
startdelay=5
ramp_time=5
runtime=20
time_based
clat_percentiles=0
disable_lat=1
disable_clat=1
disable_slat=1
filename=fiofile
directory=/

[test]
name=test
bs=32k
stonewall

This profile is time based and is using a constant IO depth of 64, a blocksize of 32k, and Direct IO option across all runs. A file size of 2GB is used to minimize the caching effects of the RAID card on the performance results.

Results

Configuration Random Read Random Write Sequential Read Sequential Write
HDD IOPS=3289 BW=103MiB/s IOPS=1025 BW=32.1MiB/s IOPS=12.5k BW=390MiB/s IOPS=7387 BW=231MiB/s
HDD+SSD IOPS=43.3k BW=1353MiB/s IOPS=28.4k BW=887MiB/s IOPS=50.4k BW=1575MiB/s IOPS=29.7k BW=929MiB/s

As can be seen above, the performance differences are massive for both IOPS and available bandwidth of the underlying storage layer. Random read and write performance is increased by over 20 times and sequential read and write is increased by nearly 4 times compared with the traditional HDD setup.

Conclusion

Setting up a LVM cache for your Proxmox nodes produces astonishing results for localized storage performance. This can be done very easily on an established live system with zero down time. If you have the available hardware, and you are using the default LVM volumes, I would recommend trying out this configuration. You will not be disappointed!