Mark Syms
2015-01-27 18:15:04 UTC
Hi,
Is it possible to check the alignment of the various pieces involved in
an LVM2-on-mdadm RAID5 system to ensure that everything is aligned to 4k
boundaries?
I have a system with four 2TB drives (which have a 4k native block size
and use GPT) carrying RAID1 and RAID5 mdadm arrays (RAID1 for /boot,
everything else on the RAID5). The RAID5 array is used as an LVM2 PV
which then has multiple LVs on it. The partitions for the arrays were
created with parted, are aligned to 4k, and report as such with
"align-check optimal".
I get some confusing performance results if I use ioping -WWWs to test
write speed to a test volume (shown at the bottom). Periodically things
block for 1-2 seconds, which makes the performance quite unpredictable.
I've tried tweaking the dirty ratios and used a modified form of the
second script from this forum post
(http://ubuntuforums.org/showthread.php?t=1494846) to set the raid
stripe cache size and read ahead but without much success.
Running "pvs -o +pe_start" gives

  PV         VG    Fmt   Attr  PSize  PFree  1st PE
  /dev/md1   vol1  lvm2  a--   5.44t  4.08t  1.50m

which, if I'm reading it right (the man page isn't much help here), says
that the first physical extent starts 1.5m into the PV and so should be
on a 4k boundary?
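
As a sanity check on that reading, the arithmetic can be sketched in a few lines of Python. This is only an illustration under assumptions: it takes the "1.50m" shown by pvs as exactly 1.5 MiB (pvs rounds its display; "pvs -o +pe_start --units b" should give the exact byte value), and it assumes a 512 KiB md chunk size across 3 data disks, which is not stated above and would need confirming with "mdadm --detail /dev/md1".

```python
KIB = 1024
MIB = 1024 * KIB


def is_aligned(offset_bytes: int, boundary_bytes: int) -> bool:
    """Return True if offset_bytes is an exact multiple of boundary_bytes."""
    return offset_bytes % boundary_bytes == 0


# "1st PE" from pvs, assumed to be exactly 1.5 MiB (the display is rounded).
pe_start = 3 * MIB // 2

# Physical block size of the drives.
block_4k = 4 * KIB

# ASSUMPTION: 512 KiB chunk, 4-drive RAID5 => 3 data disks per stripe.
chunk_size = 512 * KIB
data_disks = 3
full_stripe = chunk_size * data_disks

print("pe_start bytes:          ", pe_start)
print("aligned to 4k block:     ", is_aligned(pe_start, block_4k))
print("aligned to full stripe:  ", is_aligned(pe_start, full_stripe))
```

If the chunk size assumption holds, 1.5 MiB is both a multiple of 4k and exactly one full stripe, so the start of the LVM data area would not be the misaligned layer.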
Any suggestions for further things to try would be much appreciated,
please reply to me directly as I'm not subscribed to the list.
Thanks,
Mark.
-----------------------------------
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=6 time=159.7 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=7 time=146.9 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=8 time=144.1 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=9 time=144.6 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=10 time=144.6 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=11 time=147.8 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=12 time=161.7 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=13 time=1.9 s
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=14 time=175.8 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=15 time=163.0 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=16 time=140.4 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=17 time=182.7 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=18 time=155.3 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=19 time=173.8 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=20 time=2.2 s
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=21 time=158.5 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=22 time=144.5 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=23 time=166.3 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=24 time=147.1 ms
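
To make the stalls easier to spot, output like the above can be filtered with a short script. This is a rough sketch, not a polished tool: the regular expression only assumes the "request=N time=X unit" form seen in the lines above, the 1000 ms threshold is an arbitrary choice, and the names parse_ioping and find_stalls are made up for this example.

```python
import re

# A few lines copied from the ioping output above.
SAMPLE = """\
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=12 time=161.7 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=13 time=1.9 s
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=14 time=175.8 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=15 time=163.0 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=16 time=140.4 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=17 time=182.7 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=18 time=155.3 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=19 time=173.8 ms
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=20 time=2.2 s
10.0 MiB from /dev/vol1/test (device 100.0 GiB): request=21 time=158.5 ms
"""

# Matches e.g. "request=13 time=1.9 s" and captures number, value and unit.
LINE_RE = re.compile(r"request=(\d+) time=([\d.]+) (us|ms|s)\b")
TO_MS = {"us": 0.001, "ms": 1.0, "s": 1000.0}


def parse_ioping(text):
    """Return a list of (request_number, latency_in_ms) tuples."""
    results = []
    for match in LINE_RE.finditer(text):
        request, value, unit = match.groups()
        results.append((int(request), float(value) * TO_MS[unit]))
    return results


def find_stalls(samples, threshold_ms=1000.0):
    """Return the samples whose latency exceeds threshold_ms."""
    return [(req, ms) for req, ms in samples if ms > threshold_ms]


samples = parse_ioping(SAMPLE)
stalls = find_stalls(samples)
for req, ms in stalls:
    print(f"request {req}: {ms:.0f} ms")

# Gap between consecutive stalls, counted in requests.
gaps = [b[0] - a[0] for a, b in zip(stalls, stalls[1:])]
print("requests between stalls:", gaps)
```

On the excerpt above this picks out requests 13 and 20, seven requests apart. A longer run would show whether that spacing is regular, which may help tell a periodic flush apart from an alignment problem.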