Discussion: [linux-lvm] what is the IOPS behavior when partitions of a single disk (raid5 backend) are used in an LVM?
Sherpa Sherpa
2018-10-11 03:08:07 UTC
I have LVM (backed by hardware RAID5) with a logical volume and a volume group
named "dbstore-lv" and "dbstore-vg", which use sdb1, sdb2 and sdb3 created from
the same sdb disk. The system has 42 cores and about 128G of memory. Although I
don't see CPU spikes in htop, the load average from uptime is ~43+, vmstat shows
a constant iowait of 20-40, context switches are constantly around
80,000-150,000 (and even more at peak hours), and CPU idle time hovers around
70-85. Below is output of iostat -xp 1 where the %util is constantly 100%:

avg-cpu: %user %nice %system %iowait %steal %idle
8.91 0.00 1.31 10.98 0.00 78.80

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00   264.00    0.00   58.00     0.00  1428.00    49.24     0.02    0.28    0.00    0.28   0.21   1.20
sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda3              0.00   264.00    0.00   58.00     0.00  1428.00    49.24     0.02    0.28    0.00    0.28   0.21   1.20
sdb               0.00   316.00    4.00   86.00   512.00  1608.00    47.11    36.02    0.27    5.00    0.05  11.11 100.00
sdb1              0.00   312.00    4.00   63.00  3512.00  4500.00    60.06    34.02  100.00    5.00    0.00  14.93 100.00
sdb2              0.00     0.00    0.00  821.00   450.00    84.00     8.00    82.00   99.19    0.00    0.19  47.62 100.00
sdb3              0.00     4.00    0.00    2.00     0.00    24.00    24.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    6.00     0.00    24.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    4.00  396.00   512.00  1584.00    10.48    36.02 8180.00    5.00 8180.00   2.50 100.00
dm-2              0.00     0.00    0.00  329.00     0.00  3896.00    23.68     0.85    2.58    0.00    2.58   0.05   1.60
dm-3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

Similarly, the TPS/IOPS is around 600-1000 most of the time (e.g. iostat
output below):

avg-cpu: %user %nice %system %iowait %steal %idle
22.24 0.35 2.56 32.08 0.00 42.77

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 527.00 3828.00 1536.00 3828 1536
sdb 576.00 8532.00 2804.00 8532 2804
sdc 42.00 280.00 156.00 280 156
dm-0 0.00 0.00 0.00 0 0
dm-1 956.00 8400.00 2804.00 8400 2804
dm-2 569.00 4108.00 1692.00 4108 1692
dm-3 0.00 0.00 0.00 0 0

Here is vmstat 1 output

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
 2 22 986520 621704 440356 23588860  0  0  2560   8140 27032 132198 16 2 46 36 0
 7 23 986520 672528 440204 23532752  0  0  2360      8 26659 107002 10 2 48 41 0
22 18 986520 697048 440084 23496096  0  0  3152  22520 60223 187651 25 5 46 25 0
 2 18 986520 688596 439984 23501104  0  0  2436    684 50451 210261 20 5 49 26 0
13 33 986520 663680 439984 23495812  0  0  1712 149956 38549 136294 15 4 45 36 0
 9 34 986520 647308 439968 23507944  0  0  1484   1832 51501 174355 19 4 38 39 0
14 18 986520 608364 439340 23531976 12  0  1828  21344 63692 134934 15 4 48 33 0
11 23 986520 588220 437636 23549852  0  0  2528    192 33461 116199 13 3 50 35 0
 3 17 986520 601892 438080 23542508  0  0  3224  16376 74679 167580 20 5 40 34 0
 1 16 986520 567092 438080 23574776  0  0  2272  76624 40944 136229 16 4 51 29 0
 6 16 986520 584120 438380 23560932  0  0 18568      0 32038 108119 12 3 56 29 0
17 17 986520 568012 438392 23575828  0  0  2572  67248 54320 168767 19 4 51 26 0
 5 23 986520 566384 438124 23575640  0  0  2656    360 60057 158031 18 5 49 28 0
 1 29 986520 632216 438604 23546316  0  0  2520  28528 49198 109391 10 4 50 37 0
19 14 986508 621236 438616 23560516  0  0  2528   9368 39632 169120 19 4 44 32 0
 8 31 986532 653172 440340 23548788 32  0  2460    208 29679 116036 14 4 42 40 0
28 26 986532 675568 440344 23551600  0  0  4552   3928 29385 113816 16 3 39 42 0
10 34 986532 654700 440352 23561616  0  0  2712    816 31667 155532 20 3 40 37 0
15 20 986520 630768 440356 23577388 32  0  4416   4348 35499 175319 30 3 35 32 0

Below is an excerpt of lsblk which shows the LVM volumes associated with the disks:

sdb 8:16 0 19.7T 0 disk
├─sdb1 8:17 0 7.7T 0 part
│ └─dbstore-lv (dm-1) 252:1 0 9.4T 0 lvm /var/db/st01
├─sdb2 8:18 0 1.7T 0 part
│ └─dbstore-lv (dm-1) 252:1 0 9.4T 0 lvm /var/db/st01
└─sdb3 8:19 0 10.3T 0 part
  └─archive--archivedbstore--lv (dm-0) 252:0 0 10.3T 0 lvm /opt/archive/

Queue Depth for sdb

cat /sys/block/sdb/device/queue_depth
1020

I am assuming this is due to a disk-seek problem, because partitions of the same
disk are used in the same LVM volume group, or maybe it's due to saturation of
the disks (I don't have the vendor-provided IOPS data for this disk yet). As
initial tuning I have set vm.dirty_ratio to 5 and dirty_background_ratio to 2,
and tried the deadline scheduler (currently noop), but this doesn't seem to help
reduce the iowait. Any suggestions, please?
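
One way to check whether dbstore-lv really spans (or stripes across) both sdb1
and sdb2 is to dump the LV segment layout; a minimal, read-only sketch using the
standard LVM reporting commands:

# list the PVs in each VG and how much of each is allocated
pvs -o pv_name,vg_name,pv_size,pv_used
# list each LV segment, its type (linear/striped) and the PVs backing it
lvs --segments -o lv_name,vg_name,segtype,stripes,devices

If dbstore-lv's segments reference both /dev/sdb1 and /dev/sdb2, I/O that
touches both segments has to seek back and forth within the same RAID5 set,
which would be consistent with the constant 100% %util on sdb.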


Warm Regards
Urgen Sherpa
David Teigland
2018-10-11 14:25:50 UTC
Post by Sherpa Sherpa
I have LVM (backed by hardware RAID5) with a logical volume and a volume group
named "dbstore-lv" and "dbstore-vg", which use sdb1, sdb2 and sdb3 created from
the same sdb disk.
sdb 8:16 0 19.7T 0 disk
├─sdb1 8:17 0 7.7T 0 part
│ └─dbstore-lv (dm-1) 252:1 0 9.4T 0 lvm /var/db/st01
├─sdb2 8:18 0 1.7T 0 part
│ └─dbstore-lv (dm-1) 252:1 0 9.4T 0 lvm /var/db/st01
└─sdb3 8:19 0 10.3T 0 part
└─archive--archivedbstore--lv (dm-0) 252:0 0 10.3T 0 lvm
I am assuming this is due to a disk-seek problem as partitions of the same disk
are used in the same LVM, or maybe it's due to saturation of the disks
You shouldn't add different partitions of the same disk as different PVs. If
it's too late to fix, it might help to create a new LV that uses only one of the
partitions, e.g. lvcreate -n lv -L size vg /dev/sdb2, and then copy your
current LV to the new one.
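
A rough sketch of that approach (LV name, size, filesystem and mount point below
are placeholders, and it assumes the data actually fits on the single partition
you pick):

# allocate the new LV from one PV only, here /dev/sdb2
lvcreate -n dbstore-new-lv -L 1.5T dbstore-vg /dev/sdb2
mkfs.xfs /dev/dbstore-vg/dbstore-new-lv
mkdir -p /mnt/dbstore-new
mount /dev/dbstore-vg/dbstore-new-lv /mnt/dbstore-new
# copy the data, then remount the new LV at /var/db/st01 and retire the old LV
rsync -aHAX /var/db/st01/ /mnt/dbstore-new/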
Emmanuel Gelati
2018-10-11 14:31:26 UTC
If you use sdb only for data, you don't need to use partitions on the disk at
all.
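
For a fresh disk dedicated to LVM, that would look roughly like this (a minimal
sketch; destructive on sdb, so only for an empty disk):

pvcreate /dev/sdb
vgcreate dbstore-vg /dev/sdb
lvcreate -n dbstore-lv -l 100%FREE dbstore-vg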
Heinz Mauelshagen
2018-10-12 12:02:22 UTC
Post by Emmanuel Gelati
If you use sdb only for data, you don't need to use partitions on the disk at
all.
Though that's true, keeping one partition per disk for each LVM PV adds
'visibility': tools like fdisk/[cs]fdisk, parted etc. will show the
partition type as 'Linux LVM'.

Using the whole disk, blkid or lsblk will still provide that information,
e.g. 'blkid --match-token TYPE=LVM2_member'.
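
For example (read-only commands, device name is just an example):

# a whole-disk partition of type 'Linux LVM' shows up directly in fdisk
fdisk -l /dev/sdb
# a whole-disk PV is still identifiable from its LVM2 signature
blkid --match-token TYPE=LVM2_member
lsblk -f /dev/sdb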

Heinz
Heinz Mauelshagen
2018-10-15 14:48:31 UTC
Post by Sherpa Sherpa
Thank you for the reply. I don't mind if fstab sees partitions. I read this on
tldp.org about striping performance problems: "LVM can't tell that two PVs are
on the same physical disk, so if you create a striped LV then the stripes could
be on different partitions on the same disk resulting in a *decrease* in
performance rather than an increase." But does this apply to disks made from a
RAID backend?
If you use partitioning, only create one partition per backing device
and use it as a PV.
This avoids striping across multiple PVs on the same backing device.

The same configuration flaw (i.e. using multiple partitions on the same backing
device as PVs and thus potentially striping across them) applies to any backing
store that allows partitioning, so don't do it on SW/HW RAID either.
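
A minimal sketch of the one-partition-per-backing-device layout (assumes a blank
disk with a GPT label; adjust device and VG names to your environment):

parted /dev/sdb mklabel gpt
parted /dev/sdb mkpart primary 1MiB 100%
parted /dev/sdb set 1 lvm on
pvcreate /dev/sdb1
vgcreate dbstore-vg /dev/sdb1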

Heinz