Damon Wang
2018-05-24 14:44:05 UTC
Hi all,
I'm using lvmlockd + sanlock on iSCSI, and sometimes (usually
intensive operations), it shows vglock is failed:
/var/log/messages:
May 24 21:14:29 dev1 sanlock[1108]: 2018-05-24 21:14:29 605471
[1112]: r627 paxos_release 8255 other lver 8258
May 24 21:14:29 dev1 sanlock[1108]: 2018-05-24 21:14:29 605471
[1112]: r627 release_token release leader -250
May 24 21:14:29 dev1 lvmlockd[1061]: 1527167669 S
lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK unlock_san release error
-250
May 24 21:14:29 dev1 lvmlockd[1061]: 1527167669 S
lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK res_unlock lm error -250
May 24 21:14:29 dev1 lvmlockd[1061]: 1527167669 S
lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK unlock_san release error
-1
May 24 21:14:29 dev1 lvmlockd[1061]: 1527167669 S
lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK res_unlock lm error -1
May 24 21:14:29 dev1 sanlock[1108]: 2018-05-24 21:14:29 605471
[1111]: cmd_release 2,9,1061 no resource VGLK
meanwhile lvmlockctl -d:
1527167668 close lvchange[34042] cl 476 fd 9
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK res_lock
rv 0 read vb 101 0 1761
1527167669 send lvcreate[34073] cl 478 lock vg rv 0
1527167669 recv lvcreate[34073] cl 478 find_free_lock vg
"ff35ecc8217543e0a5be9cbe935ffc84" mode iv flags 0
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 find free lock
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84
find_free_lock_san found unused area at 127926272
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 find free lock 0
offset 127926272
1527167669 send lvcreate[34073] cl 478 find_free_lock vg rv 0
1527167669 recv lvcreate[34073] cl 478 init lv
"ff35ecc8217543e0a5be9cbe935ffc84" mode iv flags 0
1527167669 work init_lv
ff35ecc8217543e0a5be9cbe935ffc84/c0dda9c02a7347a098203fbe2a7df1d0 uuid
Wnj8vm-2Hiy-gdqs-Uh1i-fUiA-W3mj-zdASzK
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 init_lv_san
Wnj8vm-2Hiy-gdqs-Uh1i-fUiA-W3mj-zdASzK found unused area at 127926272
1527167669 send lvcreate[34073] cl 478 init lv rv 0 vg_args
1.0.0:lvmlock lv_args 1.0.0:127926272
1527167669 recv lvcreate[34073] cl 478 update vg
"ff35ecc8217543e0a5be9cbe935ffc84" mode iv flags 0
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK action update iv
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
res_update cl 478 lk version to 1762
1527167669 send lvcreate[34073] cl 478 update vg rv 0
1527167669 recv lvcreate[34073] cl 478 lock lv
"ff35ecc8217543e0a5be9cbe935ffc84" mode un flags 1
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R
Wnj8vm-2Hiy-gdqs-Uh1i-fUiA-W3mj-zdASzK action lock un
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R
Wnj8vm-2Hiy-gdqs-Uh1i-fUiA-W3mj-zdASzK res_unlock cl 478 no locks
1527167669 send lvcreate[34073] cl 478 lock lv rv -2
1527167669 recv lvcreate[34073] cl 478 lock vg
"ff35ecc8217543e0a5be9cbe935ffc84" mode un flags 0
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK action lock un
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK res_unlock cl 478
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
res_unlock r_version new 1762
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
unlock_san ex r_version 1762 flags 0
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
unlock_san set r_version 1762
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
unlock_san release error -250
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
res_unlock lm error -250
1527167669 send lvcreate[34073] cl 478 lock vg rv -250
1527167669 close lvcreate[34073] cl 478 fd 11
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
res_unlock cl 478 from close
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
unlock_san ex r_version 0 flags 0
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
unlock_san release error -1
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
res_unlock lm error -1
1527167670 new cl 479 pi 2 fd 9
and then any lvm related commands will hung:
[***@dev1 ~]# lvmlockctl -i
VG ff35ecc8217543e0a5be9cbe935ffc84 lock_type=sanlock
Nz4PmR-Bwfi-pItr-C3om-Ewnj-SICA-5rFuFg
LS sanlock lvm_ff35ecc8217543e0a5be9cbe935ffc84
LK VG ex ver 1762 pid 0 ()
LW VG sh ver 0 pid 34216 (lvchange)
LW VG sh ver 0 pid 75685 (lvs)
LW VG sh ver 0 pid 83741 (lvdisplay)
LW VG sh ver 0 pid 90569 (lvchange)
LW VG sh ver 0 pid 92735 (lvchange)
LW VG sh ver 0 pid 99982 (lvs)
LW VG sh ver 0 pid 14069 (lvchange)
LK LV sh onuPlt-YXI0-nYtv-CV8Q-yS93-6Zt9-QvlfyH
LK GL sh ver 0 pid 75685 (lvs)
LK GL sh ver 0 pid 99982 (lvs)
LK LV un XOgw6I-3Nuh-ejIP-ydPq-rr7g-LMjb-oklfup
My questions are:
1. why VGLK failed, is it because network failure(cause iSCSI fail and
sanlock could not find VGLK volume), can I find a direct proof?
2. Is it recoverable? I have tried kill all hung commands but new
command still hung forever.
Thanks a lot,
Damon
I'm using lvmlockd + sanlock on iSCSI, and sometimes (usually
intensive operations), it shows vglock is failed:
/var/log/messages:
May 24 21:14:29 dev1 sanlock[1108]: 2018-05-24 21:14:29 605471
[1112]: r627 paxos_release 8255 other lver 8258
May 24 21:14:29 dev1 sanlock[1108]: 2018-05-24 21:14:29 605471
[1112]: r627 release_token release leader -250
May 24 21:14:29 dev1 lvmlockd[1061]: 1527167669 S
lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK unlock_san release error
-250
May 24 21:14:29 dev1 lvmlockd[1061]: 1527167669 S
lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK res_unlock lm error -250
May 24 21:14:29 dev1 lvmlockd[1061]: 1527167669 S
lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK unlock_san release error
-1
May 24 21:14:29 dev1 lvmlockd[1061]: 1527167669 S
lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK res_unlock lm error -1
May 24 21:14:29 dev1 sanlock[1108]: 2018-05-24 21:14:29 605471
[1111]: cmd_release 2,9,1061 no resource VGLK
meanwhile lvmlockctl -d:
1527167668 close lvchange[34042] cl 476 fd 9
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK res_lock
rv 0 read vb 101 0 1761
1527167669 send lvcreate[34073] cl 478 lock vg rv 0
1527167669 recv lvcreate[34073] cl 478 find_free_lock vg
"ff35ecc8217543e0a5be9cbe935ffc84" mode iv flags 0
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 find free lock
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84
find_free_lock_san found unused area at 127926272
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 find free lock 0
offset 127926272
1527167669 send lvcreate[34073] cl 478 find_free_lock vg rv 0
1527167669 recv lvcreate[34073] cl 478 init lv
"ff35ecc8217543e0a5be9cbe935ffc84" mode iv flags 0
1527167669 work init_lv
ff35ecc8217543e0a5be9cbe935ffc84/c0dda9c02a7347a098203fbe2a7df1d0 uuid
Wnj8vm-2Hiy-gdqs-Uh1i-fUiA-W3mj-zdASzK
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 init_lv_san
Wnj8vm-2Hiy-gdqs-Uh1i-fUiA-W3mj-zdASzK found unused area at 127926272
1527167669 send lvcreate[34073] cl 478 init lv rv 0 vg_args
1.0.0:lvmlock lv_args 1.0.0:127926272
1527167669 recv lvcreate[34073] cl 478 update vg
"ff35ecc8217543e0a5be9cbe935ffc84" mode iv flags 0
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK action update iv
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
res_update cl 478 lk version to 1762
1527167669 send lvcreate[34073] cl 478 update vg rv 0
1527167669 recv lvcreate[34073] cl 478 lock lv
"ff35ecc8217543e0a5be9cbe935ffc84" mode un flags 1
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R
Wnj8vm-2Hiy-gdqs-Uh1i-fUiA-W3mj-zdASzK action lock un
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R
Wnj8vm-2Hiy-gdqs-Uh1i-fUiA-W3mj-zdASzK res_unlock cl 478 no locks
1527167669 send lvcreate[34073] cl 478 lock lv rv -2
1527167669 recv lvcreate[34073] cl 478 lock vg
"ff35ecc8217543e0a5be9cbe935ffc84" mode un flags 0
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK action lock un
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK res_unlock cl 478
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
res_unlock r_version new 1762
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
unlock_san ex r_version 1762 flags 0
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
unlock_san set r_version 1762
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
unlock_san release error -250
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
res_unlock lm error -250
1527167669 send lvcreate[34073] cl 478 lock vg rv -250
1527167669 close lvcreate[34073] cl 478 fd 11
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
res_unlock cl 478 from close
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
unlock_san ex r_version 0 flags 0
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
unlock_san release error -1
1527167669 S lvm_ff35ecc8217543e0a5be9cbe935ffc84 R VGLK
res_unlock lm error -1
1527167670 new cl 479 pi 2 fd 9
and then any lvm related commands will hung:
[***@dev1 ~]# lvmlockctl -i
VG ff35ecc8217543e0a5be9cbe935ffc84 lock_type=sanlock
Nz4PmR-Bwfi-pItr-C3om-Ewnj-SICA-5rFuFg
LS sanlock lvm_ff35ecc8217543e0a5be9cbe935ffc84
LK VG ex ver 1762 pid 0 ()
LW VG sh ver 0 pid 34216 (lvchange)
LW VG sh ver 0 pid 75685 (lvs)
LW VG sh ver 0 pid 83741 (lvdisplay)
LW VG sh ver 0 pid 90569 (lvchange)
LW VG sh ver 0 pid 92735 (lvchange)
LW VG sh ver 0 pid 99982 (lvs)
LW VG sh ver 0 pid 14069 (lvchange)
LK LV sh onuPlt-YXI0-nYtv-CV8Q-yS93-6Zt9-QvlfyH
LK GL sh ver 0 pid 75685 (lvs)
LK GL sh ver 0 pid 99982 (lvs)
LK LV un XOgw6I-3Nuh-ejIP-ydPq-rr7g-LMjb-oklfup
My questions are:
1. why VGLK failed, is it because network failure(cause iSCSI fail and
sanlock could not find VGLK volume), can I find a direct proof?
2. Is it recoverable? I have tried kill all hung commands but new
command still hung forever.
Thanks a lot,
Damon