Discussion:
[linux-lvm] LVM crashed, no superblock on dozens of LVs even after vgcfgrestore
Christophe
2015-03-27 11:00:55 UTC
Hi,

I use LVM on Debian Wheezy, kernel 3.2.0-4-amd64, with several dozen LVs.

Every attempt to mount any of these LVs fails with: can't read superblock

I can't imagine that all of these LVs got corrupted at the same time.

There must be a common cause, but which one? FYI, I'm using LVM-on-DRBD-on-LVM.
DRBD seems to be fine.

pvs
PV                 VG         Fmt  Attr PSize   PFree
/dev/raidssd/drbd0 miroirtest lvm2 a--  124,99g   9,99g
/dev/raidssd/drbd1 miroir1    lvm2 a--  899,97g       0
/dev/raidssd/drbd2 miroir2    lvm2 a--  899,97g       0
/dev/raidssd/drbd3 miroir1    lvm2 a--  899,97g       0
/dev/raidssd/drbd4 miroir2    lvm2 a--  899,97g   9,99g
/dev/raidssd/drbd5 miroir1    lvm2 a--  929,97g       0
/dev/raidssd/drbd6 miroir2    lvm2 a--  929,97g   1,70g
/dev/raidssd/drbd7 miroir1    lvm2 a--  929,97g  17,88g
/dev/raidssd/drbd8 miroir2    lvm2 a--  935,96g 157,64g
/dev/sda           raidssd    lvm2 a--    3,64t       0
/dev/sdb           raidssd    lvm2 a--    3,64t       0


cat /proc/drbd
version: 8.3.11 (api:88/proto:86-96)
srcversion: 41C52C8CD882E47FB5AF767
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate A r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate A r-----
    ns:0 nr:10990235 dw:10990235 dr:0 al:0 bm:178 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
 2: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate A r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
 3: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate A r-----
    ns:0 nr:21387 dw:21387 dr:0 al:0 bm:6 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
 4: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate A r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
 5: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate A r-----
    ns:0 nr:2648 dw:2648 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
 6: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate A r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
 7: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate A r-----
    ns:0 nr:59829829 dw:59829829 dr:0 al:0 bm:93 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
 8: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate A r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0

There were no related errors in the system logs.

Any idea about that?

Regards,
--
Christophe
Christian Schröder
2015-03-27 11:45:05 UTC
Hi,
None of your DRBD devices seems to be in primary mode, so you probably cannot access them at all. (You could try reading the first block of one of your DRBD devices, e.g. "dd if=/dev/raidssd/drbd0 bs=512 count=1", which should also fail.)
You should be able to promote the resources using "drbdadm primary xxx", with "xxx" being the name of the resource. (I think you have to promote them one by one.) See http://drbd.linbit.com/users-guide-8.4/s-switch-resource-roles.html for an explanation.
Normally, devices should be promoted when DRBD is started. Check if "become-primary-on" is set in your resource configuration (e.g. /etc/drbd.d/foo.res). If you use some cluster management software, it should handle promotion and demotion of DRBD devices.
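For instance, for a single resource (the name "r0" is just a placeholder; "drbdadm sh-resources" lists your real resource names):

# drbdadm role r0         <- shows local/peer role, e.g. Secondary/Secondary
# drbdadm primary r0      <- promote this node
# drbdadm role r0         <- should now show Primary/...

And a rough sketch of the startup option, with an invented hostname (adjust to your .res files):

resource r0 {
    startup {
        become-primary-on myhostname;   # or "both" for dual-primary setups
    }
    ...
}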

Kind regards,
Christian

------------------------------------------------------------
Deriva GmbH Financial IT and Consulting
Christian Schröder
Geschäftsführer
Hans-Böckler-Straße 2 | D-37079 Göttingen
Tel: +49 (0)551 489 500-42
Fax: +49 (0)551 489 500-91
http://www.deriva.de

Amtsgericht Göttingen | HRB 3240
Geschäftsführer: Christian Schröder

Christophe
2015-03-27 12:36:29 UTC
Hi Christian,
Post by Christian Schröder
Hi,
None of your DRBD devices seems to be in primary mode, so you probably cannot access them at all. (You could try reading the first block of one of your DRBD devices, e.g. "dd if=/dev/raidssd/drbd0 bs=512 count=1", which should also fail.)
You should be able to promote the resources using "drbdadm primary xxx", with "xxx" being the name of the resource. (I think you have to promote them one by one.) See http://drbd.linbit.com/users-guide-8.4/s-switch-resource-roles.html for an explanation.
Normally, devices should be promoted when DRBD is started. Check if "become-primary-on" is set in your resource configuration (e.g. /etc/drbd.d/foo.res). If you use some cluster management software, it should handle promotion and demotion of DRBD devices.
You're right, how did I miss that?!

I owe you a beer (or two packs) the next time you come to Paris ;-)

Thanks a lot!
--
Christophe
Christophe
2015-03-27 13:30:02 UTC
Hi again,

I still have one last LV saying "can't read superblock".
lvs says:
"nfspostgres miroir2 -wi-d--- 500,00g"

The 'd' attribute means 'device present without tables'.

and lvdisplay:
  --- Logical volume ---
  LV Path                /dev/miroir2/nfspostgres
  LV Name                nfspostgres
  VG Name                miroir2
  LV UUID                wPB72R-ZGLS-x1ww-0lgT-D6bz-ZA6S-h6PrOe
  LV Write Access        read/write
  LV Creation host, time host6filer2, 2014-06-02 00:40:01 +0200
  LV Status              available
  # open                 0
  LV Size                500,00 GiB
  Current LE             128000
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:35


I restored yesterday's LVM metadata, which already included this LV,
and all the other LVs came back without problems.
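(For reference, the restore was along these lines; the archive file name below is invented, "vgcfgrestore -l miroir2" lists the real ones:)

# vgcfgrestore -l miroir2                  <- list archived metadata versions
# vgcfgrestore -f /etc/lvm/archive/miroir2_00042-1234567890.vg miroir2
# vgchange -ay miroir2                     <- reactivate the restored LVs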

With ext4 I could search for a backup superblock; I didn't find the
equivalent command for XFS. Maybe it exists?

Any idea?

Regards,
--
Christophe
Jack Waterworth
2015-03-30 13:33:39 UTC
Does the device actually exist here? I usually see these types of errors
when there is something wrong with the PV. More information should be
available via dmesg during activation.

# lvchange -ay vgname/lvname
# dmesg
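If the device-mapper table is simply missing (the 'd' in your lvs attributes), a refresh may be enough to reload it. A guess, not a confirmed fix, using the LV from your earlier mail:

# lvchange --refresh miroir2/nfspostgres   <- reload the device-mapper table
# dmsetup table miroir2-nfspostgres        <- verify a table is now loaded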

As for the XFS filesystem, I'd recommend reaching out to the XFS
community if you believe your LVM stack is intact. Even if it's not,
they may be able to provide some insight.
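For what it's worth, XFS keeps secondary superblocks and xfs_repair will search for one on its own if the primary is unreadable; a no-modify dry run is safe to try first (LV path taken from your earlier mail):

# xfs_repair -n /dev/miroir2/nfspostgres   <- -n: report problems, change nothing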

Jack Waterworth, Red Hat Certified Architect
OpenStack Technical Support North America
Red Hat Global Support Services ( 1.888.467.3342 )