Discussion: [linux-lvm] Can't mount LVM RAID5 drives
Ryan Davis
2014-04-04 21:32:40 UTC
Permalink
Hi,

I have 3 drives in a RAID 5 configuration used as an LVM volume. These disks
contain /home.

After performing a shutdown and moving the computer, I can't get the drives
to mount automatically.

This is all new to me, so I am not sure if this is an LVM issue, but any help
is appreciated. lvs shows I have a mapped device present without tables.

When I try to mount the volume on /home, this happens:

[***@hobbes ~]# mount -t ext4 /dev/vg_data/lv_home /home
mount: wrong fs type, bad option, bad superblock on /dev/vg_data/lv_home,
missing codepage or other error
(could this be the IDE device where you in fact use
ide-scsi so that sr0 or sda or so is needed?)
In some cases useful info is found in syslog - try
dmesg | tail or so

[***@hobbes ~]# dmesg | tail
EXT4-fs (dm-0): unable to read superblock

[***@hobbes ~]# fsck.ext4 -v /dev/sdc1
e4fsck 1.41.12 (17-May-2010)
fsck.ext4: Superblock invalid, trying backup blocks...
fsck.ext4: Bad magic number in super-block while trying to open /dev/sdc1

The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e4fsck with an alternate superblock:
    e4fsck -b 8193 <device>

[***@hobbes ~]# mke2fs -n /dev/sdc1
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
488292352 inodes, 976555199 blocks
48827759 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
29803 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
    102400000, 214990848, 512000000, 550731776, 644972544

Is the superblock issue causing the LVM issues?

Thanks for any input you might have.

Here are some useful outputs about the system.

Here are some of the packages installed:

#rpm -qa | egrep -i '(kernel|lvm2|device-mapper)'
device-mapper-1.02.67-2.el5
kernel-devel-2.6.18-348.18.1.el5
device-mapper-event-1.02.67-2.el5
kernel-headers-2.6.18-371.6.1.el5
lvm2-2.02.88-12.el5
device-mapper-1.02.67-2.el5
kernel-devel-2.6.18-371.3.1.el5
device-mapper-multipath-0.4.7-59.el5
kernel-2.6.18-371.6.1.el5
kernel-devel-2.6.18-371.6.1.el5
kernel-2.6.18-371.3.1.el5
kernel-2.6.18-348.18.1.el5
lvm2-cluster-2.02.88-9.el5_10.2

#uname -a
Linux hobbes 2.6.18-371.6.1.el5 #1 SMP Wed Mar 12 20:03:51 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

LVM info:

#vgs
  VG      #PV #LV #SN Attr   VSize VFree
  vg_data   1   1   0 wz--n- 3.64T    0

#lvs
  LV      VG      Attr   LSize Origin Snap% Move Log Copy% Convert
  lv_home vg_data -wi-d- 3.64T

Looks like I have a mapped device present without tables (the 'd' attribute).

#pvs
  PV        VG      Fmt  Attr PSize PFree
  /dev/sdc1 vg_data lvm2 a--  3.64T    0

#ls /dev/vg_data
lv_home

#vgscan --mknodes
  Reading all physical volumes. This may take a while...
  Found volume group "vg_data" using metadata type lvm2

#pvscan
  PV /dev/sdc1   VG vg_data   lvm2 [3.64 TB / 0    free]
  Total: 1 [3.64 TB] / in use: 1 [3.64 TB] / in no VG: 0 [0   ]

#vgchange -ay
  1 logical volume(s) in volume group "vg_data" now active
  device-mapper: ioctl: error adding target to table

#dmesg |tail
device-mapper: table: device 8:33 too small for target
device-mapper: table: 253:0: linear: dm-linear: Device lookup failed
device-mapper: ioctl: error adding target to table

#vgdisplay -v
  --- Volume group ---
  VG Name               vg_data
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               3.64 TB
  PE Size               4.00 MB
  Total PE              953668
  Alloc PE / Size       953668 / 3.64 TB
  Free  PE / Size       0 / 0
  VG UUID               b2w9mR-hvSc-Rm0k-3yHL-iEgc-6nMq-uq69E1

  --- Logical volume ---
  LV Name               /dev/vg_data/lv_home
  VG Name               vg_data
  LV UUID               13TmTm-YqIo-6xIp-1NHf-AJTu-9ImE-SHwLz6
  LV Write Access       read/write
  LV Status             available
  # open                0
  LV Size               3.64 TB
  Current LE            953668
  Segments              1
  Allocation            inherit
  Read ahead sectors    16384
  - currently set to    256
  Block device          253:0

  --- Physical volumes ---
  PV Name               /dev/sdc1
  PV UUID               8D67bX-xg4s-QRy1-4E8n-XfiR-0C2r-Oi1Blf
  PV Status             allocatable
  Total PE / Free PE    953668 / 0

#lvscan
  ACTIVE            '/dev/vg_data/lv_home' [3.64 TB] inherit

#partprobe -s
/dev/sda: msdos partitions 1 2 3 4 <5 6 7 8 9 10>
/dev/sdb: msdos partitions 1 2 3 4 <5 6 7 8 9 10>
/dev/sdc: gpt partitions 1

#dmsetup table
vg_data-lv_home:

#dmsetup ls
vg_data-lv_home (253, 0)

#lvdisplay -m
  --- Logical volume ---
  LV Name               /dev/vg_data/lv_home
  VG Name               vg_data
  LV UUID               13TmTm-YqIo-6xIp-1NHf-AJTu-9ImE-SHwLz6
  LV Write Access       read/write
  LV Status             available
  # open                0
  LV Size               3.64 TB
  Current LE            953668
  Segments              1
  Allocation            inherit
  Read ahead sectors    16384
  - currently set to    256
  Block device          253:0

  --- Segments ---
  Logical extent 0 to 953667:
    Type                linear
    Physical volume     /dev/sdc1
    Physical extents    0 to 953667

Here is a link to files outputted by lvmdump:
https://www.dropbox.com/sh/isg4fdmthiyoszh/tyYOfqllya
Peter Rajnoha
2014-04-07 13:22:49 UTC
Permalink
Post by Ryan Davis
mount: wrong fs type, bad option, bad superblock on /dev/vg_data/lv_home,
missing codepage or other error
(could this be the IDE device where you in fact use
ide-scsi so that sr0 or sda or so is needed?)
In some cases useful info is found in syslog - try
dmesg | tail or so
EXT4-fs (dm-0): unable to read superblock
That's because the LV, which is represented by a device-mapper
mapping, doesn't have a proper table loaded (as you already
mentioned later). Such a device is unusable until a proper
table is loaded...
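
A quick way to see that state, using the names from the outputs in this thread
(an empty "dmsetup table" output means no table is loaded for the mapping):

dmsetup table vg_data-lv_home   # prints nothing while no table is loaded
dmsetup info vg_data-lv_home    # shows the device, but with no table present
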
Post by Ryan Davis
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
488292352 inodes, 976555199 blocks
48827759 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
29803 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632,
2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848, 512000000, 550731776, 644972544
Oh! Don't use the PV directly (the /dev/sdc1); always use the
LV on top (/dev/vg_data/lv_home), otherwise you'll destroy the PV.
(Here you used "-n", so fortunately it didn't do anything to the PV.)
Post by Ryan Davis
Is the superblock issue causing the lvm issues?
Thanks for any input you might have.
We need to see why the table load failed for the LV.
That's the exact problem here.
Post by Ryan Davis
#vgs
VG #PV #LV #SN Attr VSize VFree
vg_data 1 1 0 wz--n- 3.64T 0
#lvs
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
lv_home vg_data -wi-d- 3.64T
Looks like I have a mapped device present without tables (d) attribute.
#pvs
PV VG Fmt Attr PSize PFree
/dev/sdc1 vg_data lvm2 a-- 3.64T 0
#ls /dev/vg_data
lv_home
#vgscan --mknodes
Reading all physical volumes. This may take a while...
Found volume group "vg_data" using metadata type lvm2
#pvscan
PV /dev/sdc1 VG vg_data lvm2 [3.64 TB / 0 free]
Total: 1 [3.64 TB] / in use: 1 [3.64 TB] / in no VG: 0 [0 ]
#vgchange -ay
1 logical volume(s) in volume group "vg_data" now active
device-mapper: ioctl: error adding target to table
#dmesg |tail
device-mapper: table: device 8:33 too small for target
device-mapper: table: 253:0: linear: dm-linear: Device lookup failed
device-mapper: ioctl: error adding target to table
The 8:33 is /dev/sdc1, which is the PV used.
What's the actual size of /dev/sdc1?
Try "blockdev --getsz /dev/sdc1" and see what the size is.
--
Peter
Ryan Davis
2014-04-09 16:07:26 UTC
Permalink
Thanks for explaining some of the aspects of LVs. I have used them for years,
but it's not until they break that you start reading more into them.

Here is the block device size of sdc1:

[***@hobbes ~]# blockdev --getsz /dev/sdc1
7812441596

Here is the output of pvs -o pv_all /dev/sdc1:

  Fmt  PV UUID                                DevSize PV        PMdaFree PMdaSize 1st PE  PSize PFree Used  Attr PE     Alloc  PV Tags #PMda #PMdaUse
  lvm2 8D67bX-xg4s-QRy1-4E8n-XfiR-0C2r-Oi1Blf 3.64T   /dev/sdc1 92.50K   188.00K  192.00K 3.64T 0     3.64T a--  953668 953668         1     1

Thanks for the support!

Ryan
Peter Rajnoha
2014-04-10 14:10:43 UTC
Permalink
Post by Ryan Davis
Thanks for explaining some of the aspects of LVs. Used them for years
but it's not until they break that I started reading more into it.
7812441596
Here is the output of pvs -o pv_all /dev/sdc1
Fmt PV UUID DevSize PV PMdaFree PMdaSize 1st PE PSize PFree Used Attr PE
Alloc PV Tags #PMda #PMdaUse lvm2 8D67bX-xg4s-QRy1-4E8n-XfiR-0C2r-Oi1Blf
3.64T /dev/sdc1 92.50K 188.00K 192.00K 3.64T 0 3.64T a-- 953668 953668 1 1
So we have 953668 extents of 4 MiB each, which is 7812448256 sectors
(512-byte sectors). Then we need to add the PE start value, which is 192 KiB
(384 sectors), so the original device size at the time this PV was created was
7812448256 + 384 = 7812448640 sectors.

The difference from the current device size reported is:

7812441596 - 7812448640 = -7044 sectors

So the disk drive is now about 3.44 MiB shorter for some reason.
That's why the LV does not fit here.

I can't tell you why this happened exactly. But that's what the
sizes show.
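
For the record, the same arithmetic as a quick shell check (the numbers are
the ones from the pvs output above):

# 953668 extents * 4 MiB each, expressed in 512-byte sectors
echo $(( 953668 * 4 * 1024 * 1024 / 512 ))   # 7812448256
# add pe_start (192 KiB = 384 sectors) to get the original device size
echo $(( 7812448256 + 384 ))                 # 7812448640
# compare with what blockdev --getsz reports now
echo $(( 7812441596 - 7812448640 ))          # -7044 sectors, ~3.44 MiB missing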

What you can do here to fix this is to resize your filesystem/LV/PV accordingly.
If we know that it's just one extent, we can do the following:

- if it's possible, do a backup of the disk content!!!
- double check that it's really still /dev/sdc1, as during reboots
  it can be assigned a different name by the kernel

1. you can check which LV is mapped onto the PV by issuing
   pvdisplay --maps /dev/sdc1

2. then deactivate one LV found on that PV (if there are more LVs mapped
   on the PV, choose the LV that is mapped at the end of the disk, since
   it's more probable that the disk is shorter at the end when compared
   to the original size)
   lvchange -an <the_LV_found_on_the_PV>

3. then reduce the LV size by one extent (1 should be enough since the
   PV is shorter by 3.44 MiB), *also* resizing the filesystem
   that's on the LV!!! (this is the "-f" option for lvreduce, it's
   very important!!!)
   lvreduce -f -l -1 <the_LV_found_on_the_PV>

4. then bring the PV size in sync with the actual device size by calling:
   pvresize /dev/sdc1

5. now activate the LVs you deactivated in step 2.
   lvchange -ay <the_LVs_found_on_the_PV>

Note that this will only work if it's possible to resize the filesystem
and the LV data is not fully allocated! (in which case you have probably
lost some data already)

Take this as a hint only and be very, very careful when doing this,
as you may lose data if it is done incorrectly!

I'm not taking responsibility for any data loss.

If you have any more questions, feel free to ask.
--
Peter
Peter Rajnoha
2014-04-10 14:14:38 UTC
Permalink
Post by Peter Rajnoha
3. then reduce the LV size by one extent (1 should be enough since the
PV is shorter with 3.44 MiB) *also* with resizing the filesystem
that's on the LV!!! (this is the "-f" option for the lvreduce, it's
very important!!!)
lvreduce -f -l -1 <the_LV_found_on_the_PV>
!!!!
Sorry, I meant "-r" option instead of "-f"!!!
!!!!

The -f is the dangerous one - force!!!
The -r is the "resizefs"!!!
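
So step 3 should have read:

lvreduce -r -l -1 <the_LV_found_on_the_PV>
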
--
Peter
Ryan Davis
2014-04-18 18:23:21 UTC
Permalink
Hi Peter,

I made a backup copy of /dev/sdc using dd as you suggested. It took a
while to get hold of a drive to copy this data to.

I started the resizing of the LV today, hit a few snags, and was
wondering if you could shed some light on what is going on.

#pvdisplay --maps /dev/sdc1

--- Physical volume ---
PV Name /dev/sdc1
VG Name vg_data
PV Size 3.64 TB / not usable 3.97 MB
Allocatable yes (but full)
PE Size (KByte) 4096
Total PE 953668
Free PE 0
Allocated PE 953668
PV UUID 8D67bX-xg4s-QRy1-4E8n-XfiR-0C2r-Oi1Blf

--- Physical Segments ---
Physical extent 0 to 953667:
Logical volume /dev/vg_data/lv_home
Logical extents 0 to 953667


Is the "Allocatable yes (but full)" a deal breaker?

# lvchange -an /dev/vg_data/lv_home

[***@hobbes ~]# lvscan
inactive '/dev/vg_data/lv_home' [3.64 TB] inherit

[***@hobbes ~]# lvreduce -r -l -1 /dev/vg_data/lv_home
Logical volume lv_home must be activated before resizing filesystem

[***@hobbes ~]# pvresize /dev/sdc1
/dev/sdc1: cannot resize to 953667 extents as 953668 are allocated.
0 physical volume(s) resized / 1 physical volume(s) not resized


What should I do now? Did I miss something along the way?

Thanks for the help once again!

Ryan

Peter Rajnoha
2014-04-22 11:14:20 UTC
Permalink
Post by Ryan Davis
Hi Peter,
I made a backup copy of /dev/sdc using dd as you suggested. This took a
while to get a drive to copy this data to.
I started today with the resizing of the LV and hit a few snags and was
wondering if you could shine some light on what is going on.
#pvdisplay --maps /dev/sdc1
--- Physical volume ---
PV Name /dev/sdc1
VG Name vg_data
PV Size 3.64 TB / not usable 3.97 MB
Allocatable yes (but full)
PE Size (KByte) 4096
Total PE 953668
Free PE 0
Allocated PE 953668
PV UUID 8D67bX-xg4s-QRy1-4E8n-XfiR-0C2r-Oi1Blf
--- Physical Segments ---
Logical volume /dev/vg_data/lv_home
Logical extents 0 to 953667
Is the Allocatable yes (but Full) a deal breaker?
That's OK and it's not a stopper here...
Post by Ryan Davis
# lvchange -an /dev/vg_data/lv_home
inactive '/dev/vg_data/lv_home' [3.64 TB] inherit
Logical volume lv_home must be activated before resizing filesystem
Ah, I see. Well, we have to do that a bit more manually then...
Post by Ryan Davis
/dev/sdc1: cannot resize to 953667 extents as 953668 are allocated.
0 physical volume(s) resized / 1 physical volume(s) not resized
What should I do now? Did I miss something along the way?
This should do the job then (I've tried this exact sequence on my machine):

- deactivate the problematic LV:
lvchange -an vg_data/lv_home

- reduce the size of the LV by 1 extent:
lvreduce -l -1 vg_data/lv_home

- make the PV size to be in sync with the real device size:
pvresize /dev/sdc1

- activate the LV:
lvchange -ay vg_data/lv_home

- run fsck for the FS on the LV:
e2fsck -f -n /dev/vg_data/lv_home

- resize the FS to be in sync with the new LV size:
fsadm resize /dev/vg_data/lv_home

- check the resized filesystem:
e2fsck /dev/vg_data/lv_home
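
- as a quick check afterwards, the "dmsetup table" output that was empty
before should now show a linear target, and the mount can be retried:
dmsetup table vg_data-lv_home
mount -t ext4 /dev/vg_data/lv_home /home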

Let me know if it works for you.
--
Peter
Ryan Davis
2014-04-22 18:43:55 UTC
Permalink
Hi Peter,


Thanks for the support.

Everything ran smoothly until I ran fsck on the FS on the LV. It's
complaining about a bad superblock:

[***@hobbes ~]# e2fsck -f -n /dev/vg_data/lv_home
e2fsck 1.39 (29-May-2006)
e2fsck: Filesystem has unsupported feature(s) while trying to open
/dev/vg_data/lv_home

The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>


Should I now run mke2fs -n on the LV (#mke2fs -n /dev/vg_data/lv_home) to list
the backup superblock locations, and then try the different superblocks?
I also found this:
http://docs.oracle.com/cd/E19455-01/805-7228/6j6q7uf0i/index.html
Zdenek Kabelac
2014-04-23 07:59:54 UTC
Permalink
Post by Ryan Davis
Hi Peter,
Thanks for the support.
Everything ran smooth until I did a fsck on the FS on the LV. It's
complaining about a bad superblock
Saying that something ran smoothly here is somewhat pointless...


Looking at your lvmdump --


pv0 {
id = "8D67bX-xg4s-QRy1-4E8n-XfiR-0C2r-Oi1Blf"
device = "/dev/sdc1" # Hint only

status = ["ALLOCATABLE"]
flags = []
dev_size = 7812456381 # 3.63796 Terabytes
pe_start = 384
pe_count = 953668 # 3.63795
}


This is how your PV looked when you created your VG.

Obviously your device /dev/sdc1 had 7812456381 sectors.
(Very strange to have an odd number here....)

Later you reported # blockdev --getsz /dev/sdc1 as 7812441596

So we MUST start with you telling us what you did to your
system such that your device is suddenly 14785 sectors shorter (~8 MB).

Have you reconfigured your /dev/sdc device?
Is it a HW RAID5 device?
Have you repartitioned/resized it (fdisk, gparted)?

We can't move forward without knowing the exact root of your problem.

Everything else is a pointless waste of time, since we would just be hunting
for random pieces of information.

I just hope you have not tried to play directly with your /dev/sdc device
(since in some emails it seems you tried to execute various commands directly
on this device).

Zdenek
Ryan Davis
2014-04-23 16:56:00 UTC
Permalink
Hi Zdenek,

The last thing I want to do is waste people's time. I do appreciate you
wanting to know what caused this in the first place. Even if we got the
data to mount (/home), I would like to know what caused this so that I could
be aware of it and prevent it from happening again.

I was running some analysis tools on some genomic data stored on the LV. I
checked the capacity of the LV with #df -h, realized that /home was 99% full,
and proceeded to delete some folders while the analysis was running.
At the end of the day /home was at 93% full.
I shut down the system and then physically moved the server to a new location,
and upon booting the system for the first time in the new location I
received the following error when it tried to mount /dev/vg_data/lv_home:

The superblock could not be read or does not describe
a correct ext2fs.
device-mapper: reload ioctl failed invalid argument

The system dumped me to a rescue prompt and I looked at dmesg:

device-mapper table device 8:33 too small for target
device-mapper 253:0 linear dm-linear device lookup failed
device-mapper ioctl error adding target to table

I then contacted the manufacturer who set up the server. We booted the
system using a live CD (CentOS 6.3) and commented out the mounting of /home.
They had me issue the following commands:

pvdisplay
vgdisplay
lvdisplay

They then had me run the mount, dmesg, fsck.ext4 -v, and mke2fs -n commands
whose output I reported in the initial post above.

At this point they didn't know what to do and told me the filesystem was
probably beyond repair. This is when I posted to this mailing list.
Post by Zdenek Kabelac
Obviously your device /dev/sdc1 had 7812456381 sectors.
(Very strange to have odd number here....)
This was set up by the manufacturer.
Post by Zdenek Kabelac
So we MUST start from the moment you tell us what you did to your system
that suddenly your device is 14785 blocks shorter (~8MB) ?

Hopefully the information above fills you in. If not, I am not sure what
happened.

Have you reconfigured your /dev/sdc device?
No

Is it HW raid5 device ?
This is a hardware RAID5.


/home is controlled by a 3ware card:



Unit  UnitType  Status   %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-5    OK       -       -       256K    3725.27   RiW    ON
u1    SPARE     OK       -       -       -       1863.01   -      OFF

VPort Status   Unit  Size     Type  Phy  Encl-Slot  Model
------------------------------------------------------------------------------
p0    OK       u0    1.82 TB  SATA  0    -          WDC WD2000FYYZ-01UL
p1    OK       u1    1.82 TB  SATA  1    -          WDC WD2002FYPS-01U1
p2    OK       u0    1.82 TB  SATA  2    -          WDC WD2002FYPS-01U1
p3    OK       u0    1.82 TB  SATA  3    -          WDC WD2002FYPS-01U1
Post by Zdenek Kabelac
Have you repartitioned/resized it (fdisk,gparted) ?
No, just did some fdisk -l
Post by Zdenek Kabelac
I just hope you have not tried to play directly with your /dev/sdc device
(Since in some emails it seems you try to execute various command directly
on this device)

Besides the commands above and those mentioned in these posts, I have not tried
anything on /dev/sdc1.


I have had issues with the RAID5 in the past with bad drives. Could
something have happened during the shutdown since the issues arose after
that?

Thanks for the support!

Ryan

Zdenek Kabelac
2014-04-24 09:38:33 UTC
Permalink
Post by Ryan Davis
Hi Zdenek,
I was running some analysis tools on some genomic data stored on the LV. I
Oops - looks like the Nobel prize won't be coming this year?
Post by Ryan Davis
I shutdown the system and then physically moved the server to a new location
and upon booting up the system for the first time in the new location I
So you physically took the whole machine 'as-is' (with all cables, disks...),
you haven't touched anything inside the box - you just basically plugged it
into a different power socket?

There was no machine hw upgrade/change and no software upgrade in the meantime, right?
Post by Ryan Davis
The superblock could not be read or does not describe
a correct ext2fs.
device-mapper: reload ioctl failed invalid argument
Since the lvmdump contained only a very tiny portion of your
/var/log/messages - could you grab a bigger piece? (I assume it's been rotated.)

Package it and post a link so I can check myself what has been logged.
Ideally make sure the log contains all info from the last successful boot and
/home mount (which might be a long time ago if you do not reboot the machine often).
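
One possible way to package them (the paths assume the default CentOS log
layout - adjust as needed):

# bundle the current and rotated copies of the messages log into one archive
tar czf /tmp/messages-logs.tar.gz /var/log/messages*
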
Post by Ryan Davis
device-mapper table device 8:33 too small for target
This is the crucial error message - lvm2 has detected a major problem:
the PV got smaller and can't be used - the admin needs to resolve the problem.

I've already seen some cheap RAID5 arrays that are able to demolish themselves
easily via a reshape.

I assume it will be mandatory to collect the messages related to the output of
your hardware card - there will most likely be some message pointing to
the moment when the array went mad...
Post by Ryan Davis
[root hobbes ~]# mount -t ext4 /dev/vg_data/lv_home /home
mount: wrong fs type, bad option, bad superblock on /dev/vg_data/lv_home,
Surely this will not work.
Post by Ryan Davis
[root hobbes ~]# fsck.ext4 -v /dev/sdc1
Definitelly can't be used this way - LV does it's own mapping and
has it own disk headers - so you would need to use proper offset
to start at right place here.
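
A rough illustration of what "proper offset" means here - this assumes the
single linear segment shown earlier (pe_start at 192 KiB = 196608 bytes), and
it should only be tried read-only, ideally against the dd backup copy rather
than the live disk:

# expose the LV's data area, which starts 192 KiB into the PV, as its own device
losetup -r -o 196608 /dev/loop0 /dev/sdc1   # -r = read-only; drop it if your losetup lacks it
fsck.ext4 -n /dev/loop0                     # read-only filesystem check
losetup -d /dev/loop0                       # detach the loop device when done
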
Post by Ryan Davis
Post by Zdenek Kabelac
Obviously your device /dev/sdc1 had 7812456381 sectors.
(Very strange to have odd number here....)
This was setup by manufacturer
I'm afraid the problem here is not LVM - but rather the HW array.

The next thing you should capture and post a link to is this:

grab the first MB of your /dev/sdc1:

dd if=/dev/sdc1 of=/tmp/grab bs=1M count=1

And send it - it should contain the ring buffer with lvm2 metadata,
and it might give us some clue about how your disks actually look,
since when an array goes 'mad' it typically destroys data - so
we would see garbage in place of the ring buffer instead of a sequence of
metadata (though you only have a few entries there, which might not be enough
to make any judgement...) - but anyway, worth a try...
Post by Ryan Davis
Post by Zdenek Kabelac
So we MUST start from the moment you tell us what you did to your system
that suddenly your device is 14785 blocks shorter (~8MB) ?
I just doubt those 14785 sectors were lost from the end of your drive - the
missing sectors could be located across your whole device
(has the array changed its geometry??).

Then you have to contact your HW RAID5 provider - to do their analysis
(since their format is typically proprietary - which is, IMHO, the major reason
never to use it.....)

I assume the key thing for a solution is to 'restore' the original geometry of
your RAID5 array (without reinitializing the array),
so that the size will again report the original value.
Post by Ryan Davis
Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
----------------------------------------------------------------------------
--
u0 RAID-5 OK - - 256K 3725.27 RiW ON
u1 SPARE OK - - - 1863.01 - OFF
VPort Status Unit Size Type Phy Encl-Slot Model
----------------------------------------------------------------------------
--
p0 OK u0 1.82 TB SATA 0 - WDC WD2000FYYZ-01UL
p1 OK u1 1.82 TB SATA 1 - WDC WD2002FYPS-01U1
p2 OK u0 1.82 TB SATA 2 - WDC WD2002FYPS-01U1
p3 OK u0 1.82 TB SATA 3 - WDC WD2002FYPS-01U1
Post by Zdenek Kabelac
Have you repartitioned/resized it (fdisk,gparted) ?
No, just did some fdisk -l
Post by Zdenek Kabelac
I just hope you have not tried to play directly with your /dev/sdc device
(Since in some emails it seems you try to execute various command directly
on >this device)
Besides the commands above and mentioned in these posts I have not tried
anything on /dev/sdc1.
I have had issues with the RAID5 in the past with bad drives. Could
something have happened during the shutdown since the issues arose after
that?
Yes - replacing an invalid drive might have had an impact on the RAID5 array
geometry - but, as said above, the 3ware provider and its support team need to
do the analysis here.

Zdenek

Ryan Davis
2014-04-10 15:35:30 UTC
Permalink
Thank you so much for the help. I will work through your pipeline, but first I
want to try to back some of the data up. I have about 80% backed up from a week
or so ago. How would one go about backing it up without being able to mount this?
Sorry if that is a dumb question.

Thank you again for figuring this out.

Ryan
Peter Rajnoha
2014-04-10 16:40:47 UTC
Permalink
Post by Ryan Davis
Thank you so much for the help. I will work through your pipeline but first want to try to back some of the data up. I have about 80% backed up from a week or so ago. How would one go about backing it up without being able to mount this. Sorry if that is dumb question.
You can use "dd" command for raw disk copy.
--
Peter