Discussion: [linux-lvm] broken fs after removing disk from group
Marc des Garets
2014-11-12 22:16:13 UTC
Hi,

I messed up a bit and I am trying to find the best way to recover.

A few days ago, one of the physical disks in my LVM setup started to show
signs of failure (I/O errors), so I decided to move its data to another
disk with pvmove. That didn't work out: after 5 days pvmove had completed
only 0.1%, so I stopped it.

After a reboot the dying disk wouldn't show up at all; it had died
completely, so I decided to remove it with:

vgreduce --removemissing --force VolGroup00

The problem is that it refused to do so because of the pending pvmove,
saying the LV was locked. I tried pvmove --abort, which also refused
because of the missing, dead disk.

So I was stuck and did: vgcfgbackup VolGroup00

Then I edited the file, removed the entry about the pvmove, and tried
vgcfgrestore VolGroup00, which refused to restore because of the missing
disk, so I edited the file again, removed the missing disk from it as
well, and ran vgcfgrestore, which succeeded.
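In rough outline (a sketch of the sequence, assuming the default backup
location under /etc/lvm/backup):

vgcfgbackup VolGroup00           # dumps current metadata to /etc/lvm/backup/VolGroup00
vi /etc/lvm/backup/VolGroup00    # drop the pvmove segments, then the missing PV
vgcfgrestore VolGroup00          # write the edited metadata back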

Now the problem is that I can't mount my volume because it says:
wrong fs type, bad option, bad superblock

Which makes sense, as the size of the volume is supposed to be about
2.4 TB but it now has only 2.2 TB. The question is how do I fix this.
Should I use a tool like testdisk, or should I be able to somehow create
a new physical volume / volume group where I can add my logical volume,
which now spans 2 physical disks, and somehow get the file system right
(the file system is ext4)?

pvdisplay output:

--- Physical volume ---
PV Name               /dev/sda4
VG Name               VolGroup00
PV Size               417.15 GiB / not usable 4.49 MiB
Allocatable           yes (but full)
PE Size               4.00 MiB
Total PE              106789
Free PE               0
Allocated PE          106789
PV UUID               dRhDoK-p2Dl-ryCc-VLhC-RbUM-TDUG-2AXeWQ

--- Physical volume ---
PV Name               /dev/sdb1
VG Name               VolGroup00
PV Size               1.82 TiB / not usable 4.97 MiB
Allocatable           yes (but full)
PE Size               4.00 MiB
Total PE              476923
Free PE               0
Allocated PE          476923
PV UUID               MF46QJ-YNnm-yKVr-pa3W-WIk0-seSr-fofRav


Thank you for your help.


Marc
Fran Garcia
2014-11-12 23:11:23 UTC
Post by Marc des Garets
Hi,
[...]
wrong fs type, bad option, bad superblock
Which makes sense as the size of the partition is supposed to be 2.4Tb but
now has only 2.2Tb. Now the question is how do I fix this? Should I use a
tool like testdisk or should I be able to somehow create a new physical
volume / volume group where I can add my logical volumes which consist of 2
physical disks and somehow get the file system right (file system is ext4)?
So you basically need a tool that will "invent" about 200 *Gb* of
missing filesystem? :-)

I think you better start grabbing your tapes for a restore...

~f
Marc des Garets
2014-11-13 07:21:06 UTC
I think something is possible. I still have the configuration from
before the disk died; below is how it looked. The disk that died (and
which I removed) is pv1 (/dev/sdc1), but vgcfgrestore doesn't want to
restore this config because it says the disk is missing.

VolGroup00 {
    id = "a0p2ke-sYDF-Sptd-CM2A-fsRQ-jxPI-6sMc9Y"
    seqno = 4
    format = "lvm2"             # informational
    status = ["RESIZEABLE", "READ", "WRITE"]
    flags = []
    extent_size = 8192          # 4 Megabytes
    max_lv = 0
    max_pv = 0
    metadata_copies = 0

    physical_volumes {

        pv0 {
            id = "dRhDoK-p2Dl-ryCc-VLhC-RbUM-TDUG-2AXeWQ"
            device = "/dev/sda4"        # Hint only

            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 874824678        # 417.149 Gigabytes
            pe_start = 2048
            pe_count = 106789           # 417.145 Gigabytes
        }

        pv1 {
            id = "NOskcl-8nOA-PpZg-DCtW-KQgG-doKw-n3J9xd"
            device = "/dev/sdc1"        # Hint only

            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 625142385        # 298.091 Gigabytes
            pe_start = 2048
            pe_count = 76311            # 298.09 Gigabytes
        }

        pv2 {
            id = "MF46QJ-YNnm-yKVr-pa3W-WIk0-seSr-fofRav"
            device = "/dev/sdb1"        # Hint only

            status = ["ALLOCATABLE"]
            flags = []
            dev_size = 3906963393       # 1.81932 Terabytes
            pe_start = 2048
            pe_count = 476923           # 1.81932 Terabytes
        }
    }

    logical_volumes {

        lvolmedia {
            id = "aidfLk-hjlx-Znrp-I0Pb-JtfS-9Fcy-OqQ3EW"
            status = ["READ", "WRITE", "VISIBLE"]
            flags = []
            creation_host = "archiso"
            creation_time = 1402302740  # 2014-06-09 10:32:20 +0200
            segment_count = 3

            segment1 {
                start_extent = 0
                extent_count = 476923   # 1.81932 Terabytes

                type = "striped"
                stripe_count = 1        # linear

                stripes = [
                    "pv2", 0
                ]
            }
            segment2 {
                start_extent = 476923
                extent_count = 106789   # 417.145 Gigabytes

                type = "striped"
                stripe_count = 1        # linear

                stripes = [
                    "pv0", 0
                ]
            }
            segment3 {
                start_extent = 583712
                extent_count = 76311    # 298.09 Gigabytes

                type = "striped"
                stripe_count = 1        # linear

                stripes = [
                    "pv1", 0
                ]
            }
        }
    }
}
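(For reference, the three segments total 476923 + 106789 + 76311 = 660023
extents; at 4 MiB per extent that is about 2.52 TiB, the full logical
volume. Dropping pv1's 76311 extents leaves 583712 extents, about
2.22 TiB, which is why the reduced volume is roughly 300 GB too small
for its filesystem.)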
Post by Fran Garcia
[...]
Marc des Garets
2014-11-13 09:47:42 UTC
For example, what if I take a new disk and do this (/dev/sdc being a
new, empty disk):

pvcreate --uuid NOskcl-8nOA-PpZg-DCtW-KQgG-doKw-n3J9xd /dev/sdc

NOskcl-8nOA-PpZg-DCtW-KQgG-doKw-n3J9xd is the UUID of the disk that died.
This new disk is 1.8 TB instead of 298 GB, though.

Then I restore the LVM metadata I posted in my previous email, followed
by vgscan and vgchange, like this:
vgcfgrestore VolGroup00
vgscan
vgchange -ay VolGroup00

And then I run fsck:
e2fsck /dev/VolGroup00/lvolmedia
Post by Marc des Garets
[...]
Marc des Garets
2014-11-13 19:27:45 UTC
Obviously the data on the bad disk is gone, but can you explain why the
entire file system would be gone as well?

I did what I described in my previous email and so far it has worked
pretty well. The idea came from this:
https://www.novell.com/coolsolutions/appnote/19386.html#DiskPermanentlyRemoved

So I took a disk and ran: pvcreate --uuid
NOskcl-8nOA-PpZg-DCtW-KQgG-doKw-n3J9xd --restorefile
VolGroup00_00001-16738001.vg /dev/sdc1

The restore file being the config from before the disk died.

Then I ran vgcfgrestore with VolGroup00_00001-16738001.vg, followed by
vgscan and vgchange -ay VolGroup00.

All of the above went well, exactly as in the novell.com link.
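To recap, the sequence was essentially (a sketch of the steps described
above, assuming vgcfgrestore was pointed at the backup file with -f):

pvcreate --uuid NOskcl-8nOA-PpZg-DCtW-KQgG-doKw-n3J9xd \
         --restorefile VolGroup00_00001-16738001.vg /dev/sdc1
vgcfgrestore -f VolGroup00_00001-16738001.vg VolGroup00
vgscan
vgchange -ay VolGroup00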

pvdisplay shows the 3 disks exactly as before the one died, but e2fsck
(fsck.ext4) tells me:

The filesystem size (according to the superblock) is 675863552 blocks
The physical size of the device is 597721088 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>?

That's what I don't understand: shouldn't the size of the device now be
back to 675863552 blocks, so that I can run fsck without getting this
warning?
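(Doing the arithmetic: 675863552 - 597721088 = 78142464 blocks; assuming
the default 4 KiB ext4 block size, that is about 298 GiB, exactly the
size of the dead PV. In extents, the filesystem expects 660023 while the
device only provides 583712, i.e. the active device apparently still
reflects the reduced two-PV mapping at this point.)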

Thanks.
Dude seriously? Any data that was on the bad disk is gone including
the ENTIRE file system if any of it resided on said disk.
Moral of the story: use better disks, and don't spread file systems
across multiple devices.
[...]
matthew patton
2014-11-13 19:48:18 UTC
Post by Marc des Garets
https://www.novell.com/coolsolutions/appnote/19386.html#DiskPermanentlyRemoved
pvdisplay shows the 3 disks exactly like before I had the one that
The filesystem size (according to the superblock) is 675863552 blocks
The physical size of the device is 597721088 blocks
what about lvdisplay? Clearly the replacement chunk from the new device isn't the correct size. It's short by 78142464 filesystem blocks, which at the default 4 KiB ext4 block size is roughly 298 GiB, i.e. the size of the PV that died.

Is the VG configuration restore file human readable?
Marc des Garets
2014-11-13 20:04:36 UTC
Yes, it's human readable. It's the one I sent in my previous email.

lvdisplay gives the following:
--- Logical volume ---
LV Path                /dev/VolGroup00/lvolmedia
LV Name                lvolmedia
VG Name                VolGroup00
LV UUID                aidfLk-hjlx-Znrp-I0Pb-JtfS-9Fcy-OqQ3EW
LV Write Access        read/write
LV Creation host, time archiso, 2014-06-09 10:32:20 +0200
LV Status              suspended
# open                 0
LV Size                2.52 TiB
Current LE             660023
Segments               3
Allocation             inherit
Read ahead sectors     auto
- currently set to     256
Block device           254:0
Post by matthew patton
[...]
Marc des Garets
2014-11-14 08:42:34 UTC
Actually, I am now even able to mount the volume and get to the data,
but I can't run fsck on it because of the mismatch in block counts
between the physical device size and the filesystem size. I really don't
feel like copying all my data off just to recreate the LVM from
scratch...

I tried what was suggested here:
http://sourceforge.net/p/e2fsprogs/discussion/7053/thread/c9e05785/

debugfs -w /dev/sda1
debugfs: set_super_value blocks_count 597721088
debugfs: quit

I had to compile e2fsprogs as explained in the thread, but it didn't
work out anyway. All I gained from doing this is another warning when
running fsck:

ext2fs_open2: The ext2 superblock is corrupt
fsck.ext4: Superblock invalid, trying backup blocks...

I guess it is happy with the backup superblock, but it still complains
about the mismatch in block counts (presumably because changing
blocks_count alone leaves the primary superblock inconsistent with the
rest of the filesystem metadata).
Post by Marc des Garets
[...]
Jack Waterworth
2014-11-14 17:12:10 UTC
If you get anything back from this file-system I would be impressed.
Losing 200GB of a file-system isn't something small that can just be
"made up" to restore the data. The only options I can think of would be
to (1) restore from backup, (2) copy as much data as you can off onto a
new location, or (3) fsck. #1 is probably the best in your situation;
however, if that's not an option you should try to save as much data as
possible (#2).

An fsck should be able to repair the file-system back into a "good"
state, but you'll likely lose a lot of data. Can you show us the full
error output you're getting when you try to fsck?

Also, keep in mind that this is an LVM mailing list, not a filesystem
one. Perhaps you could get some help in #ext4 on the OFTC IRC network.

Jack
Post by Marc des Garets
ext2fs_open2: The ext2 superblock is corrupt
Marc des Garets
2014-11-14 19:16:22 UTC
I got a lot back from this file system. I only lost what was on the
dead disk, and by chance there wasn't much data on it; most is on the
other two. I already know what I lost, and I was lucky: it's not very
important, just some GC heap dumps and a backup of a phone SD card.

I have now made a backup of what is important, so I'll be trying fsck.
I hope it works, because I don't feel like rebuilding my LVM, copying
everything over again, and losing what I consider unimportant.

Thanks.
Post by Jack Waterworth
[...]
Marc des Garets
2014-11-14 19:30:58 UTC
I am in luck: after a reboot, fsck no longer complains that the physical
device size differs from the filesystem size, so it is now running and I
believe it will succeed. I expected the reboot would sort this out once
I saw that lvdisplay was reporting the right size but fdisk -l was not.
I can still mount the partition and access the files, so I now expect
fsck to get rid of the files that were on the dead disk, which are
obviously giving I/O errors now.
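(For what it's worth, the reboot presumably just reloaded the stale
device-mapper table; deactivating and reactivating the volume group,
after unmounting, would likely have had the same effect.)

umount /dev/VolGroup00/lvolmedia
vgchange -an VolGroup00
vgchange -ay VolGroup00     # or: lvchange --refresh VolGroup00/lvolmedia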
Post by Marc des Garets
[...]
Marc des Garets
2014-11-14 20:54:10 UTC
It's all sorted: fsck did the job, and the only thing I lost is what was
on the dead disk :-)
Post by Marc des Garets
[...]