[linux-lvm] fix corrupted thin pool
Vasiliy Tolstov
2014-10-24 19:59:20 UTC
Hello! By mistake I restored a thin volume with vgcfgrestore; after that I
get errors on this thin pool for all volumes, like:
lvchange -ay vg1/2735
Thin pool transaction_id=120, while expected: 114.
Is it possible to recover from this? I tried lvconvert --recover
and got tp1_tmeta0, but I don't understand what I need to do next.
--
Vasiliy Tolstov,
e-mail: ***@selfip.ru
jabber: ***@selfip.ru
Zdenek Kabelac
2014-10-25 12:43:09 UTC
Post by Vasiliy Tolstov
Hello! By mistake I restored a thin volume with vgcfgrestore; after that I
get errors on this thin pool for all volumes, like:
lvchange -ay vg1/2735
Thin pool transaction_id=120, while expected: 114.
Is it possible to recover from this? I tried lvconvert --recover
and got tp1_tmeta0, but I don't understand what I need to do next.
Hi

I'm not sure how you could do that 'by mistake', since LVM prints a pretty
BIG WARNING that any vgcfgrestore with thin volumes should be done only after
careful thought, and it even requires an extra --force option.

But anyway, if you have /etc/lvm/archive, you should probably be able to
find the 'right' version of the lvm2 metadata for your kernel metadata.

However, 'normally' you can be off in the sequence number only by one! So
I'm quite curious how you managed to create such a big difference.

If you can, package /etc/lvm/archive so I can take a closer look at where
lvm2 has holes that allow such operations.

Which versions of lvm2 and the kernel are in use here?

Have you been manipulating the thin pool's metadata in any way?

Regards

Zdenek
Vasiliy Tolstov
2014-10-25 18:41:39 UTC
Post by Zdenek Kabelac
I'm not sure how you could do that 'by a mistake' since LVM is printing
pretty BIG WARNING that any vgcfgrestore with thin should be done after big
thinking and requires even extra --force option.
But anyway - if you have /etc/lvm/archive - you should probably be able to
find the 'right' version of lvm2 metadata for your kernel metadata.
However 'normally' you could be off the sequence number only by one! so
I'm quite curious what you've been able to make such big difference.
If you could - package /etc/lvm/archive so I could get closer look where
the lvm2 has holes to allow such operations ?
Which version of lvm2 and kernel is here in use ?
Have you been manipulating with thin-pool's metadata in any way ?
Regards
Zdenek
I can't provide the old archive data =(. Now I only have this error.
Also, in lvm.conf I have issue_discards = 1.
--
Vasiliy Tolstov,
e-mail: ***@selfip.ru
jabber: ***@selfip.ru
Vasiliy Tolstov
2014-10-25 18:42:17 UTC
Post by Zdenek Kabelac
Which version of lvm2 and kernel is here in use ?
Have you been manipulating with thin-pool's metadata in any way ?
I'm using kernel 3.10.55 and lvm 2.02.106.
--
Vasiliy Tolstov,
e-mail: ***@selfip.ru
jabber: ***@selfip.ru
Zdenek Kabelac
2014-10-25 20:18:42 UTC
Post by Vasiliy Tolstov
Post by Zdenek Kabelac
I'm not sure how you could do that 'by a mistake' since LVM is printing
pretty BIG WARNING that any vgcfgrestore with thin should be done after big
thinking and requires even extra --force option.
But anyway - if you have /etc/lvm/archive - you should probably be able to
find the 'right' version of lvm2 metadata for your kernel metadata.
However 'normally' you could be off the sequence number only by one! so
I'm quite curious what you've been able to make such big difference.
If you could - package /etc/lvm/archive so I could get closer look where
the lvm2 has holes to allow such operations ?
Which version of lvm2 and kernel is here in use ?
Have you been manipulating with thin-pool's metadata in any way ?
Regards
Zdenek
I can't provide old archive data =(, Now i only have this error..
Also in lvm conf i have issue_discards =1
There is an 'internal' metadata archive then:

dd if=/dev/your_pv_volume of=/tmp/1st.megabyte bs=1M count=1

It will capture the first megabyte of your PV, where the metadata of
your volume group is embedded.

If you are not skilled enough, tar.gz this file and send it to me.
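
For anyone who wants to look inside such a dump themselves, here is a
minimal sketch. It assumes a grep that supports -a and -o (GNU and BSD grep
do) and that the pool's text metadata actually falls inside the dumped range.
lvm2 keeps the VG metadata as plain text, so the transaction_id entries can
be read straight out of the binary image. The function name
list_transaction_ids is made up for this illustration.

```shell
#!/bin/sh
# list_transaction_ids IMAGE   (hypothetical helper, not part of lvm2)
# Print every "transaction_id = N" entry found in a raw dump of the PV's
# metadata area, in the order the entries appear in the dump.
list_transaction_ids() {
    # -a: treat the binary dump as text; -o: print only the matching part
    grep -a -o 'transaction_id = [0-9]*' "$1"
}
```

Usage would be something like `list_transaction_ids /tmp/1st.megabyte`.
Several different values in the output suggest that more than one version of
the metadata is still present in the dumped area.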

Zdenek
Vasiliy Tolstov
2014-10-25 20:53:04 UTC
Post by Zdenek Kabelac
There is 'internal' metadata archive then -
dd if=/dev/your_pv_volume of=/tmp/1st.megabyte bs=1M count=1
It's will capture first megabyte of your PV where are embedded
metadata of your Volume group.
If you are not skilled enough - tar.gz and send this file to me.
I did the dd and sent it. When I broke the thin pool, I was trying to restore volume 2657.
But I didn't stop the lvm thin pool =(.
--
Vasiliy Tolstov,
e-mail: ***@selfip.ru
jabber: ***@selfip.ru
Zdenek Kabelac
2014-10-25 22:47:51 UTC
Post by Vasiliy Tolstov
Post by Zdenek Kabelac
There is 'internal' metadata archive then -
dd if=/dev/your_pv_volume of=/tmp/1st.megabyte bs=1M count=1
It's will capture first megabyte of your PV where are embedded
metadata of your Volume group.
If you are not skilled enough - tar.gz and send this file to me.
I'm do dd and send it. While i'm break thin pool i'm try to restore volume 2657.
But i don't stop lvm thin pool =(.
From the metadata, something bad was going on:

Fri Oct 24 17:03:04 2014

transaction_id = 120 - create = "3695"

And suddenly on Fri Oct 24 18:07:23 2014
the pool is back on an older transaction_id:

transaction_id = 114


Is that the time of your vgcfgrestore?

I'm attaching the metadata which you likely should put back to get in sync
with your kernel metadata (assuming you have not modified it in any way).

Zdenek
Vasiliy Tolstov
2014-10-26 19:46:45 UTC
Post by Zdenek Kabelac
Fri Oct 24 17:03:04 2014
transaction_id = 120 - create = "3695"
And suddenly on Fri Oct 24 18:07:23 2014
pool is back on older transaction_id
transaction_id = 114
Is that the time of your vgcfgrestore?
I'm attaching those metadata which you likely should put back to get in sync
with your kernel metadata (assuming you have not modified those in any way)
Hm, yes, I missed that one volume was created in this time, thanks. As I
understand it, the transaction id needs to be changed? Or something else?
But I have an error:

lvchange -ay vg1/2735
Check of thin pool vg1/tp1 failed (status:1). Manual repair required
(thin_dump --repair /dev/mapper/vg1-tp1_tmeta)!
--
Vasiliy Tolstov,
e-mail: ***@selfip.ru
jabber: ***@selfip.ru
Zdenek Kabelac
2014-10-27 09:15:47 UTC
Post by Vasiliy Tolstov
Post by Zdenek Kabelac
Fri Oct 24 17:03:04 2014
transaction_id = 120 - create = "3695"
And suddenly on Fri Oct 24 18:07:23 2014
pool is back on older transaction_id
transaction_id = 114
Is that the time of your vgcfgrestore?
I'm attaching those metadata which you likely should put back to get in sync
with your kernel metadata (assuming you have not modified those in any way)
Hm yes, i miss that one vg created in this time, thanks. As i
understand transaction id needs to be changed? Or something other?
lvchange -ay vg1/2735
Check of thin pool vg1/tp1 failed (status:1). Manual repair required
(thin_dump --repair /dev/mapper/vg1-tp1_tmeta)!
If you had the latest lvm2 tools, you could try:

lvconvert --repair vg/pool


With older tools, you need to go through these manual steps:


1. create a temporary small LV
# lvcreate -an -Zn -L10 --name temp vg

2. replace the pool's metadata volume with this temp LV
# lvconvert --thinpool vg/pool --poolmetadata temp
(say 'y' to swap)

3. activate & repair the metadata from the 'temp' volume - you will likely
need another volume in which to store the repaired metadata -
so create:
# lvcreate -Lat_least_as_big_as_temp --name repaired vg
# lvchange -ay vg/temp
# thin_repair -i /dev/vg/temp -o /dev/vg/repaired

if everything went fine - visually compare the 'transaction_id' of the
repaired metadata (thin_dump /dev/vg/repaired)

4. swap the deactivated repaired volume back into your thin-pool
# lvchange -an vg/repaired
# lvconvert --thinpool vg/pool --poolmetadata repaired

try to activate the pool - if it doesn't work, report more problems.
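
To keep the order of the manual steps straight, here is a small sketch that
only prints the command sequence for a given VG and pool; it does not run
anything. The function name print_repair_plan is made up for this
illustration, and the size of the 'repaired' LV is an argument you have to
supply yourself (at least as big as 'temp' after the swap). The -o in front
of the thin_repair output device follows thin_repair's -i/-o usage.

```shell
#!/bin/sh
# print_repair_plan VG POOL SIZE   (hypothetical helper, prints only)
# Emit the manual metadata-swap sequence for review; nothing is executed.
print_repair_plan() {
    vg=$1
    pool=$2
    size=$3    # size for the 'repaired' LV, chosen by the user
    cat <<EOF
lvcreate -an -Zn -L10 --name temp $vg
lvconvert --thinpool $vg/$pool --poolmetadata temp
lvcreate -L$size --name repaired $vg
lvchange -ay $vg/temp
thin_repair -i /dev/$vg/temp -o /dev/$vg/repaired
thin_dump /dev/$vg/repaired
lvchange -an $vg/repaired
lvconvert --thinpool $vg/$pool --poolmetadata repaired
EOF
}
```

For example, `print_repair_plan vg1 tp1 620M` prints the eight commands with
vg1/tp1 filled in, so they can be checked and then run by hand one at a time.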

Zdenek
Vasiliy Tolstov
2014-10-28 13:55:12 UTC
Post by Zdenek Kabelac
lvconvert --repair vg/pool
4. swap deactivated repaired volume back to your thin-pool
# lvchange -an vg/repaired
# lvconvert --thinpool vg/pool --poolmetadata repaired
try to activate pool - if it doesn't work report more problems.
I can't activate the volumes =(.
I ran:
lvconvert --repair vg1/tp1
lvchange -ay vg1/tp1_tmeta0

thin_dump --repair /dev/mapper/vg1-tp1_tmeta0
<superblock uuid="" time="27" transaction="120" data_block_size="128"
nr_data_blocks="7290880">
</superblock>

lvchange -an vg1/tp1_tmeta0

and finally

lvchange -ay vg1/3695
Check of thin pool vg1/tp1 failed (status:1). Manual repair required
(thin_dump --repair /dev/mapper/vg1-tp1_tmeta)!
--
Vasiliy Tolstov,
e-mail: ***@selfip.ru
jabber: ***@selfip.ru
Joe Thornber
2014-10-28 14:09:06 UTC
Post by Vasiliy Tolstov
thin_dump --repair /dev/mapper/vg1-tp1_tmeta0
thin_dump just spits out XML; it doesn't change the device it's
reading. So the process is either:

thin_dump --repair <dev> > metadata.xml
thin_restore -i metadata.xml -o <dev>

or you can use the thin_repair tool which does both of these
processes.
Post by Vasiliy Tolstov
<superblock uuid="" time="27" transaction="120" data_block_size="128"
nr_data_blocks="7290880">
</superblock>
This is worrying, you seem to have no volumes in your pool?
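
A quick way to check a dump for volumes before restoring it, as a minimal
sketch: it assumes the thin_dump output was saved to a file and that each
thin volume appears in it as a <device ...> element inside <superblock>.
The helper names count_thin_devices and superblock_transaction are made up
for this illustration.

```shell
#!/bin/sh
# count_thin_devices FILE   (hypothetical helper)
# Count the <device ...> elements in a saved thin_dump XML file.
count_thin_devices() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    grep -c '<device ' "$1" || true
}

# superblock_transaction FILE   (hypothetical helper)
# Print the transaction attribute of the <superblock> element.
superblock_transaction() {
    sed -n 's/.*<superblock[^>]* transaction="\([0-9]*\)".*/\1/p' "$1"
}
```

After `thin_dump --repair <dev> > metadata.xml`, running
`count_thin_devices metadata.xml` on output like the one quoted above would
print 0, which is a sign not to feed that file to thin_restore.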
Vasiliy Tolstov
2014-10-28 14:29:58 UTC
Post by Joe Thornber
thin_dump spits just spits out xml, it doesn't change the device it's
thin_dump --repair <dev> > metadata.xml
thin_restore -i metadata.xml -o <dev>
As I understand it, if thin_dump --repair doesn't display any volumes, I can't
do a restore...
Post by Joe Thornber
or you can use the thin_repair tool which does both of these
processes.
Post by Vasiliy Tolstov
<superblock uuid="" time="27" transaction="120" data_block_size="128"
nr_data_blocks="7290880">
</superblock>
This is worrying, you seem to have no volumes in your pool?
No, as I emailed before, I have many volumes:
lvs vg1
  LV         VG  Attr       LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  2679_751   vg1 Vwi---tz-k  20.00g tp1
  2735       vg1 Vwi---tz--  20.00g tp1
  2749       vg1 Vwi---tz--  20.00g tp1
  2799       vg1 Vwi---tz--  20.00g tp1
  2937_785   vg1 Vwi---tz-k 160.00g tp1
  3119       vg1 Vwi---tz--  20.00g tp1
  3435       vg1 Vwi---tz--  20.00g tp1
  3471       vg1 Vwi---tz--  20.00g tp1
  3547       vg1 Vwi---tz-- 160.00g tp1
  3645       vg1 Vwi---tz--  20.00g tp1
  3647       vg1 Vwi---tz--  20.00g tp1
  3695       vg1 Vwi---tz--  20.00g tp1
  tp1        vg1 twi---tz-- 445.00g
  tp1_tmeta0 vg1 -wi------- 620.00m
--
Vasiliy Tolstov,
e-mail: ***@selfip.ru
jabber: ***@selfip.ru
Anatoly Pugachev
2014-10-27 06:58:30 UTC
Post by Zdenek Kabelac
Post by Vasiliy Tolstov
Post by Zdenek Kabelac
There is 'internal' metadata archive then -
dd if=/dev/your_pv_volume of=/tmp/1st.megabyte bs=1M count=1
It's will capture first megabyte of your PV where are embedded
metadata of your Volume group.
If you are not skilled enough - tar.gz and send this file to me.
I'm do dd and send it. While i'm break thin pool i'm try to restore volume 2657.
But i don't stop lvm thin pool =(.
Fri Oct 24 17:03:04 2014
transaction_id = 120 - create = "3695"
And suddenly on Fri Oct 24 18:07:23 2014
pool is back on older transaction_id
transaction_id = 114
Is that the time of your vgcfgrestore?
I'm attaching those metadata which you likely should put back to get in sync
with your kernel metadata (assuming you have not modified those in any way)
Zdenek,

can you please describe (possibly in detail) what you did with the
tar.gz sent to you, so everyone will know what to do next time?

Thanks a lot!
Zdenek Kabelac
2014-10-27 09:05:44 UTC
Post by Anatoly Pugachev
Post by Zdenek Kabelac
Post by Vasiliy Tolstov
Post by Zdenek Kabelac
There is 'internal' metadata archive then -
dd if=/dev/your_pv_volume of=/tmp/1st.megabyte bs=1M count=1
It's will capture first megabyte of your PV where are embedded
metadata of your Volume group.
If you are not skilled enough - tar.gz and send this file to me.
I'm do dd and send it. While i'm break thin pool i'm try to restore volume
2657.
Post by Zdenek Kabelac
Post by Vasiliy Tolstov
But i don't stop lvm thin pool =(.
Fri Oct 24 17:03:04 2014
transaction_id = 120 - create = "3695"
And suddenly on Fri Oct 24 18:07:23 2014
pool is back on older transaction_id
transaction_id = 114
Is that the time of your vgcfgrestore?
I'm attaching those metadata which you likely should put back to get in sync
with your kernel metadata (assuming you have not modified those in any way)
Zdenek,
can you please describe (possibly in details) what have you done with tar.gz
sent to you, so everyone would know what to do next time?
Thanks a lot!
Any Google query on lvm2 metadata recovery will disclose this - I've picked
this one at random:
http://microdevsys.com/wp/linux-lvm-recovering-a-lost-volume/


In this case, however, the data provided by the user was just too short, since
he created 300M of metadata space - so I asked him to resend 4M to my email. So
you will not find exactly the info above in the initial tar.gz file (there
are just older versions) - but if you open the file in the 'vi' editor, you
will see the metadata yourself.

Zdenek