Yes, I have lots of data to share; I wanted to open at a high level first.
This is all happening inside a single VM. The archive is available, and I
will post the files shortly. No lvmetad. No errors that I can tell (at least
not on the console or in syslog).
***@VA1CTLT-SRN2-03:/etc/lvm/archive# grep seqno test_dvol-13-vg_00*
test_dvol-13-vg_00261-1410850844.vg: seqno = 0 <---- before vgcreate
test_dvol-13-vg_00262-1188507802.vg: seqno = 1 <-- before lvcreate 1
test_dvol-13-vg_00263-1818746321.vg: seqno = 2 <---- before lvcreate 2
test_dvol-13-vg_00264-1122545952.vg: seqno = 3 <--- before lvcreate 3
test_dvol-13-vg_00265-1497145254.vg: seqno = 4 <---- before lvcreate 4
test_dvol-13-vg_00266-1300493675.vg: seqno = 5 <--- before lvs
test_dvol-13-vg_00267-490193445.vg: seqno = 4 <----- disabled device cache, lvs
test_dvol-13-vg_00268-2051497792.vg: seqno = 4 <----- disabled device cache, lvs
test_dvol-13-vg_00269-370016695.vg: seqno = 5 <---- enabled device cache, lvs
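In case it helps anyone repeat the check over a longer archive history, here is a minimal sketch of how the backwards step in seqno could be flagged automatically. It assumes the input is `grep seqno` output in the same shape as above; the function names `parse_seqnos` and `find_regressions` are made up for illustration and are not part of LVM.

```python
import re

# Matches one line of `grep seqno` output over the archive directory:
# "<file>.vg: seqno = <n>", optionally followed by a hand-written note.
LINE_RE = re.compile(r"^(?P<file>\S+\.vg):\s*seqno\s*=\s*(?P<seqno>\d+)")


def parse_seqnos(grep_output):
    """Return [(archive_file, seqno), ...] in the order the lines appear."""
    found = []
    for line in grep_output.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            found.append((m.group("file"), int(m.group("seqno"))))
    return found


def find_regressions(entries):
    """Return (previous, current) pairs where seqno went backwards."""
    return [(prev, cur)
            for prev, cur in zip(entries, entries[1:])
            if cur[1] < prev[1]]


# Three of the lines from the listing above, without the annotations.
sample = """\
test_dvol-13-vg_00265-1497145254.vg: seqno = 4
test_dvol-13-vg_00266-1300493675.vg: seqno = 5
test_dvol-13-vg_00267-490193445.vg: seqno = 4
"""

for prev, cur in find_regressions(parse_seqnos(sample)):
    print(f"{prev[0]} (seqno {prev[1]}) -> {cur[0]} (seqno {cur[1]})")
```

Archive files are numbered in creation order, so sorted grep output should already be chronological; if the input is not sorted, the pairs reported here would not mean anything.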
The contents of the metadata area seem to be the same (both contain seqno 5):
dd if=/dev/sbd13 bs=1M count=1 skip=1 of=sbd13.nocache
dd if=/dev/sbd13 bs=1M count=1 skip=1 of=sbd13.cache
cmp sbd13.nocache sbd13.cache
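cmp only says the two dumps are byte-identical; it does not say how many copies of the metadata text are in the dump or which seqno values they carry. A rough sketch for listing them, assuming the metadata is stored as plain text containing `seqno = N` (as in the archive files) and that the dump fits in memory. The helper names are hypothetical, and this does not decode any on-disk header, it only searches for the text.

```python
import re

# Byte pattern for the "seqno = N" line found in LVM text metadata.
SEQNO_RE = re.compile(rb"seqno\s*=\s*(\d+)")


def seqnos_in_image(data):
    """Return [(byte_offset, seqno), ...] for every 'seqno = N' in a raw dump."""
    return [(m.start(), int(m.group(1))) for m in SEQNO_RE.finditer(data)]


def highest_seqno(data):
    """Return the largest seqno found in the dump, or None if there is none."""
    return max((seq for _, seq in seqnos_in_image(data)), default=None)


def report(path):
    """Print every seqno occurrence in the dump file at `path`."""
    with open(path, "rb") as f:
        data = f.read()
    for offset, seq in seqnos_in_image(data):
        print(f"{path}: offset {offset}: seqno = {seq}")
```

Calling `report("sbd13.nocache")` on the dump produced by the dd command above would show whether an older copy (for example seqno 4) is still present next to the seqno 5 copy, and at which offsets within the dump.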
I tracked down these sectors by running strace on
pvcreate/vgcreate/lvcreate. As far as I can tell, all the sectors involved
are being written correctly.
Random facts:
1. Device mapper still correctly lists the logical volume that is missing
from lvs.
2. Kernel 3.13.0-44-generic, Ubuntu 14.04.
3. LVM version: 2.02.98(2) (2012-10-15); Library version: 1.02.77
(2012-10-15); Driver version: 4.27.0
Random suspicious snippet generated by lvscan -vvv:
/dev/mapper/sbd13p1: lvm2 label detected at sector 1
lvmcache: /dev/mapper/sbd13p1: now in VG #orphans_lvm2 (#orphans_lvm2) with 1 mdas
/dev/mapper/sbd13p1: Found metadata at 8704 size 1749 (in area at 4096 size 1044480) for test_dvol-13-vg (DFvQDG-nYVS-QQlT-Uv35-aPr4-2pY0-zMQ0dr)
lvmcache: /dev/mapper/sbd13p1: now in VG test_dvol-13-vg with 1 mdas
lvmcache: /dev/mapper/sbd13p1: setting test_dvol-13-vg VGID to DFvQDGnYVSQQlTUv35aPr42pY0zMQ0dr
lvmcache: /dev/mapper/sbd13p1: VG test_dvol-13-vg: Set creation host to VA1CTLT-SRN2-03.
Allocated VG test_dvol-13-vg at 0x257bc00.
Using cached label for /dev/mapper/sbd13p1
Read test_dvol-13-vg metadata (4) from /dev/mapper/sbd13p1 at 8704 size 1749
/dev/mapper/sbd13p1 0: 0 19: VM-test_dvol-13-0-hard-drive-0(0:0)
/dev/mapper/sbd13p1 1: 19 19: VM-test_dvol-13-0-hard-drive-1(0:0)
/dev/mapper/sbd13p1 2: 38 19: VM-test_dvol-13-1-hard-drive-0(0:0)
/dev/mapper/sbd13p1 3: 57 42: NULL(0:0) <---- missing logical volume
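To compare the two runs without reading the verbose output by eye, the segment lines can be pulled out with a small parser. This is only a sketch: it assumes each line reads as "PV, segment index, start extent, length, owning LV (or NULL)", which is my reading of the output and not something I have confirmed in the LVM source. The names `parse_segments` and `unallocated` are invented for this example.

```python
import re

# Assumed layout: "<pv> <index>: <start> <length>: <owner>(<le>:<area>)"
SEG_RE = re.compile(
    r"^(?P<pv>\S+)\s+(?P<idx>\d+):\s+(?P<start>\d+)\s+(?P<length>\d+):\s+"
    r"(?P<owner>[^\s(]+)\((?P<le>\d+):(?P<area>\d+)\)"
)


def parse_segments(text):
    """Return one dict per segment line; owner is None where the log says NULL."""
    segments = []
    for line in text.splitlines():
        m = SEG_RE.match(line.strip())
        if m:
            owner = m.group("owner")
            segments.append({
                "pv": m.group("pv"),
                "start": int(m.group("start")),
                "length": int(m.group("length")),
                "owner": None if owner == "NULL" else owner,
            })
    return segments


def unallocated(segments):
    """Return (start, length) for every segment with no owning LV."""
    return [(s["start"], s["length"]) for s in segments if s["owner"] is None]


# The four segment lines from the lvscan -vvv snippet above.
log = """\
/dev/mapper/sbd13p1 0: 0 19: VM-test_dvol-13-0-hard-drive-0(0:0)
/dev/mapper/sbd13p1 1: 19 19: VM-test_dvol-13-0-hard-drive-1(0:0)
/dev/mapper/sbd13p1 2: 38 19: VM-test_dvol-13-1-hard-drive-0(0:0)
/dev/mapper/sbd13p1 3: 57 42: NULL(0:0)
"""

print(unallocated(parse_segments(log)))
```

Running the same parser over the output of the run that finds all four volumes and diffing the two results would show which range changes owner between the two cases.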
I don't understand how this is possible if the metadata at that offset (8704)
is identical in both cases.
Attached are two verbose straces of vgdisplay, one of which discovers 3
logical volumes and one of which discovers 4.
I am looking for insight into the disk contents that are necessary for this
discovery. Thank you very much.
Aaron
Post by Aaron Young
Hello, I'm deep into debugging an issue we have with a disk driver of ours and
LVM.
create vg -> seqno 1
create lv1 -> seqno 2
create lv2 -> seqno 3
create lv3 -> seqno 4
create lv4 -> seqno 5
<clear our device cache> (note, this generates no IO)
vgdisplay: seqno = 4, lv4 is missing
* This happens only after dozens to hundreds of iterations. Most of the time
it is fine.
I dd all the metadata blocks off of the pv, yep, seqno5 is on disk metadata
area perfectly fine. But the system believes 4 is the current version.
Shouldn't the system be using the highest value? Or is it stored somewhere?
What mechanism is responsible for changing the seqno? And where does it change
it? (Not the metadata contents, just the number)
Hi
Your email is quite 'mystic' - I'd need lots of crystal balls to see your
surrounding conditions.
1.) Is this a 'clustered' environment or a 'single' host setup?
2.) Do you have 'archive' backup enabled - can you check what the last
operations in the history were before the problem happens?
3.) Are you using 'lvmetad'? (if so, try use_lvmetad=0)
4.) Kernel version, lvm2 version?
5.) Was there any lvm2 command error?
(as vgdisplay may just do a backup of the most recent metadata in case they
are missing after some command failure)
Zdenek