Discussion:
[linux-lvm] "write failed.. No space left", "Failed to write VG", and "Failed to write a MDA"
David Teigland
2018-06-04 16:35:17 UTC
format-text.c:331 Reading mda header sector from /dev/sdb at 4096
format-text.c:678 Writing metadata for VG foo to /dev/sdb at 7168 len 1525 (wrap 0)
format-text.c:331 Reading mda header sector from /dev/sdb at 999665172480
format-text.c:678 Writing metadata for VG foo to /dev/sdb at 999665175552 len 1525 (wrap 0)
format-text.c:331 Reading mda header sector from /dev/sdg at 4096
format-text.c:678 Writing metadata for VG foo to /dev/sdg at 7168 len 1525 (wrap 0)
format-text.c:331 Reading mda header sector from /dev/sdg at 999665172480
format-text.c:678 Writing metadata for VG foo to /dev/sdg at 999665175552 len 1525 (wrap 0)
To illustrate what you should see when the metadata wraps, using the default
metadata area size:

The initial metadata written by vgcreate:

Reading mda header sector from /dev/sdb at 4096
Writing metadata for VG foo to /dev/sdb at 4608 len 931 (wrap 0)
Reading mda header sector from /dev/sdb at 999665172480
Writing metadata for VG foo to /dev/sdb at 999665172992 len 931 (wrap 0)
Reading mda header sector from /dev/sdg at 4096
Writing metadata for VG foo to /dev/sdg at 4608 len 931 (wrap 0)
Reading mda header sector from /dev/sdg at 999665172480
Writing metadata for VG foo to /dev/sdg at 999665172992 len 931 (wrap 0)

When writing the metadata wraps, it does not go beyond the end of the device
and it returns to the original offset used above:

Reading mda header sector from /dev/sdb at 4096
Writing metadata for VG foo to /dev/sdb at 1043968 len 4608 (wrap 19895)
Writing metadata for VG foo to /dev/sdb at 4608 len 19895 (wrapped)
Reading mda header sector from /dev/sdb at 999665172480
Writing metadata for VG foo to /dev/sdb at 999666212352 len 8704 (wrap 15799)
Writing metadata for VG foo to /dev/sdb at 999665172992 len 15799 (wrapped)
Reading mda header sector from /dev/sdg at 4096
Writing metadata for VG foo to /dev/sdg at 1043968 len 4608 (wrap 19895)
Writing metadata for VG foo to /dev/sdg at 4608 len 19895 (wrapped)
Reading mda header sector from /dev/sdg at 999665172480
Writing metadata for VG foo to /dev/sdg at 999666212352 len 8704 (wrap 15799)
Writing metadata for VG foo to /dev/sdg at 999665172992 len 15799 (wrapped)
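The wrap arithmetic in these log lines can be sketched as follows. This is a hypothetical reconstruction (the function name and return shape are mine, not lvm2's): the usable ring starts one 512-byte sector past the mda start (past the mda header) and ends at mda start + mda size, and a write that reaches the end is split into two segments, the second restarting just after the header.

```python
def split_metadata_write(mda_start, mda_size, write_offset, meta_len, sector=512):
    """Split a circular metadata write into its (offset, length) segments.

    Hypothetical sketch of the wrap behaviour visible in the
    format-text.c debug output; not lvm2's actual code.
    """
    area_end = mda_start + mda_size
    first = min(meta_len, area_end - write_offset)   # bytes before the area end
    wrap = meta_len - first                          # bytes that wrap around
    segments = [(write_offset, first)]
    if wrap:
        # The wrapped portion restarts one sector past the mda header.
        segments.append((mda_start + sector, wrap))
    return segments

# Values from the debug log above: mda0 at 4096, size 1044480 (~1 MiB default)
print(split_metadata_write(4096, 1044480, 1043968, 24503))
# matches "at 1043968 len 4608 (wrap 19895)" then "at 4608 len 19895 (wrapped)"
```

Plugging in the second metadata area (start 999665172480, size 1048576) reproduces the /dev/sdb and /dev/sdg lines the same way.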

The write that goes to the end of the metadata area starts at 999666212352 and has length 8704. start+length is 999666221056, which matches the device size:

# pvs -o name,dev_size --units b /dev/sdb /dev/sdg
PV DevSize
/dev/sdb 999666221056B
/dev/sdg 999666221056B
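As a quick sanity check of that arithmetic, with the values copied from the log and the pvs output above:

```python
dev_size = 999666221056                      # pvs -o dev_size, both PVs
write_start, write_len = 999666212352, 8704  # pre-wrap segment from the log

# The pre-wrap segment ends exactly at the device (and metadata area) boundary,
# so nothing is written past the end of the disk.
assert write_start + write_len == dev_size
print("boundary ok")
```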
Jim Haddad
2018-06-04 18:26:29 UTC
Writing past the end of the disk seems to be fixed in git master.
Hoping I understood the situation well enough that it wouldn't cause problems.
You'll notice some ongoing changes with releases and branches. I'd
suggest using 2.02.176 and 2.02.178 (skip 2.02.177). If you want to use a
git branch directly, you may want to look at 2018-06-01-stable since the
master branch may be unstable for a while.
Thanks for your replies. Will do. I did hit some compilation errors on master, but they came after tools/lvm was built, so I was able to use that binary.
With git master, I ran the same command. It no longer says exactly
how much and where it's writing, just the header address.
You should see more writing debug information than you included, like
this...
You're absolutely right, I didn't look high enough up.

Do you think I'm right that there are no lasting effects from having run into the problem? Meaning, if I run 2.02.176/178/2018-06-01-stable, I'm all set and don't need to copy all the data off the disk and redo it?
David Teigland
2018-06-04 18:52:33 UTC
Post by Jim Haddad
Do you think I'm right that there are no lasting effects from having run into the problem? Meaning, if I run 2.02.176/178/2018-06-01-stable, I'm all set and don't need to copy all the data off the disk and redo it?
It's probably ok. With one of the good versions above, run 'vgs -vvvv'
and check that the offsets look good. Then run a pointless command to
write the metadata with -vvvv and check that the vg writes are happening
correctly. "vgchange -vvvv --addtag foo <vgname>" will write a new
version of the metadata and won't have any effect on LVs.
Inbox
2018-06-03 20:18:06 UTC
Kernel 4.16.8, lvm 2.02.177.
I'm aware I can't let my thin volume get full. I'm actually about to
delete a lot of things.
I don't understand why it gave the "No space left on device" errors on sdh3, and then on sdh3, sdg3, and sdf3 again. sdf3 has 366G left in its thin pool, and I asked to create a virtual 200G volume within it.
I don't understand why it failed to write VG, or an MDA of VG.
I'm mostly concerned about whether anything is corrupted now, or whether I can ignore this apart from the failed volume creation.
disk1thin is on sdh3
disk2thin is on sdg3
disk3thin is on sdf3
disk4thin is on sde3
# lvs
...
disk1thin lvm twi-aot--- <4.53t 84.13 76.33
disk2thin lvm twi-aot--- <4.53t 85.98 78.09
disk3thin lvm twi-aot--- <4.53t 92.10 83.47
disk4thin lvm twi-aot---  4.53t 80.99 36.91
...
# lvcreate -V200G lvm/disk3thin -n test3
WARNING: Sum of all thin volume sizes (21.22 TiB) exceeds the size of thin pools and the size of whole volume group (<18.17 TiB).
WARNING: You have not turned on protection against thin pools running out of space.
WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
/dev/sdh3: write failed after 24064 of 24576 at 4993488961536: No space left on device
Failed to write VG lvm.
Failed to write VG lvm.
Manual intervention may be required to remove abandoned LV(s) before retrying.
# lvremove lvm/test3
/dev/sdh3: write failed after 24064 of 24576 at 4993488961536: No space left on device
WARNING: Failed to write an MDA of VG lvm.
/dev/sdg3: write failed after 24064 of 24576 at 4993488961536: No space left on device
WARNING: Failed to write an MDA of VG lvm.
/dev/sdf3: write failed after 24064 of 24576 at 4993488961536: No space left on device
WARNING: Failed to write an MDA of VG lvm.
Logical volume "test3" successfully removed
# lvs --- shows test3 is gone
# pvs
PV VG Fmt Attr PSize PFree
/dev/sde3 lvm lvm2 a-- 4.54t <10.70g
/dev/sdf3 lvm lvm2 a-- 4.54t 0
/dev/sdg3 lvm lvm2 a-- 4.54t 0
/dev/sdh3 lvm lvm2 a-- 4.54t 0
# vgs
VG #PV #LV #SN Attr VSize VFree
lvm 4 51 0 wz--n- <18.17t <10.70g
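The autoextend warning in the transcript above refers to settings in lvm.conf. A sketch of what enabling it might look like (the threshold and percent values here are illustrative, and dmeventd monitoring must be active for autoextension to trigger):

```
activation {
    # Autoextend a thin pool once it reaches 80% full,
    # growing it by 20% of its size each time.
    thin_pool_autoextend_threshold = 80
    thin_pool_autoextend_percent = 20
}
```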
Patrick Mitchell
2018-06-04 06:54:21 UTC
On Sun, Jun 3, 2018 at 7:57 PM, Inbox <***@gmail.com> wrote:
...
Ordinarily, I don't think this would be fatal. If lvm works within
the space it has, this just means not as many old copies of metadata
will be kept. But, the pvcreate bug left room for only 48,640 bytes of metadata text in mda1 vs 966,656 in mda0. As my "lvm = {" text is 20,624 bytes, there's only room for 2 copies of it in mda1.
It must be this combination of a too-small text area in mda1 and a large "lvm = {" text that keeps LVM from working within such a confined space and makes it try to write past the end of the disk.
* disk size is correct (pv_header.device_size 0x48aa3231e00 is 4993488985600 bytes, matching what fdisk reports)
* mda0 is located at 4096 bytes (pv_header.disk_areas.mda0.offset
0x1000 is 4096 bytes)
* mda0 is size 1044480 bytes (pv_header.disk_areas.mda0.size 0xff000)
* mda1 is located at 4993488781312 bytes which is 204288 from last
disk byte (pv_header.disk_areas.mda1.offset 0x48aa3200000)
* mda1 is size 204288 bytes (pv_header.disk_areas.mda1.size 0x31e00)
* the mda checksums are now different (0xf0662726 vs 0xb46ba552)
So, it made mda1 only ~19.5% the size of mda0.
mda0 has room for metadata text of 966656 bytes. (starts at 0x14000, mda0 goes from 0x1000 for 0xff000 bytes, so to 0x100000 = 0xec000 available = 966656)
mda1 only has room for metadata text of 48640 bytes. (starts at 0x48aa3226000, mda1 goes from 0x48aa3200000 for 0x31e00 bytes, so to 0x48aa3231e00 = 0xbe00 available = 48640)
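That hex arithmetic can be checked mechanically. A small sketch (the function name is mine) computing the text space left between the start of the metadata text and the end of each mda, using the offsets from the header dump:

```python
def text_space(mda_offset, mda_size, text_start):
    # Bytes available between the metadata text start and the end of
    # the metadata area (offsets as reported by the header dump / pvck).
    return mda_offset + mda_size - text_start

print(text_space(0x1000, 0xFF000, 0x14000))                # mda0 -> 966656
print(text_space(0x48AA3200000, 0x31E00, 0x48AA3226000))   # mda1 -> 48640
```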
...

Correction here.

I thought the python script's addresses for metadata.value were the starting position of the metadata text. I was wrong about that. I see now those point to mda_header.start + mda_header.raw_locns[0].offset. I'm guessing raw_locns[0] must be the most recent copy.

So, the metadata text area isn't shrunk down as badly as I was thinking. mda1 is still smaller:

# pvck -t /dev/sdh3
TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
Found label on /dev/sdh3, sector 1, type=LVM2 001
Found text metadata area: offset=4096, size=1044480
Found text metadata area: offset=4993488781312, size=204288

But, there is more room in mda1 than just 2 copies of the metadata text. It must have been the exact combination of the mda1 size, the metadata text size, the rounding, and the 2.02.177 algorithm that made it try to write off the end of the disk. LVM isn't stuck trying to fit just 2 copies in the area.
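For what it's worth, the numbers in the earlier "write failed" message line up exactly with the device size from the header dump, which supports this reading:

```python
dev_size = 4993488985600   # pv_header.device_size (0x48aa3231e00)
write_at = 4993488961536   # offset in "write failed after 24064 of 24576"
asked, wrote = 24576, 24064

# The partial write stopped exactly at the device end; the remaining
# 512 bytes (one sector) would have landed past it.
assert write_at + wrote == dev_size
print(asked - wrote)   # -> 512
```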
