[linux-lvm] lvm Bug? - bad reaction to snapshot creation
Leeman Strout
2014-06-06 23:26:26 UTC
Creating a snapshot throws an error "Attempted to decrement suspended
device counter below zero." but succeeds. Removing the snapshot fails,
attempting a 2nd time succeeds. Somewhere in this process the original
LV gets locked and the system needs to be restarted to unlock it. As
explained below this happens intermittently but regularly.


Any additional info please let me know directly, I am not a subscriber,
Leeman

----
log file: http://www.enlj.com/lvm.txt
lvm config: http://www.enlj.com/lvmconfig.txt

Arch Linux, lvm2 2.02.106-2

line 1: lvcreate -L1G -s -n srvrootsnap /dev/ssd.vg/server-root
Attempted to decrement suspended device counter below zero.
Logical volume "srvrootsnap" created
line 148+: why? globalfilter = [ "a|/dev/md|", "r|.*|" ]
line 921: after this point I do udevadm settle
no output from udevadm settle
line 923: lvremove -f /dev/ssd.vg/srvrootsnap - fails
Unable to deactivate open ssd.vg-srvrootsnap-cow (253:3)
Failed to activate srvrootsnap.
Releasing activation in critical section.
libdevmapper exiting with 1 device(s) still suspended.
line 1567: lvremove -f /dev/ssd.vg/srvrootsnap - again, works this time
Logical volume "srvrootsnap" successfully removed

However, /dev/ssd.vg/server-root is locked up, the VM seizes as no data
can be written to the volume. I have to restart the entire system to
unlock the volume.

This doesn't happen with every snapshot, but it does happen every time I run
a backup: 1 of the 7 LV snapshots created for the job triggers it, and not
always the same one.

hardware: Supermicro X9DR3-F,
onboard SATA controller:
00:1f.2 SATA controller: Intel Corporation C600/X79 series chipset
6-Port SATA AHCI Controller (rev 06)
/dev/md0 consists of 2 Seagate 600 240GB SSDs on that controller
/dev/ssd.vg consists of /dev/md0
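
For reference, the sequence described above (create the snapshot, run udevadm settle, then remove the snapshot, retrying once on failure) could be scripted roughly as follows. This is only a sketch under assumptions: the function names `run` and `snapshot_cycle` and the `DRY_RUN` switch are hypothetical, the backup step is left as a comment, and the retry merely mirrors the manual second lvremove. It does not address the locked origin LV.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: snapshot an origin LV, then remove the snapshot,
# retrying the removal once as was done by hand in the report.
snapshot_cycle() {
    origin=$1            # e.g. /dev/ssd.vg/server-root
    snap=$2              # e.g. srvrootsnap
    vg_dir=${origin%/*}  # e.g. /dev/ssd.vg

    run lvcreate -L1G -s -n "$snap" "$origin" || return 1

    # ... the backup job would read from "$vg_dir/$snap" here ...

    # Wait for pending udev events before touching the snapshot again.
    run udevadm settle

    attempt=1
    while [ "$attempt" -le 2 ]; do
        if run lvremove -f "$vg_dir/$snap"; then
            return 0
        fi
        attempt=$((attempt + 1))
        run udevadm settle
    done
    return 1
}
```

Calling it with DRY_RUN=1 set prints the three commands without touching any volume, which is a safe way to check the order before running it for real.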
Zdenek Kabelac
2014-06-09 09:50:00 UTC
Post by Leeman Strout
Creating a snapshot throws an error "Attempted to decrement suspended device
counter below zero." but succeeds. Removing the snapshot fails, attempting a
2nd time succeeds. Somewhere in this process the original LV gets locked and
the system needs to be restarted to unlock it. As explained below this
happens intermittently but regularly.
Any additional info please let me know directly, I am not a subscriber,
Hi


I'm unsure whether this relates to all of your problems (since I'm not sure how
well Arch Linux is in sync with udev rules & systemd versions).

At some point systemd added a new 'feature' that locks devices while
updating internal udev state. This locking ignores any udev rule flags and
opens internal lvm2 devices. For now it has been disabled again for 'dm'
devices, but you might have installed a version of systemd that still has the
'lock everything' feature in it?

Though this doesn't explain your 'counter below zero' error. That looks like
some incorrect udev rules are running in the field?
(Or maybe multiple systemd-udevd instances are running?)

Zdenek
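
One rough way to check the "multiple systemd-udevd" question is to count the udevd processes whose parent is PID 1, since worker processes forked by the daemon may carry the same name. This is only a heuristic sketch; the function name `count_udevd_daemons` is hypothetical, and it assumes the daemon is started directly by PID 1.

```shell
# Hypothetical helper: read "pid ppid comm" lines on stdin and count
# systemd-udevd processes whose parent is PID 1 (i.e. top-level daemons).
# Workers forked by the daemon have the daemon's PID as parent and are skipped.
count_udevd_daemons() {
    awk '$2 == 1 && $3 == "systemd-udevd" { n++ } END { print n + 0 }'
}

# Usage on a live system:
#   ps -eo pid,ppid,comm | count_udevd_daemons
```

A result greater than 1 would suggest more than one daemon instance is running; a result of 1 does not rule out a problem in the udev rules themselves.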
Marian Csontos
2014-06-09 10:08:58 UTC
Post by Zdenek Kabelac
Creating a snapshot throws an error "Attempted to decrement suspended device
counter below zero." but succeeds. Removing the snapshot fails, attempting a
2nd time succeeds. Somewhere in this process the original LV gets locked and
the system needs to be restarted to unlock it. As explained below this
happens intermittently but regularly.
Any additional info please let me know directly, I am not a subscriber,
Hi
Unsure if this relates to all your problem (since I'm not sure how arch linux
is in sync with udev rules & systemd version)
At certain moment systemd added a new 'feature' about locking devices while
updating internal udev state - this locking ignores any udev rule flags
and opens internal lvm2 devices - so while for now it's been again
disabled for 'dm' devices - you might have installed version of system
which has the 'lock everything' feature in?
Thought this doesn't explain your 'counter below zero' error - this looks like
some incorrect udev rules are running in the field ?
(Or maybe multiple systemd-udevd are running ?)
There is also a recent BZ against RHEL7:

https://bugzilla.redhat.com/show_bug.cgi?id=1105732

What are the devices used in the stack?

-- Martian
Leeman Strout
2014-06-09 14:55:29 UTC
Post by Marian Csontos
https://bugzilla.redhat.com/show_bug.cgi?id=1105732
What are the devices used in the stack?
It's all there in the original message,
Post by Leeman Strout
hardware: Supermicro X9DR3-F,
00:1f.2 SATA controller: Intel Corporation C600/X79 series chipset 6-Port SATA AHCI Controller (rev 06)
/dev/md0 consists of 2 Seagate 600 240GB SSDs on that controller
/dev/ssd.vg consists of /dev/md0
If that is not sufficient, what exactly are you looking for?


Thanks,
Leeman
Leeman Strout
2014-06-09 14:53:45 UTC
Post by Zdenek Kabelac
Hi
Unsure if this relates to all your problem (since I'm not sure how
arch linux is in sync with udev rules & systemd version)
At certain moment systemd added a new 'feature' about locking
devices while updating internal udev state - this locking ignores any
udev rule flags and opens internal lvm2 devices - so while for now
it's been again disabled for 'dm' devices - you might have installed
version of system which has the 'lock everything' feature in?
systemd 213-6 is what Arch reports. As far as I can tell from the
PKGBUILD, it's vanilla 213 plus 2 patches:
- backport fix for faily MACAddress matching (FS#40675)
- backport fix for fsck/udev mess (FS#40706)

Do you have links to discussion of this feature or bugs pertaining to it?
Post by Zdenek Kabelac
Thought this doesn't explain your 'counter below zero' error - this
looks like some incorrect udev rules are running in the field ? (Or
maybe multiple systemd-udevd are running ?)
After a clean restart of everything twice without doing snapshots, the
initial lvcreate does not show the 'counter below zero' error. And when I
take special care to make sure the snapshots are cleaned up before
attempting a new snapshot, I no longer get this initial decrement error on
lvcreate.


Thanks,
Leeman
Zdenek Kabelac
2014-06-10 07:58:03 UTC
Post by Zdenek Kabelac
Hi
Unsure if this relates to all your problem (since I'm not sure how
arch linux is in sync with udev rules & systemd version)
At certain moment systemd added a new 'feature' about locking
devices while updating internal udev state - this locking ignores any
udev rule flags and opens internal lvm2 devices - so while for now
it's been again disabled for 'dm' devices - you might have installed
version of system which has the 'lock everything' feature in?
systemd 213-6 is what Arch reports, as far as I can tell from the PKGBUILD
- backport fix for faily MACAddress matching (FS#40675)
- backport fix for fsck/udev mess (FS#40706)
This lvm2 commit should fix the problems between lvm2 & the latest systemd:

e918a1b5a94f270186dca59156354acd2a596494

and this is the systemd commit that introduced the problem:

3d06f4183470d42361303086ed9dedd29c0ffc1b

I'm unsure what you have in your Arch build.
Post by Zdenek Kabelac
Thought this doesn't explain your 'counter below zero' error - this
looks like some incorrect udev rules are running in the field ? (Or
maybe multiple systemd-udevd are running ?)
After a clean restart of everything twice without doing snapshots, the initial
lvcreate does not have the 'counter below zero' error. And with special
attention to make sure the snapshots are cleaned up prior to attempting a new
snapshot I am not getting this initial decrement on lvcreate.
So can we consider this 'counter' case solved?

Zdenek
Leeman Strout
2014-06-10 13:45:14 UTC
Post by Zdenek Kabelac
Post by Zdenek Kabelac
Hi
Unsure if this relates to all your problem (since I'm not sure how
arch linux is in sync with udev rules & systemd version)
At certain moment systemd added a new 'feature' about locking
devices while updating internal udev state - this locking ignores any
udev rule flags and opens internal lvm2 devices - so while for now
it's been again disabled for 'dm' devices - you might have installed
version of system which has the 'lock everything' feature in?
systemd 213-6 is what Arch reports, as far as I can tell from the PKGBUILD
- backport fix for faily MACAddress matching (FS#40675)
- backport fix for fsck/udev mess (FS#40706)
e918a1b5a94f270186dca59156354acd2a596494
3d06f4183470d42361303086ed9dedd29c0ffc1b
Unsure what do you have in your arch build.
Post by Zdenek Kabelac
Thought this doesn't explain your 'counter below zero' error - this
looks like some incorrect udev rules are running in the field ? (Or
maybe multiple systemd-udevd are running ?)
After a clean restart of everything twice without doing snapshots, the initial
lvcreate does not have the 'counter below zero' error. And with special
attention to make sure the snapshots are cleaned up prior to attempting a new
snapshot I am not getting this initial decrement on lvcreate.
So could we consider this 'counter' case is solved ?
Zdenek
I am testing the revision the Arch dev pushed with that change:
https://projects.archlinux.org/svntogit/packages.git/diff/trunk/0001-udev-exclude-device-mapper-from-block-device-ownersh.patch?h=packages/systemd&id=331c26905843338b30b4cd240c64953501cd879c

It does not resolve the issue.


Thanks,
Leeman
Leeman Strout
2014-06-10 15:22:37 UTC
Post by Zdenek Kabelac
Post by Zdenek Kabelac
Hi
Unsure if this relates to all your problem (since I'm not sure how
arch linux is in sync with udev rules & systemd version)
At certain moment systemd added a new 'feature' about locking
devices while updating internal udev state - this locking ignores any
udev rule flags and opens internal lvm2 devices - so while for now
it's been again disabled for 'dm' devices - you might have installed
version of system which has the 'lock everything' feature in?
systemd 213-6 is what Arch reports, as far as I can tell from the PKGBUILD
- backport fix for faily MACAddress matching (FS#40675)
- backport fix for fsck/udev mess (FS#40706)
e918a1b5a94f270186dca59156354acd2a596494
3d06f4183470d42361303086ed9dedd29c0ffc1b
Unsure what do you have in your arch build.
This patch:
http://lists.freedesktop.org/archives/systemd-devel/2014-June/019863.html
applied and tested in Arch solves the issue.


Thanks,
Leeman
Christian Hesse
2014-06-11 07:59:49 UTC
Post by Leeman Strout
Post by Zdenek Kabelac
Post by Zdenek Kabelac
Hi
Unsure if this relates to all your problem (since I'm not sure how
arch linux is in sync with udev rules & systemd version)
At certain moment systemd added a new 'feature' about locking
devices while updating internal udev state - this locking ignores any
udev rule flags and opens internal lvm2 devices - so while for now
it's been again disabled for 'dm' devices - you might have installed
version of system which has the 'lock everything' feature in?
systemd 213-6 is what Arch reports, as far as I can tell from the PKGBUILD
- backport fix for faily MACAddress matching (FS#40675)
- backport fix for fsck/udev mess (FS#40706)
e918a1b5a94f270186dca59156354acd2a596494
3d06f4183470d42361303086ed9dedd29c0ffc1b
Unsure what do you have in your arch build.
http://lists.freedesktop.org/archives/systemd-devel/2014-June/019863.html
applied and tested in Arch solves the issue.
This has been applied to systemd-213-9 (and systemd upstream) already. ;)
--
Schoene Gruesse
Chris
O< ascii ribbon campaign
stop html mail - www.asciiribbon.org
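
Given the note above that the patch landed in systemd-213-9, a quick check of whether an installed Arch package is at least that version might look like the sketch below. The function name `version_at_least` is hypothetical, and the comparison relies on GNU `sort -V` rather than pacman's own version logic, so treat the result as a hint only.

```shell
# Hypothetical helper: succeed if version $2 is >= version $1,
# using GNU sort's version ordering (sort -V).
version_at_least() {
    min=$1
    have=$2
    [ "$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n 1)" = "$min" ]
}

# Usage on Arch (pacman -Q prints "name version"):
#   have=$(pacman -Q systemd | awk '{ print $2 }')
#   version_at_least 213-9 "$have" && echo "patch should be included"
```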
Zdenek Kabelac
2014-06-10 08:31:23 UTC
Post by Zdenek Kabelac
Hi
Unsure if this relates to all your problem (since I'm not sure how
arch linux is in sync with udev rules & systemd version)
At certain moment systemd added a new 'feature' about locking
devices while updating internal udev state - this locking ignores any
udev rule flags and opens internal lvm2 devices - so while for now
it's been again disabled for 'dm' devices - you might have installed
version of system which has the 'lock everything' feature in?
systemd 213-6 is what Arch reports, as far as I can tell from the PKGBUILD
- backport fix for faily MACAddress matching (FS#40675)
- backport fix for fsck/udev mess (FS#40706)
This lvm2 commit should fix the problems between lvm2 & the latest systemd:

e918a1b5a94f270186dca59156354acd2a596494

and this is the systemd commit that introduced the problem:

3d06f4183470d42361303086ed9dedd29c0ffc1b

I'm unsure what you have in your Arch build.
Post by Zdenek Kabelac
Thought this doesn't explain your 'counter below zero' error - this
looks like some incorrect udev rules are running in the field ? (Or
maybe multiple systemd-udevd are running ?)
After a clean restart of everything twice without doing snapshots, the initial
lvcreate does not have the 'counter below zero' error. And with special
attention to make sure the snapshots are cleaned up prior to attempting a new
snapshot I am not getting this initial decrement on lvcreate.
So can we consider this 'counter' case solved?

Zdenek