Discussion:
[linux-lvm] problem with lvcreate and redirection of stdout/stderr
Lentes, Bernd
2014-08-20 15:27:52 UTC
Hi ML,

I'm writing a script to update my servers. Before updating, I'd like to create a snapshot of the root filesystem (/), so that if something goes wrong with the update I have a consistent backup.
I redirect stdout and stderr to a log file so that I can check everything afterwards. The redirection causes problems with lvcreate.
This is an excerpt from my script:

============================================================
lvremove -fv /dev/vg1/lv_root_snapshot &> /var/log/update.log

lvcreate -v -L 45G -n lv_root_snapshot -s vg1/lv_root >> /var/log/update.log 2>&1

zypper -n up -y -t patch --skip-interactive >> /var/log/update.log 2>&1
============================================================

With the second line I try to create the LV and redirect stdout/stderr to a log file.
The whole system hangs while executing the second line. These are the last lines in my log:

============================================================
lvcreate Setting logging type to disk
lvcreate Setting chunksize to 8 sectors.
lvcreate Finding volume group "vg1"
lvcreate Archiving volume group "vg1" metadata (seqno 99).
lvcreate Creating logical volume lv_root_snapshot
lvcreate Creating volume group backup "/etc/lvm/backup/vg1" (seqno 100).
lvcreate Found volume group "vg1"
lvcreate activation/volume_list configuration setting not defined: Checking only host tags for vg1/lv_root_snapshot
lvcreate Creating vg1-lv_root_snapshot
lvcreate Loading vg1-lv_root_snapshot table (252:1)
lvcreate Resuming vg1-lv_root_snapshot (252:1)
lvcreate Clearing start of logical volume "lv_root_snapshot"
lvcreate Creating logical volume snapshot0
lvcreate Found volume group "vg1"
lvcreate Found volume group "vg1"
lvcreate Creating vg1-lv_root-real
lvcreate Loading vg1-lv_root-real table (252:2)
lvcreate Loading vg1-lv_root table (252:0)
lvcreate Creating vg1-lv_root_snapshot-cow
lvcreate Loading vg1-lv_root_snapshot-cow table (252:3)
lvcreate Resuming vg1-lv_root_snapshot-cow (252:3)
lvcreate Loading vg1-lv_root_snapshot table (252:1)
lvcreate Suspending vg1-lv_root (252:0) with filesystem sync with device flush
===========================================================

I tried it several times, and the system always stops at this point. When I execute the same command manually in a shell, nothing changes: the system stops as well.
Redirecting the lvremove command in a shell in the same manner works fine:

lvremove -fv /dev/vg1/lv_root_snapshot >> /var/log/update.log 2>&1
system continues running.

Running lvcreate manually in a shell and redirecting only stdout works fine:

lvcreate -v -L 45G -n lv_root_snapshot -s vg1/lv_root >> /var/log/update.log
system continues running

Only the combination of redirecting stdout and stderr to the logfile stops the system completely:
lvcreate -v -L 45G -n lv_root_snapshot -s vg1/lv_root >> /var/log/update.log 2>&1


Any ideas?


my system:

SLES 11 SP3 64bit
vm58820-6:~ # lvm version
LVM version: 2.02.98(2) (2012-10-15)
Library version: 1.03.01 (2011-10-15)
Driver version: 4.23.0
1 lv for the root-directory, /boot is a "real" partition
1pv, 1 vg

Bernd


--
Bernd Lentes

Systemadministration
Institut für Entwicklungsgenetik
Gebäude 35.34 - Raum 208
HelmholtzZentrum münchen
***@helmholtz-muenchen.de
phone: +49 89 3187 1241
fax: +49 89 3187 2294
http://www.helmholtz-muenchen.de/idg

Freedom is not defended by less freedom



Helmholtz Zentrum München
Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH)
Ingolstädter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Chair of the Supervisory Board: MinDir´in Bärbel Brumme-Bothe
Managing Directors: Prof. Dr. Günther Wess, Dr. Nikolaus Blum, Dr. Alfons Enhsen
Register court: Amtsgericht München HRB 6466
VAT ID: DE 129521671
Alasdair G Kergon
2014-08-20 15:50:10 UTC
Post by Lentes, Bernd
lvcreate -v -L 45G -n lv_root_snapshot -s vg1/lv_root >> /var/log/update.log 2>&1
lvcreate Suspending vg1-lv_root (252:0) with filesystem sync with device flush

/var/log is presumably part of a filesystem on the same LV that is getting its I/O
paused for a brief moment while the snapshot is taken.
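
One way to check the presumption above is to compare the device behind the log directory with the device behind the snapshot origin's mount point. The following is only a sketch: the function names backing_device and same_backing_device are made up for illustration, and it relies on nothing more than the POSIX output format of df -P.

```shell
#!/bin/sh
# Hypothetical helpers, not part of LVM.

# Print the device that backs the filesystem containing the given path.
backing_device() {
    # df -P prints one header line followed by one data line;
    # the first field of the data line is the device.
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Succeed if both paths live on the same backing device.
same_backing_device() {
    dev_a=$(backing_device "$1")
    dev_b=$(backing_device "$2")
    [ -n "$dev_a" ] && [ "$dev_a" = "$dev_b" ]
}
```

For example, `same_backing_device /var/log / && echo "log is on the root LV"` would be expected to print the message on a system where / is the only filesystem besides /boot.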

1) Run lvm dumpconfig log/activation to double-check this is not set to 1.

2) What filesystem is this and precisely what kernel? (cat /proc/mounts)

3) Try to obtain a stack trace at the point where this locks up e.g.
echo t > /proc/sysrq-trigger followed by dmesg (or divert your logging
of kernel messages to a different filesystem outside the snapshot).

Alasdair
Lentes, Bernd
2014-08-20 17:20:57 UTC
Post by Alasdair G Kergon
Post by Lentes, Bernd
lvcreate -v -L 45G -n lv_root_snapshot -s vg1/lv_root >> /var/log/update.log 2>&1
lvcreate Suspending vg1-lv_root (252:0) with filesystem sync with device flush
/var/log is presumably part of a filesystem on the same LV that is getting its
I/O paused for a brief moment while the snapshot is taken.

Yes, it is. I have just one LV for the whole root filesystem (/).
Post by Alasdair G Kergon
1) Run lvm dumpconfig log/activation to double-check this is not set to 1.
log {
    verbose=2
    syslog=0
    file="/var/log/lvm2.log"
    overwrite=0
    level=5
    indent=1
    command_names=1
    prefix=" "
}

activation {
    missing_stripe_filler="/dev/ioerror"
    reserved_stack=256
    reserved_memory=8192
    process_priority=-18
    mirror_region_size=512
    readahead="auto"
    monitoring=1
    mirror_log_fault_policy="allocate"
    mirror_device_fault_policy="remove"
    thin_pool_autoextend_threshold=100
    thin_pool_autoextend_percent=20
    udev_rules=1
    udev_sync=1
}
Post by Alasdair G Kergon
2) What filesystem is this and precisely what kernel? (cat /proc/mounts)
FS is ext3, kernel is 3.0.76-0.11-default

vm58820-6:~ # cat /proc/mounts
rootfs / rootfs rw 0 0
udev /dev tmpfs rw,relatime,nr_inodes=0,mode=755 0 0
tmpfs /dev/shm tmpfs rw,relatime 0 0
/dev/mapper/vg1-lv_root / ext3 rw,relatime,errors=continue,user_xattr,acl,barrier=1,data=ordered 0 0
proc /proc proc rw,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
/dev/vda3 /boot ext3 rw,relatime,errors=continue,user_xattr,acl,barrier=1,data=ordered 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
securityfs /sys/kernel/security securityfs rw,relatime 0 0
none /var/lib/ntp/proc proc ro,nosuid,nodev,relatime 0 0
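
To pick out the device and filesystem type for a single mount point from output like the above, a small awk filter can help. This is a sketch only; mount_info is a hypothetical helper name, and the optional second argument exists just so the function can be pointed at a saved copy of /proc/mounts.

```shell
#!/bin/sh
# Hypothetical helper: print "device fstype" for the filesystem mounted at
# a given mount point, reading a /proc/mounts-style file.
# Usage: mount_info MOUNT_POINT [MOUNTS_FILE]
mount_info() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    # The last matching line wins, because a later mount on the same
    # mount point shadows an earlier one (e.g. rootfs vs. the real root).
    awk -v mp="$mount_point" '
        $2 == mp { dev = $1; fs = $3 }
        END { if (dev != "") print dev, fs }
    ' "$mounts_file"
}
```

Called as `mount_info /`, it reads the live /proc/mounts; it prints nothing if the mount point is not listed.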
Post by Alasdair G Kergon
3) Try to obtain a stack trace at the point where this locks up e.g.
echo t > /proc/sysrq-trigger followed by dmesg (or divert your logging of
kernel messages to a different filesystem outside the snapshot).
Alasdair
You can find the stack trace here (~190KB): http://www.filedropper.com/stacktracevm58820-6lvcreate

The strange thing is that it worked once just now without the system hanging, but the second time it hung again.


Thanks for your help.



Bernd

Roger Heflin
2014-08-21 16:57:51 UTC
stderr is flushed on every write of a line, in an attempt not to lose data.

I can see how with the vg suspended that might have a problem.
Everything will stop waiting for the stderr flush to complete, i.e. the
print statement writing to standard error will never return, so the
next command that continues the vg work is never executed.

I would suggest sending the output data to another device (maybe a
tmpfs device) then when done copy it back.
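
A minimal sketch of that suggestion follows. The function name run_logged is made up for illustration, and the choice of scratch directory is an assumption: /dev/shm is used in the usage line only because it shows up as tmpfs in the /proc/mounts output posted earlier in the thread. Whether this avoids the hang has not been confirmed here.

```shell
#!/bin/sh
# Hypothetical helper: run a command with stdout and stderr sent to a
# scratch log outside the filesystem being snapshotted, then append the
# scratch log to the final log once the command has finished.
# Usage: run_logged SCRATCH_DIR FINAL_LOG COMMAND [ARGS...]
run_logged() {
    scratch_dir=$1
    final_log=$2
    shift 2
    scratch_log=$(mktemp "$scratch_dir/update-log.XXXXXX") || return 1
    # While the command runs, nothing is written to the final log.
    "$@" >"$scratch_log" 2>&1
    status=$?
    # Copy the output back only after the command has returned.
    cat "$scratch_log" >>"$final_log"
    rm -f "$scratch_log"
    return "$status"
}
```

In the update script this might be called as `run_logged /dev/shm /var/log/update.log lvcreate -v -L 45G -n lv_root_snapshot -s vg1/lv_root`. The exit status of the wrapped command is passed through, so the script can still stop before running zypper if the snapshot fails.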

_______________________________________________
linux-lvm mailing list
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
Alasdair G Kergon
2014-08-21 17:20:27 UTC
Post by Roger Heflin
I can see how with the vg suspended that might have a problem.
But nothing should be getting written to stderr while the device is suspended.

Alasdair
