Dennis Schridde
2018-05-12 11:30:36 UTC
Hello!
# lvchange -ay ernie/system
Check of pool ernie/cache failed (status:1). Manual repair required!
# lvconvert --repair ernie/system
bad checksum in superblock
Repair of cache metadata volume of cache ernie/system failed (status:1).
Manual repair required!
How do I recover from this situation and repair the volume group?
For a full log, including the output of pvdisplay, vgdisplay and lvdisplay
-a, please see attached log file. If more information is necessary, please
ask.
I am using one of the infamous AMD Ryzen 2400G with an AMD B350 chipset,
which suffers from random lockups related to CPU C-states [1]. With the
recent AGESA 1.0.0.2a firmware update and the introduction of the "Power
Supply Idle Control = Typical Current Idle" setting, the system is stable,
if it boots at all. But it often takes several attempts to boot -- the
failed attempts ending in weird firmware / EFI or CPU / idle related Linux
kernel stack traces, which sometimes even require a hard reset, since
soft-reboot (ctrl+alt+del) sometimes has no effect, because init dies.
This was such a situation, where I had to reboot (ctrl+alt+del) and reset
(hard) the system several times, at the end of which everything seemed
fine, except that dracut was timing out when activating the disks.
Debugging the situation using a Fedora 28 live system resulted in the
attached log file.
--Dennis
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=196683
P.S. I already found older posts [2,3] describing a similar scenario. At the
time a recovery was impossible for the user, but it seems that the situation
has improved somewhat since then. However, I am still stuck: `lvconvert --repair`
asks me to repair "manually" (whatever that means), `cache_dump --repair`
cannot operate on non-active LVs, and LVM refuses to activate the LV as long
as it has not been repaired.
[2]: https://www.redhat.com/archives/linux-lvm/2016-December/msg00013.html
[3]: https://www.redhat.com/archives/linux-lvm/2015-August/msg00008.html