Discussion:
[linux-lvm] corruption on reattaching cache
Markus Mikkolainen
2016-06-09 20:24:33 UTC
I seem to have hit the same snag as Mark describes in his post.

https://www.redhat.com/archives/linux-lvm/2015-April/msg00025.html

With kernel 4.4.6 I detached (--splitcache) a writeback cache from a
mounted LV; the cache was synchronized and then detached. I then reattached
it and shortly afterwards detached it again. Interestingly, after the
second detach it synchronized AGAIN, starting from 100%, and then I
started getting filesystem errors. I immediately shut down and forced an
fsck. I didn't lose much data, but still had some things to correct.

It looks to me as if a detached cache, when reattached, retains all the
cached data on it, even though that data was supposed to have been written
to the backing disk. Then, instead of being marked clean on attach, it
continues serving old data from the cache.
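
One way to watch for the behaviour described above is to look at the cache
block counters before detaching and again right after reattaching. The
following is only a sketch: "vg0/data" is a made-up LV name, and the report
fields are the cache fields that "lvs -o help" lists on lvm2 builds with
cache support, so check that your version has them. By default it only
prints the command; set DRY_RUN=0 to actually run it as root.

```shell
#!/bin/sh
# ASSUMPTION: vg0/data is a hypothetical cached LV, not taken from the thread.
CACHED_LV=vg0/data

# Print the command by default; set DRY_RUN=0 to really execute it (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

show_cache_blocks() {
    # A non-zero dirty-block count straight after reattaching a pool that
    # was fully flushed before the split would hint at stale cache metadata.
    run lvs -o lv_name,cache_total_blocks,cache_used_blocks,cache_dirty_blocks "$CACHED_LV"
}

show_cache_blocks
```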
Zdenek Kabelac
2016-06-10 09:09:50 UTC
Yes, this is a known issue; --splitcache is rather for 'debugging' purposes.
Use --uncache and create a new cache when needed.

A split cache needs to be cleared on reattachment, but that needs further
code rework.

The idea behind it is that we want to support 'offline' writeback of data,
since ATM the cache target doesn't work well if there is any disk error: if
the cache is in writeback mode and has an 'error' sector, you can't clean
such a cache...

Regards

Zdenek
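
The workflow recommended above (--uncache, then create a new cache) might
look roughly like the sketch below. All names and sizes (vg0, data,
data_cpool, /dev/fast_ssd, 10G) are made up for illustration and do not
come from the thread; the commands follow the forms documented in
lvmcache(7). By default the script only prints the commands; set DRY_RUN=0
to actually run them as root.

```shell
#!/bin/sh
# Sketch: drop a cache with --uncache and build a fresh one, instead of
# reattaching a pool that was detached with --splitcache.
# ASSUMPTIONS (hypothetical, not from the thread): VG "vg0", origin LV
# "data", fast PV /dev/fast_ssd, pool name "data_cpool", pool size 10G.
VG=vg0
ORIGIN=data
FAST_PV=/dev/fast_ssd
POOL=data_cpool
POOL_SIZE=10G

# Print each command by default; set DRY_RUN=0 to really execute (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recreate_cache() {
    # Flush dirty blocks to the origin LV and delete the cache pool.
    run lvconvert --uncache "$VG/$ORIGIN"
    # Create a brand-new, empty cache pool on the fast device.
    run lvcreate --type cache-pool -L "$POOL_SIZE" -n "$POOL" "$VG" "$FAST_PV"
    # Attach the new pool to the origin LV in writeback mode.
    run lvconvert --type cache --cachepool "$VG/$POOL" --cachemode writeback "$VG/$ORIGIN"
}

recreate_cache
```

Because the pool is created from scratch each time, there is no old cache
metadata that could be picked up again on attach.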
Mark Hills
2016-06-11 09:24:39 UTC
Post by Zdenek Kabelac
Yes, this is a known issue; --splitcache is rather for 'debugging' purposes.
Use --uncache and create a new cache when needed.
A split cache needs to be cleared on reattachment, but that needs further
code rework.
It's ok that this is part of a wider picture.

If it is incomplete, it might be wise to block the user from doing the
operation, or force them to confirm at the time of reattaching (along with
a summary of the risk).

Or, if '--splitcache' is completely for debugging purposes, then it
probably should be removed from the section "Cache removal" of the
lvmcache(7) man page.

In my case I was following the instructions on the page, which state that
the result is an "unused cache pool LV". I wrongly understood that to mean
one which is the same as a newly-created one with the same parameters.

As with Markus, I also experienced data corruption which I was lucky to
spot, and lucky to have a backup to restore from.
Post by Zdenek Kabelac
The idea behind it is that we want to support 'offline' writeback of data,
since ATM the cache target doesn't work well if there is any disk error: if
the cache is in writeback mode and has an 'error' sector, you can't clean
such a cache...
Interesting... is there scope for long-term writeback caching in this
design? My own personal use case is I would like to spin down the hard
drive in the machine for the majority of the time.

Many thanks
--
Mark
Markus Mikkolainen
2016-06-10 10:51:26 UTC
Could there at least be some kind of warning if reattaching will result
in destroying your filesystem? Or possibly make "--splitcache" warn that
this is a debug feature and the result might be bad?

Basically NOTHING suggested to me that I might be doing something that
could destroy my filesystem.
_______________________________________________
linux-lvm mailing list
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
Zdenek Kabelac
2016-06-13 08:48:21 UTC
Post by Markus Mikkolainen
Could there at least be some kind of warning if reattaching will result
in destroying your filesystem? Or possibly make "--splitcache" warn that
this is a debug feature and the result might be bad?
I'll try to provide a fix so that reattaching without zeroing requires
passing -Zn to lvconvert.
Post by Markus Mikkolainen
Basically NOTHING suggested to me that I might be doing something that
could destroy my filesystem.
Yep