Discussion:
[linux-lvm] Why doesn't the lvmcache support the discard (trim) command?
Ilia Zykov
2018-10-18 22:56:04 UTC
Maybe it will be implemented later? It seems a little strange to me that there
is no way to clear garbage from the cache.
Maybe I am missing something? Can you please explain this behavior?
For example:

I have a fully cached partition, whose cache is also full:

[***@localhost ~]# df -h /data
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/test-data   12G   12G     0 100% /data
[***@localhost ~]# lvs -a
  LV              VG   Attr       LSize  Pool   Origin       Data%  Meta%  Move Log Cpy%Sync Convert
  data            test Cwi-aoC--- 12.00g [fast] [data_corig] 99.72  2.93            0.00
  [data_corig]    test owi-aoC--- 12.00g
  [fast]          test Cwi---C---  1.00g                     99.72  2.93            0.00
  [fast_cdata]    test Cwi-ao----  1.00g
  [fast_cmeta]    test ewi-ao----  8.00m
  [lvol0_pmspare] test ewi-------  8.00m

I clear the partition and run fstrim:

[***@localhost ~]# rm -rf /data/*
[***@localhost ~]# fstrim -v /data
[***@localhost ~]# df -h /data
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/test-data   12G   41M   12G   1% /data


But the cache remained full:

[***@localhost ~]# lvs -a
  LV              VG   Attr       LSize  Pool   Origin       Data%  Meta%  Move Log Cpy%Sync Convert
  data            test Cwi-aoC--- 12.00g [fast] [data_corig] 99.72  2.93            0.00
  [data_corig]    test owi-aoC--- 12.00g
  [fast]          test Cwi---C---  1.00g                     99.72  2.93            0.00
  [fast_cdata]    test Cwi-ao----  1.00g
  [fast_cmeta]    test ewi-ao----  8.00m
  [lvol0_pmspare] test ewi-------  8.00m

Thank you.
Zdenek Kabelac
2018-10-19 09:12:10 UTC
Post by Ilia Zykov
Maybe it will be implemented later? It seems a little strange to me that there
is no way to clear garbage from the cache.
Maybe I am missing something? Can you please explain this behavior?
Hi

Applying my brain logic here:

The cache (by default) operates on 32KB chunks.
SSDs usually have a minimal trimmable block size of 512KB.
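
As a side note, the discard granularity a device actually advertises can be
checked with lsblk (/dev/sda below is only a placeholder):

  # DISC-GRAN and DISC-MAX are the advertised discard granularity and limit
  lsblk --discard /dev/sda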

The conclusion is that it is non-trivial to even implement TRIM support for
the cache - something would need to keep a secondary data structure holding
the information about which cached blocks are completely 'unused/trimmed', so
that they could be coalesced into a 'complete block trim' (i.e. something like
the way ext4 implements 'fstrim' support). With 32KB chunks and 512KB
trimmable blocks, sixteen neighboring chunks would all have to be trimmed
before a single device-level TRIM could be issued.

Second thought - if the goal is to completely 'erase' the cache, there is a
very simple path: use 'lvconvert --uncache', and once the cache is needed
again, create it again from scratch.
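
For example - a minimal sketch, assuming the test/data volume from the
transcript above and a placeholder fast PV /dev/sdb:

  # flush dirty blocks and remove the cache pool entirely
  lvconvert --uncache test/data

  # later, recreate the cache pool from scratch and attach it again
  lvcreate --type cache-pool -L 1G -n fast test /dev/sdb
  lvconvert --type cache --cachepool test/fast test/data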

Note - dm-cache is a SLOW-moving cache - it does not aim to accelerate
one-time usage; i.e. reading a block just once from slow storage does not mean
it will be immediately cached.

Dm-cache is about keeping information about used blocks on 'slow' storage
(hdd), which typically does not support/implement TRIM. There could possibly
be a multi-layer cache, where even the cache device can handle TRIM - but this
kind of construct is not really supported, and it's even unclear whether it
would make any sense to introduce this concept ATM (since there would need to
be some well-measurable benefit).

And a final note - there is upcoming support for accelerating writes with the
new dm-writecache target.

Regards


Zdenek
Gionatan Danti
2018-10-19 09:42:25 UTC
Post by Zdenek Kabelac
And a final note - there is upcoming support for accelerating writes with the
new dm-writecache target.
Hi, shouldn't that already be possible with the current dm-cache and
writeback caching?

Thanks.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: ***@assyoma.it - ***@assyoma.it
GPG public key ID: FF5F32A8
Zdenek Kabelac
2018-10-19 09:49:53 UTC
Post by Gionatan Danti
Post by Zdenek Kabelac
And a final note - there is upcoming support for accelerating writes with the
new dm-writecache target.
Hi, shouldn't that already be possible with the current dm-cache and
writeback caching?
Hi

dm-cache targets a different goal - the new dm-writecache has a much more
optimal write pattern for accelerating writes, compared even with dm-cache's
'writeback' mode.
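
For reference, writeback mode can be enabled on an existing cached LV - a
sketch, reusing the test/data volume from earlier in the thread:

  # switch the existing dm-cache from writethrough to writeback mode
  lvconvert --cachemode writeback test/data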

Another point - dm-writecache is written with NVDIMMs in mind, and it's also
simpler.
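
A minimal sketch of attaching a writecache with a newer LVM (2.03 or later);
the fast device /dev/pmem0 is only a placeholder:

  # create a cache volume on the fast device, then attach it as a writecache
  lvcreate -n fast -L 1G test /dev/pmem0
  lvconvert --type writecache --cachevol fast test/data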


Zdenek
Ilia Zykov
2018-10-19 09:55:09 UTC
Post by Zdenek Kabelac
Post by Ilia Zykov
Maybe it will be implemented later? It seems a little strange to me that
there is no way to clear garbage from the cache.
The conclusion is that it is non-trivial to even implement TRIM support for
the cache [...]
And a final note - there is upcoming support for accelerating writes with
the new dm-writecache target.
Thank you, I supposed so.
One more little question about dm-writecache. The description says:

"It doesn't cache reads because reads are supposed to be cached in page cache
in normal RAM."

Does this only mean that read misses are not promoted to the cache?
Zdenek Kabelac
2018-10-19 10:58:07 UTC
Post by Ilia Zykov
Thank you, I supposed so.
One more little question about dm-writecache. The description says:
"It doesn't cache reads because reads are supposed to be cached in page cache
in normal RAM."
Does this only mean that read misses are not promoted to the cache?
Hi

Writecache simply doesn't care about caching your reads at all.
Your RAM, with its page-caching mechanism, keeps read data as long as there is
free RAM for it - the less RAM goes to the page cache, the fewer read
operations remain cached.

It's probably worth adding a comment about the older dm-cache - there, read
accesses are accounted, so the most-used blocks can be promoted to the caching
storage device. If reads are served by your page cache, they cannot be
accounted - which explains why repeated reads of the same block, mostly served
by your page cache, do not lead to the quick promotion of that block to the
cache that one might expect without thinking about the details behind it...
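
One way to observe this is to drop the page cache, so that subsequent reads
reach the block layer where dm-cache can account them - a sketch, assuming the
test-data device from the transcript above:

  sync
  echo 3 > /proc/sys/vm/drop_caches   # flush the page cache
  # this read now reaches the block layer and is counted by dm-cache
  dd if=/dev/mapper/test-data of=/dev/null bs=1M count=100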


Zdenek
Gionatan Danti
2018-10-19 12:45:02 UTC
Post by Zdenek Kabelac
Hi
Writecache simply doesn't care about caching your reads at all.
Your RAM, with its page-caching mechanism, keeps read data as long as there is
free RAM for it - the less RAM goes to the page cache, the fewer read
operations remain cached.
Hi, does it mean that to have *both* a fast write cache *and* a read cache one
should use a dm-writecache target + a dm-cache writethrough target (possibly
pointing to different devices)?

Can you quantify/explain why, and how much, faster dm-writecache is for heavy
write workloads?

Thanks.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: ***@assyoma.it - ***@assyoma.it
GPG public key ID: FF5F32A8
Zdenek Kabelac
2018-10-19 13:08:55 UTC
Post by Gionatan Danti
Post by Zdenek Kabelac
Hi
Writecache simply doesn't care about caching your reads at all.
Your RAM, with its page-caching mechanism, keeps read data as long as there is
free RAM for it - the less RAM goes to the page cache, the fewer read
operations remain cached.
Hi, does it mean that to have *both* a fast write cache *and* a read cache one
should use a dm-writecache target + a dm-cache writethrough target (possibly
pointing to different devices)?
Can you quantify/explain why, and how much, faster dm-writecache is for heavy
write workloads?
Hi

It's rather that different workloads benefit from different caching
approaches.

If your system is heavy on writes, dm-writecache is what you want;
if you mostly read, dm-cache will win.

That's also why there is dmstats, to help identify hotspots and the overall
access pattern. Nothing wins in all cases - so ATM two different targets are
provided - and NVDIMMs already seem to change the game a lot...
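
A minimal dmstats sketch, reusing the test-data device from the transcript
above:

  # divide the device into 100 areas and start collecting per-area I/O counters
  dmstats create --areas 100 test-data

  # report the counters - areas with disproportionate I/O are the hotspots
  dmstats report test-data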

dm-writecache could be seen as an 'extension' of your page cache that holds a
longer list of dirty pages...

Zdenek
Ilia Zykov
2018-10-19 13:16:54 UTC
Post by Zdenek Kabelac
If your system is heavy on writes, dm-writecache is what you want;
if you mostly read, dm-cache will win.
dm-writecache could be seen as an 'extension' of your page cache that holds a
longer list of dirty pages...
Sorry, but I don't understand either. What happens if a reboot occurs after
data has been written to the fast cache but before it is written back to the
slow device? After the reboot, which data will be read - the new data from the
fast cache, or the old data from the slow device? And what data will be read
by 'dd if=/dev/cached iflag=direct'?
Thanks.
Ilia Zykov
2018-10-19 17:00:53 UTC
Post by Zdenek Kabelac
dm-writecache could be seen as an 'extension' of your page cache that holds a
longer list of dirty pages...
Zdenek
Does that mean the dm-writecache is always empty after a reboot?
Thanks.
Zdenek Kabelac
2018-10-22 10:54:41 UTC
Post by Ilia Zykov
Post by Zdenek Kabelac
dm-writecache could be seen as an 'extension' of your page cache that holds a
longer list of dirty pages...
Does that mean the dm-writecache is always empty after a reboot?
Thanks.
No, the writecache is journaled - after a reboot the cached content is
remembered and read back for use.
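
The persisted state can be inspected after boot via the target's status line -
a sketch, again assuming the test-data device name:

  # show the writecache target's status line (cache block usage counters)
  dmsetup status test-data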


Zdenek

Gionatan Danti
2018-10-19 20:54:26 UTC
Post by Zdenek Kabelac
Hi
It's rather that different workloads benefit from different caching
approaches.
If your system is heavy on writes, dm-writecache is what you want;
if you mostly read, dm-cache will win.
That's also why there is dmstats, to help identify hotspots and the overall
access pattern. Nothing wins in all cases - so ATM two different targets are
provided - and NVDIMMs already seem to change the game a lot...
dm-writecache could be seen as an 'extension' of your page cache that holds a
longer list of dirty pages...
Zdenek
Thanks for this information. Reading a bit of the commit which provides
dm-writecache, it seems to be a sort of "L2 page cache", right? It should be
*very* interesting for use with NVDIMMs and/or fast NVMe devices...

Regards.
--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: ***@assyoma.it - ***@assyoma.it
GPG public key ID: FF5F32A8