Discussion:
[linux-lvm] Dmcache crash
Bertrand Paquet
2016-06-06 21:17:36 UTC
Hi all,

The dmcache setup on one production server crashed this afternoon. The
error in syslog is below.

After a reboot, the logical volumes are still there, but inactive.

# lvs -a
  LV              VG      Attr       LSize   Pool  Origin       Data%  Meta%  Move Log Cpy%Sync Convert
  cache           storage Cwi---C--- 834.41g
  [cache_cdata]   storage Cwi------- 834.41g
  [cache_cmeta]   storage ewi-------  16.00g
  [lvol0_pmspare] storage ewi-------  16.00g
  tank            storage Cwi---C---   5.00t cache [tank_corig]
  [tank_corig]    storage owi---C---   5.00t

I'm not able to activate them.

# vgchange -ay
Check of pool storage/cache failed (status:1). Manual repair required!
0 logical volume(s) in volume group "storage" now active

Any idea how to bring the volume back online, or at least recover the data?

Kernel: 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.5-1~bpo8+1 (2016-02-23) x86_64 GNU/Linux
Distro: Debian jessie

Any help will be greatly appreciated.

Regards,

Bertrand

Error log:
Jun 6 11:58:06 xxxx kernel: [3015831.571379] device-mapper: array: array_block_check failed: blocknr 1082331758718 != wanted 16969
Jun 6 11:58:06 xxxx kernel: [3015831.571453] device-mapper: block manager: array validator check failed for block 16969
Jun 6 11:58:06 xxxx kernel: [3015831.571515] device-mapper: cache: 252:3: demotion failed; couldn't update on disk metadata
Jun 6 11:58:06 xxxx kernel: [3015831.571578] device-mapper: cache: 252:3: metadata operation 'dm_cache_remove_mapping' failed: error = -15
Jun 6 11:58:06 xxxx kernel: [3015831.571664] device-mapper: cache: 252:3: aborting current metadata transaction
Jun 6 11:58:06 xxxx kernel: [3015831.572836] device-mapper: cache: 252:3: switching cache to read-only mode
Jun 6 11:58:06 xxxx kernel: [3015831.572840] device-mapper: cache: 252:3: demotion failed; couldn't update on disk metadata
Jun 6 11:58:06 xxxx kernel: [3015831.572901] device-mapper: cache: 252:3: metadata operation 'dm_cache_remove_mapping' failed: error = -22
Jun 6 11:58:06 xxxx kernel: [3015831.572973] device-mapper: cache: 252:3: demotion failed; couldn't update on disk metadata
Jun 6 11:58:06 xxxx kernel: [3015831.573034] device-mapper: cache: 252:3: metadata operation 'dm_cache_remove_mapping' failed: error = -22
Jun 6 11:58:06 xxxx kernel: [3015831.573103] device-mapper: cache: 252:3: demotion failed; couldn't copy block
Jun 6 11:58:11 xxxx kernel: [3015836.425768] Aborting journal on device dm-3-8.
Jun 6 11:58:11 xxxx kernel: [3015836.425853] Buffer I/O error on dev dm-3, logical block 671121408, lost sync page write
Jun 6 11:58:11 xxxx kernel: [3015836.425939] JBD2: Error -5 detected when updating journal superblock for dm-3-8.
Jun 6 11:58:11 xxxx kernel: [3015836.444140] Buffer I/O error on dev dm-3, logical block 0, lost sync page write
Jun 6 11:58:11 xxxx kernel: [3015836.444234] EXT4-fs error (device dm-3): ext4_journal_check_start:56: Detected aborted journal
Jun 6 11:58:11 xxxx kernel: [3015836.444311] EXT4-fs (dm-3): Remounting filesystem read-only
Jun 6 11:58:11 xxxx kernel: [3015836.444352] EXT4-fs (dm-3): previous I/O error to superblock detected
Jun 6 11:58:11 xxxx kernel: [3015836.444428] Buffer I/O error on dev dm-3, logical block 0, lost sync page write
Jun 6 11:58:15 xxxx kernel: [3015840.507213] migration_success_pre_commit: 12194 callbacks suppressed
Jun 6 11:58:15 xxxx kernel: [3015840.507217] device-mapper: cache: 252:3: promotion failed; couldn't update on disk metadata
Jun 6 11:58:15 xxxx kernel: [3015840.507291] device-mapper: cache: 252:3: metadata operation 'dm_cache_insert_mapping' failed: error = -22
Jun 6 11:58:19 xxxx kernel: [3015844.720334] device-mapper: cache: 252:3: promotion failed; couldn't update on disk metadata
Jun 6 11:58:19 xxxx kernel: [3015844.720406] device-mapper: cache: 252:3: metadata operation 'dm_cache_insert_mapping' failed: error = -22
Jun 6 11:59:15 xxxx kernel: [3015901.091051] device-mapper: cache: 252:3: demotion failed; couldn't update on disk metadata
Jun 6 11:59:15 xxxx kernel: [3015901.091124] device-mapper: cache: 252:3: metadata operation 'dm_cache_remove_mapping' failed: error = -22
Jun 6 11:59:15 xxxx kernel: [3015901.091207] device-mapper: cache: 252:3: demotion failed; couldn't update on disk metadata
Jun 6 11:59:15 xxxx kernel: [3015901.091274] device-mapper: cache: 252:3: metadata operation 'dm_cache_remove_mapping' failed: error = -22
Jun 6 11:59:15 xxxx kernel: [3015901.091482] device-mapper: cache: 252:3: demotion failed; couldn't update on disk metadata
Jun 6 11:59:15 xxxx kernel: [3015901.091551] device-mapper: cache: 252:3: metadata operation 'dm_cache_remove_mapping' failed: error = -22
Jun 6 11:59:15 xxxx kernel: [3015901.091622] device-mapper: cache: 252:3: demotion failed; couldn't update on disk metadata
Jun 6 11:59:15 xxxx kernel: [3015901.091689] device-mapper: cache: 252:3: metadata operation 'dm_cache_remove_mapping' failed: error = -22
Jun 6 11:59:15 xxxx kernel: [3015901.092341] device-mapper: cache: 252:3: demotion failed; couldn't update on disk metadata
Jun 6 11:59:15 xxxx kernel: [3015901.092409] device-mapper: cache: 252:3: metadata operation 'dm_cache_remove_mapping' failed: error = -22
Jun 6 11:59:30 xxxx kernel: [3015915.162952] migration_success_pre_commit: 1194 callbacks suppressed
Jun 6 11:59:30 xxxx kernel: [3015915.162955] device-mapper: cache: 252:3: promotion failed; couldn't update on disk metadata
Jun 6 11:59:30 xxxx kernel: [3015915.163034] device-mapper: cache: 252:3: metadata operation 'dm_cache_insert_mapping' failed: error = -22
Jun 6 11:59:38 xxxx kernel: [3015923.147099] device-mapper: cache: 252:3: promotion failed; couldn't update on disk metadata
Jun 6 11:59:38 xxxx kernel: [3015923.147183] device-mapper: cache: 252:3: metadata operation 'dm_cache_insert_mapping' failed: error = -22
Jun 6 11:59:39 xxxx kernel: [3015924.308771] device-mapper: cache: 252:3: promotion failed; couldn't update on disk metadata
Jun 6 11:59:39 xxxx kernel: [3015924.308846] device-mapper: cache: 252:3: metadata operation 'dm_cache_insert_mapping' failed: error = -22
Jun 6 11:59:59 xxxx kernel: [3015944.161468] device-mapper: cache: 252:3: promotion failed; couldn't update on disk metadata
Jun 6 11:59:59 xxxx kernel: [3015944.161541] device-mapper: cache: 252:3: metadata operation 'dm_cache_insert_mapping' failed: error = -22
Jun 6 12:00:01 xxxx kernel: [3015946.457345] device-mapper: cache: 252:3: promotion failed; couldn't update on disk metadata
Jun 6 12:00:01 xxxx kernel: [3015946.457416] device-mapper: cache: 252:3: metadata operation 'dm_cache_insert_mapping' failed: error = -22
Jun 6 12:00:42 xxxx kernel: [3015987.290496] device-mapper: cache: 252:3: promotion failed; couldn't update on disk metadata
Jun 6 12:00:42 xxxx kernel: [3015987.290569] device-mapper: cache: 252:3: metadata operation 'dm_cache_insert_mapping' failed: error = -22
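
For anyone reading the log above, the negative numbers after "error =" and "Error" are ordinary Linux errno values. A minimal sketch for decoding the three that appear here follows; the helper name errno_name is made up for illustration, and only these three codes are covered. Going by the log alone, the first failure (-15) comes right after the array_block_check mismatch, the -22 errors start once the cache has aborted the metadata transaction and gone read-only, and -5 is the I/O error the filesystem journal sees on dm-3.

```shell
#!/bin/sh
# Sketch: translate the errno values seen in the log into symbolic names.
# errno_name is a hypothetical helper; it only knows the codes from this log.
errno_name() {
    # Strip a leading minus sign, since the kernel prints errors as negatives.
    case "${1#-}" in
        5)  echo "EIO (I/O error)" ;;
        15) echo "ENOTBLK (block device required)" ;;
        22) echo "EINVAL (invalid argument)" ;;
        *)  echo "unknown" ;;
    esac
}

# Print the codes in the order they first appear in the log.
for code in -15 -22 -5; do
    printf '%s -> %s\n' "$code" "$(errno_name "$code")"
done
```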
John Stoffel
2016-06-08 14:10:56 UTC
Bertrand> The dmcache setup on one production server crashed this
Bertrand> afternoon. The error in syslog is below.

It looks like you might have some bad blocks on the disk. Is your LVM
setup on top of MD RAID by any chance? Have you run a 'check' against
the RAID to see if there's a bad block or two that needs to be
re-written?
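
If the volume group does turn out to sit on MD RAID, a 'check' can be requested through sysfs. The sketch below assumes an array named md0, which is only a placeholder since the thread does not say what the underlying devices are. The function names and the SYSFS_ROOT variable are invented for this example; SYSFS_ROOT exists only so the helpers can be pointed at a different directory tree.

```shell
#!/bin/sh
# Sketch: start an MD RAID scrub and read back the mismatch counter.
# Assumptions: the array name (md0 in the example) and the helper names are
# placeholders; SYSFS_ROOT defaults to the real sysfs location.

md_dir() {
    # Path to the md attribute directory for the given array name.
    printf '%s/%s/md\n' "${SYSFS_ROOT:-/sys/block}" "$1"
}

md_start_check() {
    # Writing "check" to sync_action asks md to scan the whole array (needs root).
    echo check > "$(md_dir "$1")/sync_action"
}

md_mismatches() {
    # mismatch_cnt holds the mismatch count reported by the last check.
    cat "$(md_dir "$1")/mismatch_cnt"
}

# Example, run as root with the real array name substituted:
#   md_start_check md0
#   cat /proc/mdstat      # shows the progress of the check
#   md_mismatches md0
```

Read errors from the member disks should also show up in the kernel log while the check runs, so it is worth watching dmesg alongside /proc/mdstat.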




