LVM-Thin problem: xfs_repair is always needed to bring the operating system up
What problem/issue/behavior are you having trouble with? What do you expect to see?
- Can you provide a solution for this?
- Is this issue related to LVM thin provisioning? If so, what do you suggest?
- Or is this a bug in RHEL 7.6?

Where are you experiencing the behavior? What environment?
The hang is gone.

When does the behavior occur? Frequency? Repeatedly? At certain times?
I think this issue occurs about once a week.

What information can you provide around timeframes and the business impact?
It affects promotion to the production server.

o Server details:
  System:
    Mfr: VMware, Inc.
    Prod: VMware Virtual Platform

o OS details
  Hostname: idcbpjnksapp001
  Distro: [redhat-release] Red Hat Enterprise Linux Server release 7.6 (Maipo)
  Booted kernel: 3.10.0-957.el7.x86_64
  GRUB default: 3.10.0-957.el7.x86_64

o Logs
  Before reboots there are error messages related to 'dm-3'

Jun 7 21:11:48 idcbpjnksapp001 kernel: buffer_io_error: 4856 callbacks suppressed
Jun 7 21:11:48 idcbpjnksapp001 kernel: Buffer I/O error on dev dm-3, logical block 9897400, lost async page write
Jun 7 21:11:48 idcbpjnksapp001 kernel: Buffer I/O error on dev dm-3, logical block 9897401, lost async page write
Jun 7 21:11:48 idcbpjnksapp001 kernel: Buffer I/O error on dev dm-3, logical block 9897402, lost async page write
[..]
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS: Failing async write: 4490 callbacks suppressed
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0x4b82db8. Retrying async write.
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0x4b8d3e0. Retrying async write.
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0x4b8d3c0. Retrying async write.
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0x4b82db8. Retrying async write.
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0x4b8d3e0. Retrying async write.
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0x4b8d3c0. Retrying async write.
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0x4b82db8. Retrying async write.
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0x4b8d3e0. Retrying async write.
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0x4b8d3c0. Retrying async write.
Jun 8 09:16:45 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0x4b82db8. Retrying async write.
Jun 8 09:16:48 idcbpjnksapp001 kernel: XFS (dm-3): metadata I/O error: block 0x4b82db8 ("xfs_buf_iodone_callback_error") error 5 numblks 8
Jun 8 09:16:53 idcbpjnksapp001 kernel: XFS (dm-3): metadata I/O error: block 0x4b82db8 ("xfs_buf_iodone_callback_error") error 5 numblks 8
[...Reboot...]
Jun 8 09:21:01 idcbpjnksapp001 kernel: Linux version 3.10.0-957.el7.x86_64 (mockbuild@x86-040.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Thu Oct 4 20:48:51 UTC 2018
Jun 12 19:19:57 idcbpjnksapp001 container-storage-setup: WARNING: /dev/rhel/root: Thin's thin-pool needs inspection.
Jun 12 19:19:57 idcbpjnksapp001 container-storage-setup: WARNING: /dev/rhel/swap: Thin's thin-pool needs inspection.
Jun 12 19:19:57 idcbpjnksapp001 container-storage-setup: WARNING: /dev/rhel/home: Thin's thin-pool needs inspection.
Jun 12 19:19:57 idcbpjnksapp001 container-storage-setup: WARNING: /dev/rhel/var: Thin's thin-pool needs inspection.
Jun 12 19:19:57 idcbpjnksapp001 container-storage-setup: WARNING: /dev/rhel/root: Thin's thin-pool needs inspection.
Jun 12 19:19:57 idcbpjnksapp001 container-storage-setup: WARNING: /dev/rhel/swap: Thin's thin-pool needs inspection.
Jun 12 19:19:57 idcbpjnksapp001 container-storage-setup: WARNING: /dev/rhel/home: Thin's thin-pool needs inspection.
Jun 12 19:19:57 idcbpjnksapp001 container-storage-setup: WARNING: /dev/rhel/var: Thin's thin-pool needs inspection.
Jun 12 19:20:23 idcbpjnksapp001 kernel: XFS (dm-3): metadata I/O error: block 0xc2710 ("xfs_buf_iodone_callback_error") error 5 numblks 8
Jun 12 19:20:28 idcbpjnksapp001 kernel: XFS (dm-3): metadata I/O error: block 0xc2710 ("xfs_buf_iodone_callback_error") error 5 numblks 8
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS: Failing async write: 2989 callbacks suppressed
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0xc2710. Retrying async write.
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0xc2710. Retrying async write.
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0xc2710. Retrying async write.
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0xc2710. Retrying async write.
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0xc2710. Retrying async write.
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0xc2710. Retrying async write.
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0xc2710. Retrying async write.
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0xc2710. Retrying async write.
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0xc2710. Retrying async write.
Jun 12 20:12:19 idcbpjnksapp001 kernel: XFS (dm-3): Failing async write on buffer block 0xc2710. Retrying async write.
Jun 12 20:12:20 idcbpjnksapp001 kernel: XFS (dm-3): metadata I/O error: block 0xc2710 ("xfs_buf_iodone_callback_error") error 5 numblks 8
[...Reboot...]
Jun 12 20:14:29 idcbpjnksapp001 kernel: Linux version 3.10.0-957.el7.x86_64 (mockbuild@x86-040.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Thu Oct 4 20:48:51 UTC 2018

o Latest logs

Jun 12 19:19:56 idcbpjnksapp001 kernel: buffer_io_error: 323 callbacks suppressed
Jun 12 19:19:56 idcbpjnksapp001 kernel: Buffer I/O error on dev dm-3, logical block 77082, lost async page write <---
Jun 12 19:19:56 idcbpjnksapp001 kernel: Buffer I/O error on dev dm-3, logical block 77083, lost async page write
Jun 12 19:19:56 idcbpjnksapp001 kernel: Buffer I/O error on dev dm-3, logical block 77084, lost async page write
Jun 12 20:21:33 idcbpjnksapp001 kernel: device-mapper: thin: No free metadata blocks
Jun 12 20:21:33 idcbpjnksapp001 kernel: device-mapper: thin: 253:2: switching pool to read-only mode <---
Jun 12 20:21:33 idcbpjnksapp001 kernel: device-mapper: thin: 253:2: metadata operation 'dm_pool_commit_metadata' failed: error = -1
Jun 12 20:21:33 idcbpjnksapp001 kernel: device-mapper: thin: 253:2: aborting current metadata transaction <---

o LVM status

$ cat sos_commands/lvm2/lvs_-a_-o_lv_tags_devices_--config_global_locking_type_0
  WARNING: Locking disabled. Be careful! This could corrupt your metadata.
  LV              VG    Attr       LSize   Pool   Origin Data%  Meta%  Move Log Cpy%Sync Convert LV Tags Devices
  home            rhel  Vwi-aotz--   8.00g pool00         84.52
  [lvol0_pmspare] rhel  ewi-------  28.00m                                                               /dev/sda3(0)
  pool00          rhel  twi-cotzM-  91.55g                 55.14 100.00                                   pool00_tdata(0) <--- Full; attr: (t)hin pool, (w)ritable, (i)nherited, -, (c)heck needed, (o)pen, (t)hin, (z)eroes, (M)etadata read only
  [pool00_tdata]  rhel  Twi-ao----  91.55g                                                               /dev/sda3(7)
  [pool00_tmeta]  rhel  ewi-ao----  28.00m                                                               /dev/sda3(23445)
  root            rhel  Vwi-aotz-- <49.95g pool00         52.15
  swap            rhel  Vwi-aotz--  25.60g pool00         29.82
  var             rhel  Vwi-aotz--  10.00g pool00        100.00 <--- Full
  u01lv           u01vg -wi-ao---- 249.00g                                                               /dev/sdb(0)
  u01lv           u01vg -wi-ao---- 249.00g                                                               /dev/sdd(0)
  u02lv           u02vg -wi-ao---- 190.25g                                                               /dev/sdc(0)

$ cat sos_commands/lvm2/pvs_-a_-v_-o_pv_mda_free_pv_mda_size_pv_mda_count_pv_mda_used_count_pe_start_--config_global_locking_type_0
  Reloading config files
  WARNING: Locking disabled. Be careful! This could corrupt your metadata.
  WARNING: /dev/rhel/root: Thin's thin-pool needs inspection.
  WARNING: /dev/rhel/swap: Thin's thin-pool needs inspection.
  WARNING: /dev/rhel/home: Thin's thin-pool needs inspection.
  WARNING: /dev/rhel/var: Thin's thin-pool needs inspection.
  PV               VG    Fmt  Attr PSize    PFree   DevSize PV UUID                                PMdaFree PMdaSize #PMda #PMdaUse 1st PE
  /dev/sda1                   ---  0        0       2.00m                                          0        0        0     0        0
  /dev/sda2                   ---  0        0       500.00m                                        0        0        0     0        0
  /dev/sda3        rhel  lvm2 a--  <114.52g 22.91g  114.52g AsIdPe-ohEQ-w0po-yR1A-Y1im-CgJX-mBbn2Y 0        1020.00k 1     1        1.00m <-- rhel VG still has 22+g space
  /dev/sdb         u01vg lvm2 a--  49.75g   0       50.00g  B5CWf8-yBdU-0KFz-GR6E-GFhj-Ft7x-Yf5oNM 0        1020.00k 1     1        1.00m
  /dev/sdc         u02vg lvm2 a--  199.75g  9.50g   200.00g hjqcqe-vMNI-STUH-vAz1-4e2t-Pv7G-zkCzNz 0        1020.00k 1     1        1.00m
  /dev/sdd         u01vg lvm2 a--  199.75g  512.00m 200.00g ud9IlR-1nZT-kzi5-S1dD-v8e5-HYYi-7IaDF5 0        1020.00k 1     1        1.00m
  /dev/u01vg/u01lv            ---  0        0       249.00g                                        0        0        0     0        0
  /dev/u02vg/u02lv            ---  0        0       190.25g                                        0        0        0     0        0
  Reloading config files

o df status of 'var'
  /dev/mapper/rhel-var 10475520 6944108 3531412 67% /var

Action Plan:
-----------
[1] Extend the thin pool metadata LV (tmeta)
    # lvextend --poolmetadatasize +1000M rhel/pool00
    # lvs -ao+devices

[2] You may want to run
    [A] fstrim to discard unused blocks on a mounted filesystem
        (df shows /var only 67% used, while lvs reports the thin LV 'var' at Data% 100.00)
        # fstrim /var
    OR
    [B] Extend the thin LV for 'var' if you want to give more than 10g to var
        [a] Run the command below on thin LV 'var'
            # lvextend -L+100M rhel/var
        [b] Extend the 'xfs' file system on '/var' using
            https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/storage_administration_guide/xfsgrow
        [c] Check status
            # lvs -ao+devices

[3] If the "M" attribute is still seen after extending the pool00 metadata:
    pool00 rhel twi-cotzM- 91.55g 55.14 100.00 pool00_tdata(0)
    # lvchange --refresh rhel/pool00
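
Since the pool went read-only once its metadata reached 100.00, a periodic check of Data% and Meta% could flag the condition before it recurs. The following is only a sketch under stated assumptions: the function name check_thin_usage and the default threshold of 80 are made up for illustration, and the input is assumed to be comma-separated lines as printed by lvs with the lv_name, vg_name, data_percent and metadata_percent report fields.

```shell
#!/bin/sh
# Sketch: warn about thin pools / thin LVs whose usage crosses a threshold.
# Expects lines of "lv,vg,data_percent,metadata_percent" on stdin, e.g. from:
#   lvs --noheadings --separator , -o lv_name,vg_name,data_percent,metadata_percent
# check_thin_usage and the default threshold (80) are illustrative assumptions.
check_thin_usage() {
    threshold="${1:-80}"
    awk -F',' -v t="$threshold" '
        {
            gsub(/[ \t]+/, "")        # drop the padding lvs adds; fields are re-split
            if ($4 != "" && $4 + 0 >= t + 0)
                printf "WARN %s/%s metadata %s%%\n", $2, $1, $4
            if ($3 != "" && $3 + 0 >= t + 0)
                printf "WARN %s/%s data %s%%\n", $2, $1, $3
        }'
}
```

One possible use is to pipe the lvs command from the comment into check_thin_usage 80 from a cron job and mail any output. Non-thin LVs report empty percentages and are skipped.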