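1. Problem description
When Kylin V10 is used as the host OS, performing a soft shutdown of a virtual machine causes the host to crash. The call trace at the time of the crash is as follows: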
[ 2188.225012] Process CPU 0/KVM (pid: 11384, stack limit = 0x000000005889c77d)
[ 2188.225671] CPU: 13 PID: 11384 Comm: CPU 0/KVM Kdump: loaded Tainted: G OE 4.19.90-52.44.v2207.fortest.ky10.aarch64 #1
[ 2188.226744] Source Version: 9de2a08e7b57c09e80a10421b6798d000d26a533
[ 2188.227412] Hardware name: GreatWall \xe6\x93\x8e\xe5\xa4\xa9DF723/N/A, BIOS KunLun BIOS V4.0 03/17/2021
[ 2188.228157] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[ 2188.228607] pc : queued_spin_lock_slowpath+0x190/0x308
[ 2188.229104] lr : update_lpi_config+0x160/0x168
[ 2188.229503] sp : ffffbeba9ff13790
[ 2188.229821] x29: ffffbeba9ff13790 x28: 0000000000000000
[ 2188.230267] x27: 0000000000000000 x26: ffffbeba9ff139c0
[ 2188.230736] x25: 0000000000000000 x24: 0000000000000000
[ 2188.231309] x23: 0000000000000001 x22: 0000000000000000
[ 2188.231791] x21: ffffc2b986a00000 x20: 00000000e87ab700
[ 2188.232309] x19: ffffc13ce87a9b80 x18: ffff8000c0024390
[ 2188.232834] x17: 0000fffe613ce498 x16: ffff00000811e570
[ 2188.233336] x15: 0000fffd00290007 x14: ffff8000cada6cf8
[ 2188.233862] x13: ffff8000cada6b38 x12: 0000000000000000
[ 2188.234363] x11: 0000000000000040 x10: ffff0000098ef8b0
[ 2188.234879] x9 : 000000000808008c x8 : 0000000000000000
[ 2188.235416] x7 : ffff4cf9f0373b48 x6 : 0000000000380000
[ 2188.235900] x5 : ffffc53d36082600 x4 : 000001a400000004
[ 2188.236359] x3 : ffff4cf9effd2620 x2 : ffffc53d36082600
[ 2188.236836] x1 : ffff4cf9effd2000 x0 : ffffc53d36082608
[ 2188.237311] Call trace:
[ 2188.237558] queued_spin_lock_slowpath+0x190/0x308
[ 2188.237980] update_lpi_config+0x160/0x168
[ 2188.238338] vgic_its_process_commands.part.11+0x898/0x9b0
[ 2188.238819] vgic_mmio_write_its_cwriter+0xa4/0xa8
[ 2188.239242] dispatch_mmio_write+0x94/0x110
[ 2188.239624] __kvm_io_bus_write.isra.27+0xa4/0x158
[ 2188.240062] kvm_io_bus_write+0x68/0x90
[ 2188.240430] io_mem_abort+0xd8/0x350
[ 2188.240758] kvm_handle_guest_abort+0x2a8/0x478
[ 2188.241269] handle_exit+0x184/0x368
[ 2188.241635] kvm_arch_vcpu_ioctl_run+0x250/0x850
[ 2188.242044] kvm_vcpu_ioctl+0x460/0x880
[ 2188.242401] do_vfs_ioctl+0xb0/0x8e8
[ 2188.242733] ksys_ioctl+0x8c/0xa0
[ 2188.243055] sys_ioctl+0x34/0xa0
[ 2188.243373] __sys_trace_return+0x0/0x4
2. Affected packages
Kylin Linux Advanced Server V10 SP3 2303 aarch64
4.19.90-52.44.v2207
Kylin Linux Advanced Server V10 SP3 2403 aarch64
4.19.90-89.18.v2401~4.19.90-89.19.v2401
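To check whether a host runs one of the kernel builds listed above, a small script can compare the running kernel release against them. The following is a minimal sketch, not a vendor tool: the function name `is_affected` is hypothetical, the range for SP3 2403 is assumed to contain only the two builds 89.18 and 89.19, and the release string is assumed to start with the build identifier followed by a dot-separated suffix, as in the log above.

```python
import platform

# Kernel builds listed as affected in this advisory.
AFFECTED_BUILDS = (
    "4.19.90-52.44.v2207",  # V10 SP3 2303
    "4.19.90-89.18.v2401",  # V10 SP3 2403
    "4.19.90-89.19.v2401",  # V10 SP3 2403 (assumes no builds between .18 and .19)
)


def is_affected(release: str, machine: str = "aarch64") -> bool:
    """Return True if the kernel release string matches a listed build.

    Hypothetical helper: matches the build identifier either exactly or as a
    prefix followed by '.', e.g. '4.19.90-52.44.v2207.ky10.aarch64'.
    """
    if machine != "aarch64":
        return False
    return any(
        release == build or release.startswith(build + ".")
        for build in AFFECTED_BUILDS
    )


if __name__ == "__main__":
    rel = platform.release()  # same value as `uname -r`
    status = "listed as affected" if is_affected(rel, platform.machine()) else "not listed as affected"
    print(rel, status)
```

A result of "not listed as affected" only means the release string does not match the builds named here; it does not prove the kernel contains the fix.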
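3. Steps to reproduce
Start a virtual machine on a host running the Kylin operating system. Shutting down the virtual machine, either through Power -> Shut Down inside the guest or with the virsh shutdown command, causes the host to crash.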
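4. Root cause analysis
The issue was introduced by ad362fe07fec ("KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache"), the upstream patch that fixes CVE-2024-26598. The patch takes an additional reference on the interrupt object irq, but that reference is dropped too early, so a later access to irq touches an invalid address. The fix drops the reference to irq at the correct point, which prevents the access to invalid memory and the resulting host crash.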
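The effect of dropping a reference too early can be illustrated with a toy model. The sketch below is not the kernel code and does not reproduce the actual patch: the class `RefCounted` and the functions `buggy_path` and `fixed_path` are hypothetical names, and a Python exception stands in for the invalid memory access that crashes the host.

```python
class RefCounted:
    """Toy stand-in for a reference-counted object such as irq (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1   # the owner's initial reference
        self.freed = False

    def get(self):
        if self.freed:
            raise RuntimeError("get on freed object: " + self.name)
        self.refcount += 1

    def put(self):
        if self.freed:
            raise RuntimeError("put on freed object: " + self.name)
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True  # models the memory being released

    def use(self):
        if self.freed:
            raise RuntimeError("use after free: " + self.name)
        return self.name


def buggy_path(irq, owner_release):
    """Drops the extra reference before the last use of irq."""
    irq.get()
    irq.put()           # released too early
    owner_release()     # another path drops the last reference, object is freed
    return irq.use()    # raises: the object is already gone


def fixed_path(irq, owner_release):
    """Holds the extra reference until irq is no longer needed."""
    irq.get()
    try:
        owner_release()  # the owner's reference goes away, ours keeps irq alive
        return irq.use()
    finally:
        irq.put()        # dropped only after the last use
```

In `fixed_path` the object stays valid for the whole access and is freed only by the final `put`, which is the ordering the fix restores.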
5. Patch and download location
Fixed by updating to a new kernel.
6. Fix and update procedure
Run the following command as root:
yum update kernel