
[CodeGen] Really renumber slot indexes before register allocation #67038

Merged: jayfoad merged 1 commit into llvm:main from pack-slotindexes on Oct 9, 2023

Conversation

@jayfoad (Contributor) commented Sep 21, 2023

PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.

@llvmbot (Collaborator) commented Sep 21, 2023

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-msp430
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-aarch64

Changes

PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.


Patch is 38.80 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67038.diff

693 Files Affected:

  • (modified) llvm/lib/CodeGen/SlotIndexes.cpp (+21-2)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-outline_atomics.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll (+360-360)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-outline_atomics.ll (+265-265)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-rcpc.ll (+360-360)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-rcpc3.ll (+360-360)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-v8_1a.ll (+15-20)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-v8a.ll (+360-360)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-load-outline_atomics.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-load-rcpc.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-load-v8a.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-lse2.ll (+330-330)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-outline_atomics.ll (+235-235)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-rcpc.ll (+330-330)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-rcpc3.ll (+330-330)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-v8_1a.ll (+15-20)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-v8a.ll (+330-330)
  • (modified) llvm/test/CodeGen/AArch64/active_lane_mask.ll (+52-52)
  • (modified) llvm/test/CodeGen/AArch64/arm64-atomic-128.ll (+10-10)
  • (modified) llvm/test/CodeGen/AArch64/arm64-instruction-mix-remarks.ll (+7-8)
  • (modified) llvm/test/CodeGen/AArch64/arm64-neon-mul-div.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/arm64-shrink-wrapping.ll (+32-32)
  • (modified) llvm/test/CodeGen/AArch64/atomic-ops-msvc.ll (+10-11)
  • (modified) llvm/test/CodeGen/AArch64/atomic-ops.ll (+3-4)
  • (modified) llvm/test/CodeGen/AArch64/atomicrmw-uinc-udec-wrap.ll (+11-11)
  • (modified) llvm/test/CodeGen/AArch64/atomicrmw-xchg-fp.ll (+15-15)
  • (modified) llvm/test/CodeGen/AArch64/bfis-in-loop.ll (+40-40)
  • (modified) llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions-predicated-scalable.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/complex-deinterleaving-uniform-cases.ll (+26-26)
  • (modified) llvm/test/CodeGen/AArch64/extbinopload.ll (+76-76)
  • (modified) llvm/test/CodeGen/AArch64/faddp-half.ll (+5-5)
  • (modified) llvm/test/CodeGen/AArch64/faddsub.ll (+154-154)
  • (modified) llvm/test/CodeGen/AArch64/fcvt_combine.ll (+42-42)
  • (modified) llvm/test/CodeGen/AArch64/fdiv.ll (+46-46)
  • (modified) llvm/test/CodeGen/AArch64/fminimummaximum.ll (+154-154)
  • (modified) llvm/test/CodeGen/AArch64/fminmax.ll (+154-154)
  • (modified) llvm/test/CodeGen/AArch64/fpow.ll (+217-219)
  • (modified) llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll (+286-286)
  • (modified) llvm/test/CodeGen/AArch64/fptoui-sat-vector.ll (+71-71)
  • (modified) llvm/test/CodeGen/AArch64/frem.ll (+217-219)
  • (modified) llvm/test/CodeGen/AArch64/llvm.exp10.ll (+20-23)
  • (modified) llvm/test/CodeGen/AArch64/neon-dotreduce.ll (+788-788)
  • (modified) llvm/test/CodeGen/AArch64/neon-extadd.ll (+63-63)
  • (modified) llvm/test/CodeGen/AArch64/ragreedy-csr.ll (+36-36)
  • (modified) llvm/test/CodeGen/AArch64/ragreedy-local-interval-cost.ll (+121-127)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll (+18-18)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-length-shuffles.ll (+26-26)
  • (modified) llvm/test/CodeGen/AArch64/sve-int-arith.ll (+7-7)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-to-int.ll (+60-60)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-div.ll (+74-74)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll (+100-106)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-zip-uzp-trn.ll (+116-116)
  • (modified) llvm/test/CodeGen/AArch64/swifterror.ll (+20-19)
  • (modified) llvm/test/CodeGen/AArch64/vec-libcalls.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/vector-fcopysign.ll (+27-27)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_optimizations_mul_one.ll (+50-50)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll (+317-224)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll (+85-85)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll (+51-51)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.div.fmas.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll (+24-18)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll (+204-204)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i64.ll (+1552-1555)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll (+199-199)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i64.ll (+1067-1072)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i64.ll (+413-413)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll (+296-297)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll (+367-367)
  • (modified) llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll (+163-163)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll (+59-59)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_buffer.ll (+78-78)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_raw_buffer.ll (+78-78)
  • (modified) llvm/test/CodeGen/AMDGPU/branch-folding-implicit-def-subreg.ll (+229-228)
  • (modified) llvm/test/CodeGen/AMDGPU/bswap.ll (+44-44)
  • (modified) llvm/test/CodeGen/AMDGPU/bug-sdag-emitcopyfromreg.ll (+7-7)
  • (modified) llvm/test/CodeGen/AMDGPU/bypass-div.ll (+103-103)
  • (modified) llvm/test/CodeGen/AMDGPU/calling-conventions.ll (+121-121)
  • (modified) llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll (+19-19)
  • (modified) llvm/test/CodeGen/AMDGPU/ctpop16.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/cttz_zero_undef.ll (+10-10)
  • (modified) llvm/test/CodeGen/AMDGPU/extract-subvector-16bit.ll (+87-87)
  • (modified) llvm/test/CodeGen/AMDGPU/fcanonicalize.f16.ll (+56-57)
  • (modified) llvm/test/CodeGen/AMDGPU/fp_to_sint.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/frem.ll (+87-87)
  • (modified) llvm/test/CodeGen/AMDGPU/fsqrt.f32.ll (+24-24)
  • (modified) llvm/test/CodeGen/AMDGPU/gfx-callable-return-types.ll (+297-291)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i32_system.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i64_system.ll (+90-90)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fadd.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fsub.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/half.ll (+90-90)
  • (modified) llvm/test/CodeGen/AMDGPU/idot8s.ll (+64-64)
  • (modified) llvm/test/CodeGen/AMDGPU/insert-delay-alu-bug.ll (+73-70)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll (+95-95)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.ll (+136-136)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll (+302-302)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.exp2.ll (+13-13)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.log.ll (+20-20)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.log10.ll (+20-20)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.log2.ll (+13-13)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.round.f64.ll (+154-154)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i1.ll (+1799-1801)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i16.ll (+1172-1187)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i32.ll (+161-159)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i8.ll (+1818-1832)
  • (modified) llvm/test/CodeGen/AMDGPU/load-global-i16.ll (+1579-1583)
  • (modified) llvm/test/CodeGen/AMDGPU/local-atomics-fp.ll (+28-28)
  • (modified) llvm/test/CodeGen/AMDGPU/move-to-valu-atomicrmw-system.ll (+9-10)
  • (modified) llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands-non-ptr-intrinsics.ll (+25-25)
  • (modified) llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.ll (+25-25)
  • (modified) llvm/test/CodeGen/AMDGPU/mul.ll (+46-46)
  • (modified) llvm/test/CodeGen/AMDGPU/preserve-wwm-copy-dst-reg.ll (+104-130)
  • (modified) llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll (+209-209)
  • (modified) llvm/test/CodeGen/AMDGPU/rsq.f32.ll (+70-70)
  • (modified) llvm/test/CodeGen/AMDGPU/scc-clobbered-sgpr-to-vmem-spill.ll (+198-196)
  • (modified) llvm/test/CodeGen/AMDGPU/sdiv.ll (+168-168)
  • (modified) llvm/test/CodeGen/AMDGPU/sdiv64.ll (+131-131)
  • (modified) llvm/test/CodeGen/AMDGPU/select.f16.ll (+15-15)
  • (modified) llvm/test/CodeGen/AMDGPU/shl.ll (+11-11)
  • (modified) llvm/test/CodeGen/AMDGPU/si-unify-exit-return-unreachable.ll (+15-14)
  • (modified) llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll (+997-1012)
  • (modified) llvm/test/CodeGen/AMDGPU/sra.ll (+22-22)
  • (modified) llvm/test/CodeGen/AMDGPU/srem64.ll (+142-142)
  • (modified) llvm/test/CodeGen/AMDGPU/srl.ll (+11-11)
  • (modified) llvm/test/CodeGen/AMDGPU/swdev373493.ll (+9-9)
  • (modified) llvm/test/CodeGen/AMDGPU/swdev380865.ll (+4-2)
  • (modified) llvm/test/CodeGen/AMDGPU/udiv.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/udiv64.ll (+111-111)
  • (modified) llvm/test/CodeGen/AMDGPU/urem64.ll (+38-38)
  • (modified) llvm/test/CodeGen/AMDGPU/vni8-across-blocks.ll (+1215-1217)
  • (modified) llvm/test/CodeGen/AMDGPU/wave32.ll (+7-7)
  • (modified) llvm/test/CodeGen/AMDGPU/while-break.ll (+10-10)
  • (modified) llvm/test/CodeGen/AMDGPU/wqm.ll (+31-31)
  • (modified) llvm/test/CodeGen/ARM/ParallelDSP/complex_dot_prod.ll (+92-91)
  • (modified) llvm/test/CodeGen/ARM/ParallelDSP/multi-use-loads.ll (+105-104)
  • (modified) llvm/test/CodeGen/ARM/addsubo-legalization.ll (+8-8)
  • (modified) llvm/test/CodeGen/ARM/aes-erratum-fix.ll (+452-476)
  • (modified) llvm/test/CodeGen/ARM/atomicrmw-uinc-udec-wrap.ll (+32-34)
  • (modified) llvm/test/CodeGen/ARM/bf16-shuffle.ll (+31-31)
  • (modified) llvm/test/CodeGen/ARM/big-endian-neon-fp16-bitconv.ll (+7-6)
  • (modified) llvm/test/CodeGen/ARM/cttz.ll (+54-54)
  • (modified) llvm/test/CodeGen/ARM/fadd-select-fneg-combine.ll (+6-6)
  • (modified) llvm/test/CodeGen/ARM/fpclamptosat.ll (+116-122)
  • (modified) llvm/test/CodeGen/ARM/fpclamptosat_vec.ll (+324-324)
  • (modified) llvm/test/CodeGen/ARM/fptoi-sat-store.ll (+29-26)
  • (modified) llvm/test/CodeGen/ARM/fptosi-sat-scalar.ll (+174-177)
  • (modified) llvm/test/CodeGen/ARM/fptoui-sat-scalar.ll (+65-65)
  • (modified) llvm/test/CodeGen/ARM/funnel-shift.ll (+74-74)
  • (modified) llvm/test/CodeGen/ARM/machine-cse-cmp.ll (+6-6)
  • (modified) llvm/test/CodeGen/ARM/minnum-maxnum-intrinsics.ll (+26-26)
  • (modified) llvm/test/CodeGen/ARM/neon-copy.ll (+13-13)
  • (modified) llvm/test/CodeGen/ARM/select_const.ll (+9-9)
  • (modified) llvm/test/CodeGen/ARM/srem-seteq-illegal-types.ll (+40-40)
  • (modified) llvm/test/CodeGen/ARM/swifterror.ll (+8-8)
  • (modified) llvm/test/CodeGen/ARM/umulo-128-legalisation-lowering.ll (+115-121)
  • (modified) llvm/test/CodeGen/ARM/usub_sat.ll (+7-6)
  • (modified) llvm/test/CodeGen/ARM/usub_sat_plus.ll (+7-6)
  • (modified) llvm/test/CodeGen/ARM/vecreduce-fmax-legalization-soft-float.ll (+9-9)
  • (modified) llvm/test/CodeGen/ARM/vecreduce-fmin-legalization-soft-float.ll (+9-9)
  • (modified) llvm/test/CodeGen/AVR/hardware-mul.ll (+9-9)
  • (modified) llvm/test/CodeGen/Hexagon/atomicrmw-uinc-udec-wrap.ll (+6-6)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/fp-to-int.ll (+104-106)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/int-to-fp.ll (+535-533)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/isel-truncate.ll (+4-4)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/vmpy-parts.ll (+17-17)
  • (modified) llvm/test/CodeGen/Hexagon/signext-inreg.ll (+59-59)
  • (modified) llvm/test/CodeGen/MSP430/selectcc.ll (+12-9)
  • (modified) llvm/test/CodeGen/Mips/llvm-ir/ashr.ll (+9-9)
  • (modified) llvm/test/CodeGen/Mips/llvm-ir/lshr.ll (+9-9)
  • (modified) llvm/test/CodeGen/PowerPC/all-atomics.ll (+104-103)
  • (modified) llvm/test/CodeGen/PowerPC/csr-split.ll (+6-6)
  • (modified) llvm/test/CodeGen/PowerPC/inc-of-add.ll (+14-16)
  • (modified) llvm/test/CodeGen/PowerPC/ldst-16-byte.mir (+5-5)
  • (modified) llvm/test/CodeGen/PowerPC/licm-tocReg.ll (+17-17)
  • (modified) llvm/test/CodeGen/PowerPC/loop-instr-form-prepare.ll (+33-33)
  • (modified) llvm/test/CodeGen/PowerPC/loop-instr-prep-non-const-increasement.ll (+9-9)
  • (modified) llvm/test/CodeGen/PowerPC/more-dq-form-prepare.ll (+158-159)
  • (modified) llvm/test/CodeGen/PowerPC/no-ctr-loop-if-exit-in-nested-loop.ll (+8-8)
  • (modified) llvm/test/CodeGen/PowerPC/p10-handle-split-promote-vec.ll (+88-88)
  • (modified) llvm/test/CodeGen/PowerPC/p10-spill-creq.ll (+17-17)
  • (modified) llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll (+85-85)
  • (modified) llvm/test/CodeGen/PowerPC/sat-add.ll (+12-12)
  • (modified) llvm/test/CodeGen/PowerPC/sms-phi-3.ll (+12-12)
  • (modified) llvm/test/CodeGen/PowerPC/srem-vector-lkk.ll (+23-23)
  • (modified) llvm/test/CodeGen/PowerPC/sub-of-not.ll (+14-16)
  • (modified) llvm/test/CodeGen/PowerPC/umulo-128-legalisation-lowering.ll (+37-37)
  • (modified) llvm/test/CodeGen/PowerPC/urem-vector-lkk.ll (+10-10)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp32_to_i16_elts.ll (+188-188)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp32_to_i8_elts.ll (+102-102)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp64_to_i16_elts.ll (+118-118)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp64_to_i8_elts.ll (+122-122)
  • (modified) llvm/test/CodeGen/RISCV/add-before-shl.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/bfloat-convert.ll (+30-30)
  • (modified) llvm/test/CodeGen/RISCV/bfloat-select-fcmp.ll (+42-42)
  • (modified) llvm/test/CodeGen/RISCV/branch-relaxation.ll (+104-108)
  • (modified) llvm/test/CodeGen/RISCV/callee-saved-gprs.ll (+96-96)
  • (modified) llvm/test/CodeGen/RISCV/double-convert.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/double-round-conv-sat.ll (+105-105)
  • (modified) llvm/test/CodeGen/RISCV/early-clobber-tied-def-subreg-liveness.ll (+7-7)
  • (modified) llvm/test/CodeGen/RISCV/float-convert.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/float-round-conv-sat.ll (+35-35)
  • (modified) llvm/test/CodeGen/RISCV/fmax-fmin.ll (+36-36)
  • (modified) llvm/test/CodeGen/RISCV/fpclamptosat.ll (+200-200)
  • (modified) llvm/test/CodeGen/RISCV/half-convert.ll (+60-60)
  • (modified) llvm/test/CodeGen/RISCV/half-round-conv-sat.ll (+70-70)
  • (modified) llvm/test/CodeGen/RISCV/half-select-fcmp.ll (+42-42)
  • (modified) llvm/test/CodeGen/RISCV/min-max.ll (+10-6)
  • (modified) llvm/test/CodeGen/RISCV/mul.ll (+28-28)
  • (modified) llvm/test/CodeGen/RISCV/overflow-intrinsics.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/push-pop-popret.ll (+144-144)
  • (modified) llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll (+34-41)
  • (modified) llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll (+21-32)
  • (modified) llvm/test/CodeGen/RISCV/rvv/ctpop-vp.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitreverse-vp.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bswap-vp.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ceil-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ctlz-vp.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ctpop-vp.ll (+82-31)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-cttz-vp.ll (+90-90)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-floor-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-interleave.ll (+28-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-explodevector.ll (+83-83)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-interleave.ll (+28-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll (+298-350)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll (+858-858)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-scatter.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-store-fp.ll (+8-48)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-store-int.ll (+19-72)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-nearbyint-vp.ll (+13-29)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-int.ll (+28-24)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-rint-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-round-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-roundeven-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-roundtozero-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-trunc-vp.ll (+64-64)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll (+31-31)
  • (modified) llvm/test/CodeGen/RISCV/rvv/floor-vp.ll (+21-32)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fpclamptosat_vec.ll (+244-244)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fshr-fshl-vp.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/nearbyint-vp.ll (+16-51)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rint-vp.ll (+79-59)
  • (modified) llvm/test/CodeGen/RISCV/rvv/round-vp.ll (+79-59)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundeven-vp.ll (+79-59)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundtozero-vp.ll (+79-59)
  • (modified) llvm/test/CodeGen/RISCV/rvv/shuffle-reverse.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/sink-splat-operands.ll (+310-310)
  • (modified) llvm/test/CodeGen/RISCV/rvv/splat-vector-split-i64-vl-sdnode.ll (+17-17)
  • (modified) llvm/test/CodeGen/RISCV/rvv/strided-vpload.ll (+76-76)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-load.ll (+31-15)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave.ll (+134-80)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave-store.ll (+13-17)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll (+72-48)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll (+23-13)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmadd-sdnode.ll (+31-40)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmuladd-vp.ll (+23-13)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfptrunc-vp.ll (+29-31)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfwnmacc-vp.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfwnmsac-vp.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vreductions-fp-vp.ll (+44-44)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vreductions-int-vp.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll (+9-9)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vtrunc-vp.ll (+24-12)
  • (modified) llvm/test/CodeGen/RISCV/shifts.ll (+47-47)
  • (modified) llvm/test/CodeGen/RISCV/srem-vector-lkk.ll (+90-90)
  • (modified) llvm/test/CodeGen/RISCV/stack-store-check.ll (+238-237)
  • (modified) llvm/test/CodeGen/RISCV/umulo-128-legalisation-lowering.ll (+24-24)
  • (modified) llvm/test/CodeGen/RISCV/urem-vector-lkk.ll (+90-90)
  • (modified) llvm/test/CodeGen/RISCV/wide-scalar-shift-by-byte-multiple-legalization.ll (+328-324)
  • (modified) llvm/test/CodeGen/RISCV/wide-scalar-shift-legalization.ll (+595-595)
  • (modified) llvm/test/CodeGen/SPARC/smulo-128-legalisation-lowering.ll (+35-35)
  • (modified) llvm/test/CodeGen/SPARC/umulo-128-legalisation-lowering.ll (+54-54)
  • (modified) llvm/test/CodeGen/SystemZ/inline-asm-i128.ll (+13-10)
  • (modified) llvm/test/CodeGen/SystemZ/int-uadd-01.ll (+14-14)
  • (modified) llvm/test/CodeGen/SystemZ/int-uadd-02.ll (+14-14)
  • (modified) llvm/test/CodeGen/SystemZ/int-usub-01.ll (+12-12)
diff --git a/llvm/lib/CodeGen/SlotIndexes.cpp b/llvm/lib/CodeGen/SlotIndexes.cpp
index 65726f06dedb473..f0cb187a2574256 100644
--- a/llvm/lib/CodeGen/SlotIndexes.cpp
+++ b/llvm/lib/CodeGen/SlotIndexes.cpp
@@ -238,8 +238,27 @@ void SlotIndexes::repairIndexesInRange(MachineBasicBlock *MBB,
 }
 
 void SlotIndexes::packIndexes() {
-  for (auto [Index, Entry] : enumerate(indexList))
-    Entry.setIndex(Index * SlotIndex::InstrDist);
+  unsigned Index = 0;
+  // Iterate over basic blocks in slot index order.
+  for (MachineBasicBlock *MBB : make_second_range(idx2MBBMap)) {
+    // Update entries for each instruction in the block and the dummy entry for
+    // the end of the block.
+    auto [MBBStartIdx, MBBEndIdx] = MBBRanges[MBB->getNumber()];
+    for (auto I = MBBStartIdx.listEntry()->getIterator(),
+              E = MBBEndIdx.listEntry()->getIterator();
+         I++ != E;) {
+      if (I == E || I->getInstr()) {
+        Index += SlotIndex::InstrDist;
+        I->setIndex(Index);
+      } else {
+        // LiveIntervals may still refer to entries for instructions that have
+        // been erased. We have to update these entries but we don't want them
+        // to affect the rest of the slot numbering, so set them to halfway
+        // between the neighboring real instruction indexes.
+        I->setIndex(Index + SlotIndex::InstrDist / 2);
+      }
+    }
+  }
 }
 
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-outline_atomics.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-outline_atomics.ll
index fb4bef33d9b4ffd..348528f02d93217 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-outline_atomics.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-outline_atomics.ll
@@ -236,8 +236,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -251,8 +251,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -266,8 +266,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -281,8 +281,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -296,8 +296,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -311,8 +311,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire_const(ptr readonly %ptr)
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -326,8 +326,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst(ptr %ptr) {
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
@@ -341,8 +341,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst_const(ptr readonly %ptr)
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
index 373b040ebec65d1..c5c03cbb0763119 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
@@ -236,8 +236,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -251,8 +251,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -266,8 +266,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -281,8 +281,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -296,8 +296,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -311,8 +311,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire_const(ptr readonly %ptr)
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -326,8 +326,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst(ptr %ptr) {
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
@@ -341,8 +341,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst_const(ptr readonly %ptr)
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
index 045e080983d5f89..0368ec909e53639 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
@@ -236,8 +236,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -251,8 +251,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -266,8 +266,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -281,8 +281,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -296,8 +296,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -311,8 +311,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire_const(ptr readonly %ptr)
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -326,8 +326,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst(ptr %ptr) {
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
@@ -341,8 +341,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst_const(ptr readonly %ptr)
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll
index 0c52a8a683e3a06..55d48f1bd6226b1 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll
@@ -156,8 +156,8 @@ define dso_local i32 @atomicrmw_xchg_i32_aligned_monotonic(ptr %ptr, i32 %value)
 ; -O0:    subs w8, w9, w8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i32_aligned_monotonic:
-; -O1:    ldxr w0, [x8]
-; -O1:    stxr w9, w1, [x8]
+; -O1:    ldxr w8, [x0]
+; -O1:    stxr w9, w1, [x0]
     %r = atomicrmw xchg ptr %ptr, i32 %value monotonic, align 4
     ret i32 %r
 }
@@ -170,8 +170,8 @@ define dso_local i32 @atomicrmw_xchg_i32_aligned_acquire(ptr %ptr, i32 %value) {
 ; -O0:    subs w8, w9, w8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i32_aligned_acquire:
-; -O1:    ldaxr w0, [x8]
-; -O1:    stxr w9, w1, [x8]
+; -O1:    ldaxr w8, [x0]
+; -O1:    stxr w9, w1, [x0]
     %r = atomicrmw xchg ptr %ptr, i32 %value acquire, align 4
     ret i32 %r
 }
@@ -184,8 +184,8 @@ define dso_local i32 @atomicrmw_xchg_i32_aligned_release(ptr %ptr, i32 %value) {
 ; -O0:    subs w8, w9, w8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i32_aligned_release:
-; -O1:    ldxr w0, [x8]
-; -O1:    stlxr w9, w1, [x8]
+; -O1:    ldxr w8, [x0]
+; -O1:    stlxr w9, w1, [x0]
     %r = atomicrmw xchg ptr %ptr, i32 %value release, align 4
     ret i32 %r
 }
@@ -198,8 +198,8 @@ define dso_local i32 @atomicrmw_xchg_i32_aligned_acq_rel(ptr %ptr, i32 %value) {
 ; -O0:    subs w8, w9, w8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i32_aligned_acq_rel:
-; -O1:    ldaxr w0, [x8]
-; -O1:    stlxr w9, w1, [x8]
+; -O1:    ldaxr w8, [x0]
+; -O1:    stlxr w9, w1, [x0]
     %r = atomicrmw xchg ptr %ptr, i32 %value acq_rel, align 4
     ret i32 %r
 }
@@ -212,8 +212,8 @@ define dso_local i32 @atomicrmw_xchg_i32_aligned_seq_cst(ptr %ptr, i32 %value) {
 ; -O0:    subs w8, w9, w8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i32_aligned_seq_cst:
-; -O1:    ldaxr w0, [x8]
-; -O1:    stlxr w9, w1, [x8]
+; -O1:    ldaxr w8, [x0]
+; -O1:    stlxr w9, w1, [x0]
     %r = atomicrmw xchg ptr %ptr, i32 %value seq_cst, align 4
     ret i32 %r
 }
@@ -226,8 +226,8 @@ define dso_local i64 @atomicrmw_xchg_i64_aligned_monotonic(ptr %ptr, i64 %value)
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i64_aligned_monotonic:
-; -O1:    ldxr x0, [x8]
-; -O1:    stxr w9, x1, [x8]
+; -O1:    ldxr x8, [x0]
+; -O1:    stxr w9, x1, [x0]
     %r = atomicrmw xchg ptr %ptr, i64 %value monotonic, align 8
     ret i64 %r
 }
@@ -240,8 +240,8 @@ define dso_local i64 @atomicrmw_xchg_i64_aligned_acquire(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i64_aligned_acquire:
-; -O1:    ldaxr x0, [x8]
-; -O1:    stxr w9, x1, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    stxr w9, x1, [x0]
     %r = atomicrmw xchg ptr %ptr, i64 %value acquire, align 8
     ret i64 %r
 }
@@ -254,8 +254,8 @@ define dso_local i64 @atomicrmw_xchg_i64_aligned_release(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i64_aligned_release:
-; -O1:    ldxr x0, [x8]
-; -O1:    stlxr w9, x1, [x8]
+; -O1:    ldxr x8, [x0]
+; -O1:    stlxr w9, x1, [x0]
     %r = atomicrmw xchg ptr %ptr, i64 %value release, align 8
     ret i64 %r
 }
@@ -268,8 +268,8 @@ define dso_local i64 @atomicrmw_xchg_i64_aligned_acq_rel(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i64_aligned_acq_rel:
-; -O1:    ldaxr x0, [x8]
-; -O1:    stlxr w9, x1, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    stlxr w9, x1, [x0]
     %r = atomicrmw xchg ptr %ptr, i64 %value acq_rel, align 8
     ret i64 %r
 }
@@ -282,8 +282,8 @@ define dso_local i64 @atomicrmw_xchg_i64_aligned_seq_cst(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i64_aligned_seq_cst:
-; -O1:    ldaxr x0, [x8]
-; -O1:    stlxr w9, x1, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    stlxr w9, x1, [x0]
     %r = atomicrmw xchg ptr %ptr, i64 %value seq_cst, align 8
     ret i64 %r
 }
@@ -852,9 +852,9 @@ define dso_local i64 @atomicrmw_add_i64_aligned_monotonic(ptr %ptr, i64 %value)
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_add_i64_aligned_monotonic:
-; -O1:    ldxr x0, [x8]
-; -O1:    add x9, x0, x1
-; -O1:    stxr w10, x9, [x8]
+; -O1:    ldxr x8, [x0]
+; -O1:    add x9, x8, x1
+; -O1:    stxr w10, x9, [x0]
     %r = atomicrmw add ptr %ptr, i64 %value monotonic, align 8
     ret i64 %r
 }
@@ -868,9 +868,9 @@ define dso_local i64 @atomicrmw_add_i64_aligned_acquire(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_add_i64_aligned_acquire:
-; -O1:    ldaxr x0, [x8]
-; -O1:    add x9, x0, x1
-; -O1:    stxr w10, x9, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    add x9, x8, x1
+; -O1:    stxr w10, x9, [x0]
     %r = atomicrmw add ptr %ptr, i64 %value acquire, align 8
     ret i64 %r
 }
@@ -884,9 +884,9 @@ define dso_local i64 @atomicrmw_add_i64_aligned_release(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_add_i64_aligned_release:
-; -O1:    ldxr x0, [x8]
-; -O1:    add x9, x0, x1
-; -O1:    stlxr w10, x9, [x8]
+; -O1:    ldxr x8, [x0]
+; -O1:    add x9, x8, x1
+; -O1:    stlxr w10, x9, [x0]
     %r = atomicrmw add ptr %ptr, i64 %value release, align 8
     ret i64 %r
 }
@@ -900,9 +900,9 @@ define dso_local i64 @atomicrmw_add_i64_aligned_acq_rel(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_add_i64_aligned_acq_rel:
-; -O1:    ldaxr x0, [x8]
-; -O1:    add x9, x0, x1
-; -O1:    stlxr w10, x9, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    add x9, x8, x1
+; -O1:    stlxr w10, x9, [x0]
     %r = atomicrmw add ptr %ptr, i64 %value acq_rel, align 8
     ret i64 %r
 }
@@ -916,9 +916,9 @@ define dso_local i64 @atomicrmw_add_i64_aligned_seq_cst(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_add_i64_aligned_seq_cst:
-; -O1:    ldaxr x0, [x8]
-; -O1:    add x9, x0, x1
-; -O1:    stlxr w10, x9, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    add x9, x8, x1
+; -O1:    stlxr w10, x9, [x0]
     %r = atomicrmw add ptr %ptr, i64 %value seq_cst, align 8
     ret i64 %r
 }
@@ -939,9 +939,9 @@ define dso_local i128 @atomicrmw_add_i128_aligned_monotonic(ptr %ptr, i128 %valu
 ; -O0:    subs x8, x8, #0
 ;
 ; -O1-LABEL: atomicrmw_add_i128_aligned_monotonic:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    adds x9, x0, x2
-; -O1:    stxp w11, x9, x10, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    adds x9, x8, x2
+; -O1:    stxp w11, x9, x10, [x0]
     %r = atomicrmw add ptr %ptr, i128 %value monotonic, align 16
     ret i128 %r
 }
@@ -962,9 +962,9 @@ define dso_local i128 @atomicrmw_add_i128_aligned_acquire(ptr %ptr, i128 %value)
 ; -O0:    subs x8, x8, #0
 ;
 ; -O1-LABEL: atomicrmw_add_i128_aligned_acquire:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    adds x9, x0, x2
-; -O1:    stxp w11, x9, x10, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    adds x9, x8, x2
+; -O1:    stxp w11, x9, x10, [x0]
     %r = atomicrmw add ptr %ptr, i128 %value acquire, align 16
     ret i128 %r
 }
@@ -985,9 +985,9 @@ define dso_local i128 @atomicrmw_add_i128_aligned_release(ptr %ptr, i128 %value)
 ; -O0:    subs x8, x8, #0
 ;
 ; -O1-LABEL: atomicrmw_add_i128_aligned_release:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    adds x9, x0, x2
-; -O1:    stlxp w11, x9, x10, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    adds x9, x8, x2
+; -O1:    stlxp w11, x9, x10, [x0]
     %r = atomicrmw add ptr %ptr, i128 %value release, align 16
     ret i128 %r
 }
@@ -1008,9 +1008,9 @@ define dso_local i128 @atomicrmw_add_i128_aligned_acq_rel(ptr %ptr, i128 %value)
 ; -O0:    subs x8, x8, #0
 ;
 ; -O1-LABEL: atomicrmw_add_i128_aligned_acq_rel:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    adds x9, x0, x2
-; -O1:    stlxp w11, x9, x10, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    adds x9, x8, x2
+; -O1:    stlxp w11, x9, x10, [x0]
     %r = atomicrmw add ptr %ptr, i128 %value acq_rel, align 16
     ret i128 %r
 }
@@ -1031,9 +1031,9 @@ define dso_local i128 @atomicrmw_add_i128_aligned_seq_cst(ptr %ptr, i128 %value)
 ; -O0:    subs x8, x8, #0
 ;
 ; -O1-LABEL: atomicrmw_add_i128_aligned_seq_cst:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    adds x9, x0, x2
-; -O1:    stlxp w11, x9, x10, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    adds x9, x8, x2
+; -O1:    stlxp w11, x9, x10, [x0]
     %r = atomicrmw add ptr %ptr, i128 %value seq_cst, align 16
     ret i128 %r
 }
@@ -1632,9 +1632,9 @@ define dso_local i64 @atomicrmw_sub_i64_aligned_monotonic(ptr %ptr, i64 %value)
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_sub_i64_aligned_monotonic:
-; -O1:    ldxr x0, [x8]
-; -O1:    sub x9, x0, x1
-; -O1:    stxr w10, x9, [x8]
+; -O1:    ldxr x8, [x0]
+; -O1:    sub x9, x8, x1
+; -O1:    stxr w10, x9, [x0]
     %r = atomicr...
[truncated]
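
For intuition, here is a minimal, self-contained C++ sketch of the renumbering scheme in the SlotIndexes.cpp hunk above. The Entry struct, hasInstr flag, and InstrDist constant are simplified stand-ins, not LLVM's real API: only entries that still have an instruction advance the counter, while entries for erased instructions are parked halfway into the following gap so they no longer stretch the apparent distance between live ranges.

#include <cstdio>
#include <vector>

// Simplified stand-in for a SlotIndexes list entry (hypothetical, not LLVM's API).
struct Entry {
  bool hasInstr; // false if the instruction was erased but its entry remains
  unsigned index;
};

constexpr unsigned InstrDist = 16; // spacing between consecutive real instructions

// Only live instructions advance the counter; stale entries get the midpoint
// of the gap after the previous real instruction.
void packIndexes(std::vector<Entry> &list) {
  unsigned index = 0;
  for (Entry &e : list) {
    if (e.hasInstr) {
      index += InstrDist;
      e.index = index;
    } else {
      e.index = index + InstrDist / 2;
    }
  }
}

int main() {
  // Two real instructions with two erased entries between them.
  std::vector<Entry> list = {{true, 0}, {false, 0}, {false, 0}, {true, 0}};
  packIndexes(list);
  for (const Entry &e : list)
    std::printf("%u%s ", e.index, e.hasInstr ? "" : "*");
  std::printf("\n"); // prints: 16 24* 24* 32
  // The two real instructions stay exactly InstrDist apart no matter how
  // many erased entries (*) sit between them.
  return 0;
}

Unlike the real patch, this sketch ignores the per-block dummy end entries; it is only meant to show why stale entries are given a midpoint index instead of a slot of their own.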

jayfoad requested review from perlfu, qcolombet and a team September 21, 2023 16:53
@jayfoad (Contributor, Author) commented Sep 21, 2023

TODO: update 33 failing tests with manual checks.

@asl (Collaborator) commented Sep 21, 2023

MSP430 test changes seem to be fine.

jayfoad marked this pull request as draft September 22, 2023 08:54

@qcolombet (Collaborator) left a comment

LGTM

llvm/lib/CodeGen/SlotIndexes.cpp: two review comments (outdated, resolved)
jayfoad added a commit that referenced this pull request Sep 28, 2023
This makes some tests robust against minor codegen differences
that will be caused by PR #67038.
@jayfoad (Contributor, Author) commented Sep 28, 2023

Rebased and fixed almost all lit tests. (Sorry for the force push but quite a few lit tests conflicted with other upstream changes since this PR was opened.)

I might need help with the last two lit test failures:

Failed Tests (2):
  LLVM :: CodeGen/Hexagon/reg-scavengebug-2.ll
  LLVM :: CodeGen/Hexagon/swp-epilog-phi7.ll

legrosbuffle pushed a commit to legrosbuffle/llvm-project that referenced this pull request Sep 29, 2023
This makes some tests robust against minor codegen differences
that will be caused by PR llvm#67038.
blackgeorge-boom added a commit to blackgeorge-boom/llvm-project that referenced this pull request Sep 29, 2023
Applies:
llvm#66334
llvm#67038

Packing the slot indexes before register allocation is useful for us
because it evens out the gaps between slots left by all the optimization
passes that run before `greedy`, which may have removed different numbers
of instructions on AArch64 and X86. Different slot gaps lead, in turn, to
slightly different register allocation in some cases.

We backport the above patches for our LLVM, with the main difference
being the absence of some convenient data structure iterators, which we
had to convert to be compatible with our ADT infrastructure.

We add the `-pack-indexes` flag to activate this.

Addresses: systems-nuts/unifico#291
@jayfoad (Contributor, Author) commented Oct 3, 2023

I might need help with the last two lit test failures:

Failed Tests (2):
  LLVM :: CodeGen/Hexagon/reg-scavengebug-2.ll
  LLVM :: CodeGen/Hexagon/swp-epilog-phi7.ll

Hi @kparzysz-quic, could you help with these please? The code changes quite a bit, so I'm not even sure if they are still testing what they were supposed to test.

@nikic (Contributor) commented Oct 4, 2023

@jayfoad I'd suggest marking those tests as XFAIL, to avoid further bitrot.

@jayfoad (Contributor, Author) commented Oct 4, 2023

@jayfoad I'd suggest marking those tests as XFAIL, to avoid further bitrot.

Done, thanks.
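
(For reference: lit marks a test as an expected failure with an XFAIL annotation next to its RUN lines. A minimal sketch, with a hypothetical RUN line:

; RUN: llc -mtriple=hexagon < %s | FileCheck %s
; XFAIL: *

The * means "expected to fail everywhere", so the bots report XFAIL rather than a failure until the checks can be regenerated.)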

@jayfoad (Contributor, Author) commented Oct 6, 2023

Any more comments? If not I will plan to merge this on Monday, based on @qcolombet's approval.

PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.
jayfoad marked this pull request as ready for review October 9, 2023 09:56
jayfoad merged commit 2501ae5 into llvm:main Oct 9, 2023
4 checks passed
jayfoad deleted the pack-slotindexes branch October 9, 2023 10:44
jayfoad added a commit that referenced this pull request Oct 9, 2023
…tion (#67038)"

This reverts commit 2501ae5.

Reverted due to various buildbot failures.
jayfoad added a commit to jayfoad/llvm-project that referenced this pull request Jan 29, 2024
Add a basic implementation of verifyAnalysis that just checks that the
analysis does not refer to any SlotIndexes for instructions that have
been deleted. This was useful for diagnosing some SlotIndexes-related
problems caused by llvm#67038.
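
As a rough sketch of that kind of check, with simplified stand-ins rather than LLVM's actual verifyAnalysis hook or SlotIndexes internals: walk the analysis's instruction-to-index map and assert that every mapped instruction is still present in the function.

#include <cassert>
#include <unordered_map>
#include <unordered_set>

struct Instr {}; // placeholder for a machine instruction

// Hypothetical mini-analysis state: instruction -> slot index.
using IndexMap = std::unordered_map<const Instr *, unsigned>;

// Fail if the analysis still holds an index for an instruction that has
// been deleted; 'live' is the set of instructions still in the function.
void verifyNoStaleIndexes(const IndexMap &indexes,
                          const std::unordered_set<const Instr *> &live) {
  for (const auto &entry : indexes)
    assert(live.count(entry.first) &&
           "slot index refers to a deleted instruction");
}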