
[CodeGen] Really renumber slot indexes before register allocation #67038

Merged: jayfoad merged 1 commit into llvm:main from pack-slotindexes on Oct 9, 2023

Conversation

@jayfoad (Contributor) commented Sep 21, 2023

PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.

@llvmbot (Collaborator) commented Sep 21, 2023

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-msp430
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-aarch64

Changes

PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.


Patch is 38.80 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67038.diff

693 Files Affected:

  • (modified) llvm/lib/CodeGen/SlotIndexes.cpp (+21-2)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-outline_atomics.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll (+360-360)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-outline_atomics.ll (+265-265)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-rcpc.ll (+360-360)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-rcpc3.ll (+360-360)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-v8_1a.ll (+15-20)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-v8a.ll (+360-360)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-load-outline_atomics.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-load-rcpc.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-load-v8a.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-lse2.ll (+330-330)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-outline_atomics.ll (+235-235)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-rcpc.ll (+330-330)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-rcpc3.ll (+330-330)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-v8_1a.ll (+15-20)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomicrmw-v8a.ll (+330-330)
  • (modified) llvm/test/CodeGen/AArch64/active_lane_mask.ll (+52-52)
  • (modified) llvm/test/CodeGen/AArch64/arm64-atomic-128.ll (+10-10)
  • (modified) llvm/test/CodeGen/AArch64/arm64-instruction-mix-remarks.ll (+7-8)
  • (modified) llvm/test/CodeGen/AArch64/arm64-neon-mul-div.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/arm64-shrink-wrapping.ll (+32-32)
  • (modified) llvm/test/CodeGen/AArch64/atomic-ops-msvc.ll (+10-11)
  • (modified) llvm/test/CodeGen/AArch64/atomic-ops.ll (+3-4)
  • (modified) llvm/test/CodeGen/AArch64/atomicrmw-uinc-udec-wrap.ll (+11-11)
  • (modified) llvm/test/CodeGen/AArch64/atomicrmw-xchg-fp.ll (+15-15)
  • (modified) llvm/test/CodeGen/AArch64/bfis-in-loop.ll (+40-40)
  • (modified) llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions-predicated-scalable.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/complex-deinterleaving-uniform-cases.ll (+26-26)
  • (modified) llvm/test/CodeGen/AArch64/extbinopload.ll (+76-76)
  • (modified) llvm/test/CodeGen/AArch64/faddp-half.ll (+5-5)
  • (modified) llvm/test/CodeGen/AArch64/faddsub.ll (+154-154)
  • (modified) llvm/test/CodeGen/AArch64/fcvt_combine.ll (+42-42)
  • (modified) llvm/test/CodeGen/AArch64/fdiv.ll (+46-46)
  • (modified) llvm/test/CodeGen/AArch64/fminimummaximum.ll (+154-154)
  • (modified) llvm/test/CodeGen/AArch64/fminmax.ll (+154-154)
  • (modified) llvm/test/CodeGen/AArch64/fpow.ll (+217-219)
  • (modified) llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll (+286-286)
  • (modified) llvm/test/CodeGen/AArch64/fptoui-sat-vector.ll (+71-71)
  • (modified) llvm/test/CodeGen/AArch64/frem.ll (+217-219)
  • (modified) llvm/test/CodeGen/AArch64/llvm.exp10.ll (+20-23)
  • (modified) llvm/test/CodeGen/AArch64/neon-dotreduce.ll (+788-788)
  • (modified) llvm/test/CodeGen/AArch64/neon-extadd.ll (+63-63)
  • (modified) llvm/test/CodeGen/AArch64/ragreedy-csr.ll (+36-36)
  • (modified) llvm/test/CodeGen/AArch64/ragreedy-local-interval-cost.ll (+121-127)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll (+18-18)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-length-shuffles.ll (+26-26)
  • (modified) llvm/test/CodeGen/AArch64/sve-int-arith.ll (+7-7)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-to-int.ll (+60-60)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-div.ll (+74-74)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll (+100-106)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-zip-uzp-trn.ll (+116-116)
  • (modified) llvm/test/CodeGen/AArch64/swifterror.ll (+20-19)
  • (modified) llvm/test/CodeGen/AArch64/vec-libcalls.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/vector-fcopysign.ll (+27-27)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_optimizations_mul_one.ll (+50-50)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll (+317-224)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll (+85-85)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll (+51-51)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.div.fmas.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll (+24-18)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/mul.ll (+204-204)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i64.ll (+1552-1555)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll (+199-199)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i64.ll (+1067-1072)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i64.ll (+413-413)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll (+296-297)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll (+367-367)
  • (modified) llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll (+163-163)
  • (modified) llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll (+59-59)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_buffer.ll (+78-78)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll (+18-18)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_raw_buffer.ll (+78-78)
  • (modified) llvm/test/CodeGen/AMDGPU/branch-folding-implicit-def-subreg.ll (+229-228)
  • (modified) llvm/test/CodeGen/AMDGPU/bswap.ll (+44-44)
  • (modified) llvm/test/CodeGen/AMDGPU/bug-sdag-emitcopyfromreg.ll (+7-7)
  • (modified) llvm/test/CodeGen/AMDGPU/bypass-div.ll (+103-103)
  • (modified) llvm/test/CodeGen/AMDGPU/calling-conventions.ll (+121-121)
  • (modified) llvm/test/CodeGen/AMDGPU/copy-illegal-type.ll (+19-19)
  • (modified) llvm/test/CodeGen/AMDGPU/ctpop16.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/cttz_zero_undef.ll (+10-10)
  • (modified) llvm/test/CodeGen/AMDGPU/extract-subvector-16bit.ll (+87-87)
  • (modified) llvm/test/CodeGen/AMDGPU/fcanonicalize.f16.ll (+56-57)
  • (modified) llvm/test/CodeGen/AMDGPU/fp_to_sint.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/frem.ll (+87-87)
  • (modified) llvm/test/CodeGen/AMDGPU/fsqrt.f32.ll (+24-24)
  • (modified) llvm/test/CodeGen/AMDGPU/gfx-callable-return-types.ll (+297-291)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i32_system.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i64_system.ll (+90-90)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fadd.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fsub.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/half.ll (+90-90)
  • (modified) llvm/test/CodeGen/AMDGPU/idot8s.ll (+64-64)
  • (modified) llvm/test/CodeGen/AMDGPU/insert-delay-alu-bug.ll (+73-70)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll (+14-14)
  • (modified) llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll (+95-95)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.ll (+136-136)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll (+302-302)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.exp2.ll (+13-13)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.log.ll (+20-20)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.log10.ll (+20-20)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.log2.ll (+13-13)
  • (modified) llvm/test/CodeGen/AMDGPU/llvm.round.f64.ll (+154-154)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i1.ll (+1799-1801)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i16.ll (+1172-1187)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i32.ll (+161-159)
  • (modified) llvm/test/CodeGen/AMDGPU/load-constant-i8.ll (+1818-1832)
  • (modified) llvm/test/CodeGen/AMDGPU/load-global-i16.ll (+1579-1583)
  • (modified) llvm/test/CodeGen/AMDGPU/local-atomics-fp.ll (+28-28)
  • (modified) llvm/test/CodeGen/AMDGPU/move-to-valu-atomicrmw-system.ll (+9-10)
  • (modified) llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands-non-ptr-intrinsics.ll (+25-25)
  • (modified) llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.ll (+25-25)
  • (modified) llvm/test/CodeGen/AMDGPU/mul.ll (+46-46)
  • (modified) llvm/test/CodeGen/AMDGPU/preserve-wwm-copy-dst-reg.ll (+104-130)
  • (modified) llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll (+209-209)
  • (modified) llvm/test/CodeGen/AMDGPU/rsq.f32.ll (+70-70)
  • (modified) llvm/test/CodeGen/AMDGPU/scc-clobbered-sgpr-to-vmem-spill.ll (+198-196)
  • (modified) llvm/test/CodeGen/AMDGPU/sdiv.ll (+168-168)
  • (modified) llvm/test/CodeGen/AMDGPU/sdiv64.ll (+131-131)
  • (modified) llvm/test/CodeGen/AMDGPU/select.f16.ll (+15-15)
  • (modified) llvm/test/CodeGen/AMDGPU/shl.ll (+11-11)
  • (modified) llvm/test/CodeGen/AMDGPU/si-unify-exit-return-unreachable.ll (+15-14)
  • (modified) llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll (+997-1012)
  • (modified) llvm/test/CodeGen/AMDGPU/sra.ll (+22-22)
  • (modified) llvm/test/CodeGen/AMDGPU/srem64.ll (+142-142)
  • (modified) llvm/test/CodeGen/AMDGPU/srl.ll (+11-11)
  • (modified) llvm/test/CodeGen/AMDGPU/swdev373493.ll (+9-9)
  • (modified) llvm/test/CodeGen/AMDGPU/swdev380865.ll (+4-2)
  • (modified) llvm/test/CodeGen/AMDGPU/udiv.ll (+48-48)
  • (modified) llvm/test/CodeGen/AMDGPU/udiv64.ll (+111-111)
  • (modified) llvm/test/CodeGen/AMDGPU/urem64.ll (+38-38)
  • (modified) llvm/test/CodeGen/AMDGPU/vni8-across-blocks.ll (+1215-1217)
  • (modified) llvm/test/CodeGen/AMDGPU/wave32.ll (+7-7)
  • (modified) llvm/test/CodeGen/AMDGPU/while-break.ll (+10-10)
  • (modified) llvm/test/CodeGen/AMDGPU/wqm.ll (+31-31)
  • (modified) llvm/test/CodeGen/ARM/ParallelDSP/complex_dot_prod.ll (+92-91)
  • (modified) llvm/test/CodeGen/ARM/ParallelDSP/multi-use-loads.ll (+105-104)
  • (modified) llvm/test/CodeGen/ARM/addsubo-legalization.ll (+8-8)
  • (modified) llvm/test/CodeGen/ARM/aes-erratum-fix.ll (+452-476)
  • (modified) llvm/test/CodeGen/ARM/atomicrmw-uinc-udec-wrap.ll (+32-34)
  • (modified) llvm/test/CodeGen/ARM/bf16-shuffle.ll (+31-31)
  • (modified) llvm/test/CodeGen/ARM/big-endian-neon-fp16-bitconv.ll (+7-6)
  • (modified) llvm/test/CodeGen/ARM/cttz.ll (+54-54)
  • (modified) llvm/test/CodeGen/ARM/fadd-select-fneg-combine.ll (+6-6)
  • (modified) llvm/test/CodeGen/ARM/fpclamptosat.ll (+116-122)
  • (modified) llvm/test/CodeGen/ARM/fpclamptosat_vec.ll (+324-324)
  • (modified) llvm/test/CodeGen/ARM/fptoi-sat-store.ll (+29-26)
  • (modified) llvm/test/CodeGen/ARM/fptosi-sat-scalar.ll (+174-177)
  • (modified) llvm/test/CodeGen/ARM/fptoui-sat-scalar.ll (+65-65)
  • (modified) llvm/test/CodeGen/ARM/funnel-shift.ll (+74-74)
  • (modified) llvm/test/CodeGen/ARM/machine-cse-cmp.ll (+6-6)
  • (modified) llvm/test/CodeGen/ARM/minnum-maxnum-intrinsics.ll (+26-26)
  • (modified) llvm/test/CodeGen/ARM/neon-copy.ll (+13-13)
  • (modified) llvm/test/CodeGen/ARM/select_const.ll (+9-9)
  • (modified) llvm/test/CodeGen/ARM/srem-seteq-illegal-types.ll (+40-40)
  • (modified) llvm/test/CodeGen/ARM/swifterror.ll (+8-8)
  • (modified) llvm/test/CodeGen/ARM/umulo-128-legalisation-lowering.ll (+115-121)
  • (modified) llvm/test/CodeGen/ARM/usub_sat.ll (+7-6)
  • (modified) llvm/test/CodeGen/ARM/usub_sat_plus.ll (+7-6)
  • (modified) llvm/test/CodeGen/ARM/vecreduce-fmax-legalization-soft-float.ll (+9-9)
  • (modified) llvm/test/CodeGen/ARM/vecreduce-fmin-legalization-soft-float.ll (+9-9)
  • (modified) llvm/test/CodeGen/AVR/hardware-mul.ll (+9-9)
  • (modified) llvm/test/CodeGen/Hexagon/atomicrmw-uinc-udec-wrap.ll (+6-6)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/fp-to-int.ll (+104-106)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/int-to-fp.ll (+535-533)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/isel-truncate.ll (+4-4)
  • (modified) llvm/test/CodeGen/Hexagon/autohvx/vmpy-parts.ll (+17-17)
  • (modified) llvm/test/CodeGen/Hexagon/signext-inreg.ll (+59-59)
  • (modified) llvm/test/CodeGen/MSP430/selectcc.ll (+12-9)
  • (modified) llvm/test/CodeGen/Mips/llvm-ir/ashr.ll (+9-9)
  • (modified) llvm/test/CodeGen/Mips/llvm-ir/lshr.ll (+9-9)
  • (modified) llvm/test/CodeGen/PowerPC/all-atomics.ll (+104-103)
  • (modified) llvm/test/CodeGen/PowerPC/csr-split.ll (+6-6)
  • (modified) llvm/test/CodeGen/PowerPC/inc-of-add.ll (+14-16)
  • (modified) llvm/test/CodeGen/PowerPC/ldst-16-byte.mir (+5-5)
  • (modified) llvm/test/CodeGen/PowerPC/licm-tocReg.ll (+17-17)
  • (modified) llvm/test/CodeGen/PowerPC/loop-instr-form-prepare.ll (+33-33)
  • (modified) llvm/test/CodeGen/PowerPC/loop-instr-prep-non-const-increasement.ll (+9-9)
  • (modified) llvm/test/CodeGen/PowerPC/more-dq-form-prepare.ll (+158-159)
  • (modified) llvm/test/CodeGen/PowerPC/no-ctr-loop-if-exit-in-nested-loop.ll (+8-8)
  • (modified) llvm/test/CodeGen/PowerPC/p10-handle-split-promote-vec.ll (+88-88)
  • (modified) llvm/test/CodeGen/PowerPC/p10-spill-creq.ll (+17-17)
  • (modified) llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll (+85-85)
  • (modified) llvm/test/CodeGen/PowerPC/sat-add.ll (+12-12)
  • (modified) llvm/test/CodeGen/PowerPC/sms-phi-3.ll (+12-12)
  • (modified) llvm/test/CodeGen/PowerPC/srem-vector-lkk.ll (+23-23)
  • (modified) llvm/test/CodeGen/PowerPC/sub-of-not.ll (+14-16)
  • (modified) llvm/test/CodeGen/PowerPC/umulo-128-legalisation-lowering.ll (+37-37)
  • (modified) llvm/test/CodeGen/PowerPC/urem-vector-lkk.ll (+10-10)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp32_to_i16_elts.ll (+188-188)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp32_to_i8_elts.ll (+102-102)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp64_to_i16_elts.ll (+118-118)
  • (modified) llvm/test/CodeGen/PowerPC/vec_conv_fp64_to_i8_elts.ll (+122-122)
  • (modified) llvm/test/CodeGen/RISCV/add-before-shl.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/bfloat-convert.ll (+30-30)
  • (modified) llvm/test/CodeGen/RISCV/bfloat-select-fcmp.ll (+42-42)
  • (modified) llvm/test/CodeGen/RISCV/branch-relaxation.ll (+104-108)
  • (modified) llvm/test/CodeGen/RISCV/callee-saved-gprs.ll (+96-96)
  • (modified) llvm/test/CodeGen/RISCV/double-convert.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/double-round-conv-sat.ll (+105-105)
  • (modified) llvm/test/CodeGen/RISCV/early-clobber-tied-def-subreg-liveness.ll (+7-7)
  • (modified) llvm/test/CodeGen/RISCV/float-convert.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/float-round-conv-sat.ll (+35-35)
  • (modified) llvm/test/CodeGen/RISCV/fmax-fmin.ll (+36-36)
  • (modified) llvm/test/CodeGen/RISCV/fpclamptosat.ll (+200-200)
  • (modified) llvm/test/CodeGen/RISCV/half-convert.ll (+60-60)
  • (modified) llvm/test/CodeGen/RISCV/half-round-conv-sat.ll (+70-70)
  • (modified) llvm/test/CodeGen/RISCV/half-select-fcmp.ll (+42-42)
  • (modified) llvm/test/CodeGen/RISCV/min-max.ll (+10-6)
  • (modified) llvm/test/CodeGen/RISCV/mul.ll (+28-28)
  • (modified) llvm/test/CodeGen/RISCV/overflow-intrinsics.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/push-pop-popret.ll (+144-144)
  • (modified) llvm/test/CodeGen/RISCV/regalloc-last-chance-recoloring-failure.ll (+34-41)
  • (modified) llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll (+21-32)
  • (modified) llvm/test/CodeGen/RISCV/rvv/ctpop-vp.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitreverse-vp.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bswap-vp.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ceil-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ctlz-vp.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ctpop-vp.ll (+82-31)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-cttz-vp.ll (+90-90)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-floor-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-interleave.ll (+28-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-explodevector.ll (+83-83)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-interleave.ll (+28-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll (+298-350)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll (+858-858)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-scatter.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-store-fp.ll (+8-48)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-store-int.ll (+19-72)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-nearbyint-vp.ll (+13-29)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-int.ll (+28-24)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-rint-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-round-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-roundeven-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-roundtozero-vp.ll (+26-37)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-trunc-vp.ll (+64-64)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll (+31-31)
  • (modified) llvm/test/CodeGen/RISCV/rvv/floor-vp.ll (+21-32)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fpclamptosat_vec.ll (+244-244)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fshr-fshl-vp.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/nearbyint-vp.ll (+16-51)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rint-vp.ll (+79-59)
  • (modified) llvm/test/CodeGen/RISCV/rvv/round-vp.ll (+79-59)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundeven-vp.ll (+79-59)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundtozero-vp.ll (+79-59)
  • (modified) llvm/test/CodeGen/RISCV/rvv/shuffle-reverse.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/sink-splat-operands.ll (+310-310)
  • (modified) llvm/test/CodeGen/RISCV/rvv/splat-vector-split-i64-vl-sdnode.ll (+17-17)
  • (modified) llvm/test/CodeGen/RISCV/rvv/strided-vpload.ll (+76-76)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-load.ll (+31-15)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave.ll (+134-80)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave-store.ll (+13-17)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll (+72-48)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll (+23-13)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmadd-sdnode.ll (+31-40)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmuladd-vp.ll (+23-13)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfptrunc-vp.ll (+29-31)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfwnmacc-vp.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfwnmsac-vp.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vreductions-fp-vp.ll (+44-44)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vreductions-int-vp.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll (+9-9)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vtrunc-vp.ll (+24-12)
  • (modified) llvm/test/CodeGen/RISCV/shifts.ll (+47-47)
  • (modified) llvm/test/CodeGen/RISCV/srem-vector-lkk.ll (+90-90)
  • (modified) llvm/test/CodeGen/RISCV/stack-store-check.ll (+238-237)
  • (modified) llvm/test/CodeGen/RISCV/umulo-128-legalisation-lowering.ll (+24-24)
  • (modified) llvm/test/CodeGen/RISCV/urem-vector-lkk.ll (+90-90)
  • (modified) llvm/test/CodeGen/RISCV/wide-scalar-shift-by-byte-multiple-legalization.ll (+328-324)
  • (modified) llvm/test/CodeGen/RISCV/wide-scalar-shift-legalization.ll (+595-595)
  • (modified) llvm/test/CodeGen/SPARC/smulo-128-legalisation-lowering.ll (+35-35)
  • (modified) llvm/test/CodeGen/SPARC/umulo-128-legalisation-lowering.ll (+54-54)
  • (modified) llvm/test/CodeGen/SystemZ/inline-asm-i128.ll (+13-10)
  • (modified) llvm/test/CodeGen/SystemZ/int-uadd-01.ll (+14-14)
  • (modified) llvm/test/CodeGen/SystemZ/int-uadd-02.ll (+14-14)
  • (modified) llvm/test/CodeGen/SystemZ/int-usub-01.ll (+12-12)
diff --git a/llvm/lib/CodeGen/SlotIndexes.cpp b/llvm/lib/CodeGen/SlotIndexes.cpp
index 65726f06dedb473..f0cb187a2574256 100644
--- a/llvm/lib/CodeGen/SlotIndexes.cpp
+++ b/llvm/lib/CodeGen/SlotIndexes.cpp
@@ -238,8 +238,27 @@ void SlotIndexes::repairIndexesInRange(MachineBasicBlock *MBB,
 }
 
 void SlotIndexes::packIndexes() {
-  for (auto [Index, Entry] : enumerate(indexList))
-    Entry.setIndex(Index * SlotIndex::InstrDist);
+  unsigned Index = 0;
+  // Iterate over basic blocks in slot index order.
+  for (MachineBasicBlock *MBB : make_second_range(idx2MBBMap)) {
+    // Update entries for each instruction in the block and the dummy entry for
+    // the end of the block.
+    auto [MBBStartIdx, MBBEndIdx] = MBBRanges[MBB->getNumber()];
+    for (auto I = MBBStartIdx.listEntry()->getIterator(),
+              E = MBBEndIdx.listEntry()->getIterator();
+         I++ != E;) {
+      if (I == E || I->getInstr()) {
+        Index += SlotIndex::InstrDist;
+        I->setIndex(Index);
+      } else {
+        // LiveIntervals may still refer to entries for instructions that have
+        // been erased. We have to update these entries but we don't want them
+        // to affect the rest of the slot numbering, so set them to halfway
+        // between the neighboring real instruction indexes.
+        I->setIndex(Index + SlotIndex::InstrDist / 2);
+      }
+    }
+  }
 }
 
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-outline_atomics.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-outline_atomics.ll
index fb4bef33d9b4ffd..348528f02d93217 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-outline_atomics.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-outline_atomics.ll
@@ -236,8 +236,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -251,8 +251,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -266,8 +266,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -281,8 +281,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -296,8 +296,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -311,8 +311,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire_const(ptr readonly %ptr)
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -326,8 +326,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst(ptr %ptr) {
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
@@ -341,8 +341,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst_const(ptr readonly %ptr)
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
index 373b040ebec65d1..c5c03cbb0763119 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-rcpc.ll
@@ -236,8 +236,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -251,8 +251,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -266,8 +266,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -281,8 +281,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -296,8 +296,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -311,8 +311,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire_const(ptr readonly %ptr)
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -326,8 +326,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst(ptr %ptr) {
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
@@ -341,8 +341,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst_const(ptr readonly %ptr)
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
index 045e080983d5f89..0368ec909e53639 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-load-v8a.ll
@@ -236,8 +236,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -251,8 +251,8 @@ define dso_local i128 @load_atomic_i128_aligned_unordered_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_unordered_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr unordered, align 16
     ret i128 %r
 }
@@ -266,8 +266,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -281,8 +281,8 @@ define dso_local i128 @load_atomic_i128_aligned_monotonic_const(ptr readonly %pt
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_monotonic_const:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr monotonic, align 16
     ret i128 %r
 }
@@ -296,8 +296,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire(ptr %ptr) {
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -311,8 +311,8 @@ define dso_local i128 @load_atomic_i128_aligned_acquire_const(ptr readonly %ptr)
 ; -O0:    stxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_acquire_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr acquire, align 16
     ret i128 %r
 }
@@ -326,8 +326,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst(ptr %ptr) {
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
@@ -341,8 +341,8 @@ define dso_local i128 @load_atomic_i128_aligned_seq_cst_const(ptr readonly %ptr)
 ; -O0:    stlxp w8, x0, x1, [x9]
 ;
 ; -O1-LABEL: load_atomic_i128_aligned_seq_cst_const:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    stlxp w9, x0, x1, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    stlxp w9, x8, x1, [x0]
     %r = load atomic i128, ptr %ptr seq_cst, align 16
     ret i128 %r
 }
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll
index 0c52a8a683e3a06..55d48f1bd6226b1 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lse2.ll
@@ -156,8 +156,8 @@ define dso_local i32 @atomicrmw_xchg_i32_aligned_monotonic(ptr %ptr, i32 %value)
 ; -O0:    subs w8, w9, w8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i32_aligned_monotonic:
-; -O1:    ldxr w0, [x8]
-; -O1:    stxr w9, w1, [x8]
+; -O1:    ldxr w8, [x0]
+; -O1:    stxr w9, w1, [x0]
     %r = atomicrmw xchg ptr %ptr, i32 %value monotonic, align 4
     ret i32 %r
 }
@@ -170,8 +170,8 @@ define dso_local i32 @atomicrmw_xchg_i32_aligned_acquire(ptr %ptr, i32 %value) {
 ; -O0:    subs w8, w9, w8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i32_aligned_acquire:
-; -O1:    ldaxr w0, [x8]
-; -O1:    stxr w9, w1, [x8]
+; -O1:    ldaxr w8, [x0]
+; -O1:    stxr w9, w1, [x0]
     %r = atomicrmw xchg ptr %ptr, i32 %value acquire, align 4
     ret i32 %r
 }
@@ -184,8 +184,8 @@ define dso_local i32 @atomicrmw_xchg_i32_aligned_release(ptr %ptr, i32 %value) {
 ; -O0:    subs w8, w9, w8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i32_aligned_release:
-; -O1:    ldxr w0, [x8]
-; -O1:    stlxr w9, w1, [x8]
+; -O1:    ldxr w8, [x0]
+; -O1:    stlxr w9, w1, [x0]
     %r = atomicrmw xchg ptr %ptr, i32 %value release, align 4
     ret i32 %r
 }
@@ -198,8 +198,8 @@ define dso_local i32 @atomicrmw_xchg_i32_aligned_acq_rel(ptr %ptr, i32 %value) {
 ; -O0:    subs w8, w9, w8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i32_aligned_acq_rel:
-; -O1:    ldaxr w0, [x8]
-; -O1:    stlxr w9, w1, [x8]
+; -O1:    ldaxr w8, [x0]
+; -O1:    stlxr w9, w1, [x0]
     %r = atomicrmw xchg ptr %ptr, i32 %value acq_rel, align 4
     ret i32 %r
 }
@@ -212,8 +212,8 @@ define dso_local i32 @atomicrmw_xchg_i32_aligned_seq_cst(ptr %ptr, i32 %value) {
 ; -O0:    subs w8, w9, w8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i32_aligned_seq_cst:
-; -O1:    ldaxr w0, [x8]
-; -O1:    stlxr w9, w1, [x8]
+; -O1:    ldaxr w8, [x0]
+; -O1:    stlxr w9, w1, [x0]
     %r = atomicrmw xchg ptr %ptr, i32 %value seq_cst, align 4
     ret i32 %r
 }
@@ -226,8 +226,8 @@ define dso_local i64 @atomicrmw_xchg_i64_aligned_monotonic(ptr %ptr, i64 %value)
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i64_aligned_monotonic:
-; -O1:    ldxr x0, [x8]
-; -O1:    stxr w9, x1, [x8]
+; -O1:    ldxr x8, [x0]
+; -O1:    stxr w9, x1, [x0]
     %r = atomicrmw xchg ptr %ptr, i64 %value monotonic, align 8
     ret i64 %r
 }
@@ -240,8 +240,8 @@ define dso_local i64 @atomicrmw_xchg_i64_aligned_acquire(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i64_aligned_acquire:
-; -O1:    ldaxr x0, [x8]
-; -O1:    stxr w9, x1, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    stxr w9, x1, [x0]
     %r = atomicrmw xchg ptr %ptr, i64 %value acquire, align 8
     ret i64 %r
 }
@@ -254,8 +254,8 @@ define dso_local i64 @atomicrmw_xchg_i64_aligned_release(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i64_aligned_release:
-; -O1:    ldxr x0, [x8]
-; -O1:    stlxr w9, x1, [x8]
+; -O1:    ldxr x8, [x0]
+; -O1:    stlxr w9, x1, [x0]
     %r = atomicrmw xchg ptr %ptr, i64 %value release, align 8
     ret i64 %r
 }
@@ -268,8 +268,8 @@ define dso_local i64 @atomicrmw_xchg_i64_aligned_acq_rel(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i64_aligned_acq_rel:
-; -O1:    ldaxr x0, [x8]
-; -O1:    stlxr w9, x1, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    stlxr w9, x1, [x0]
     %r = atomicrmw xchg ptr %ptr, i64 %value acq_rel, align 8
     ret i64 %r
 }
@@ -282,8 +282,8 @@ define dso_local i64 @atomicrmw_xchg_i64_aligned_seq_cst(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_xchg_i64_aligned_seq_cst:
-; -O1:    ldaxr x0, [x8]
-; -O1:    stlxr w9, x1, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    stlxr w9, x1, [x0]
     %r = atomicrmw xchg ptr %ptr, i64 %value seq_cst, align 8
     ret i64 %r
 }
@@ -852,9 +852,9 @@ define dso_local i64 @atomicrmw_add_i64_aligned_monotonic(ptr %ptr, i64 %value)
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_add_i64_aligned_monotonic:
-; -O1:    ldxr x0, [x8]
-; -O1:    add x9, x0, x1
-; -O1:    stxr w10, x9, [x8]
+; -O1:    ldxr x8, [x0]
+; -O1:    add x9, x8, x1
+; -O1:    stxr w10, x9, [x0]
     %r = atomicrmw add ptr %ptr, i64 %value monotonic, align 8
     ret i64 %r
 }
@@ -868,9 +868,9 @@ define dso_local i64 @atomicrmw_add_i64_aligned_acquire(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_add_i64_aligned_acquire:
-; -O1:    ldaxr x0, [x8]
-; -O1:    add x9, x0, x1
-; -O1:    stxr w10, x9, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    add x9, x8, x1
+; -O1:    stxr w10, x9, [x0]
     %r = atomicrmw add ptr %ptr, i64 %value acquire, align 8
     ret i64 %r
 }
@@ -884,9 +884,9 @@ define dso_local i64 @atomicrmw_add_i64_aligned_release(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_add_i64_aligned_release:
-; -O1:    ldxr x0, [x8]
-; -O1:    add x9, x0, x1
-; -O1:    stlxr w10, x9, [x8]
+; -O1:    ldxr x8, [x0]
+; -O1:    add x9, x8, x1
+; -O1:    stlxr w10, x9, [x0]
     %r = atomicrmw add ptr %ptr, i64 %value release, align 8
     ret i64 %r
 }
@@ -900,9 +900,9 @@ define dso_local i64 @atomicrmw_add_i64_aligned_acq_rel(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_add_i64_aligned_acq_rel:
-; -O1:    ldaxr x0, [x8]
-; -O1:    add x9, x0, x1
-; -O1:    stlxr w10, x9, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    add x9, x8, x1
+; -O1:    stlxr w10, x9, [x0]
     %r = atomicrmw add ptr %ptr, i64 %value acq_rel, align 8
     ret i64 %r
 }
@@ -916,9 +916,9 @@ define dso_local i64 @atomicrmw_add_i64_aligned_seq_cst(ptr %ptr, i64 %value) {
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_add_i64_aligned_seq_cst:
-; -O1:    ldaxr x0, [x8]
-; -O1:    add x9, x0, x1
-; -O1:    stlxr w10, x9, [x8]
+; -O1:    ldaxr x8, [x0]
+; -O1:    add x9, x8, x1
+; -O1:    stlxr w10, x9, [x0]
     %r = atomicrmw add ptr %ptr, i64 %value seq_cst, align 8
     ret i64 %r
 }
@@ -939,9 +939,9 @@ define dso_local i128 @atomicrmw_add_i128_aligned_monotonic(ptr %ptr, i128 %valu
 ; -O0:    subs x8, x8, #0
 ;
 ; -O1-LABEL: atomicrmw_add_i128_aligned_monotonic:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    adds x9, x0, x2
-; -O1:    stxp w11, x9, x10, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    adds x9, x8, x2
+; -O1:    stxp w11, x9, x10, [x0]
     %r = atomicrmw add ptr %ptr, i128 %value monotonic, align 16
     ret i128 %r
 }
@@ -962,9 +962,9 @@ define dso_local i128 @atomicrmw_add_i128_aligned_acquire(ptr %ptr, i128 %value)
 ; -O0:    subs x8, x8, #0
 ;
 ; -O1-LABEL: atomicrmw_add_i128_aligned_acquire:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    adds x9, x0, x2
-; -O1:    stxp w11, x9, x10, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    adds x9, x8, x2
+; -O1:    stxp w11, x9, x10, [x0]
     %r = atomicrmw add ptr %ptr, i128 %value acquire, align 16
     ret i128 %r
 }
@@ -985,9 +985,9 @@ define dso_local i128 @atomicrmw_add_i128_aligned_release(ptr %ptr, i128 %value)
 ; -O0:    subs x8, x8, #0
 ;
 ; -O1-LABEL: atomicrmw_add_i128_aligned_release:
-; -O1:    ldxp x0, x1, [x8]
-; -O1:    adds x9, x0, x2
-; -O1:    stlxp w11, x9, x10, [x8]
+; -O1:    ldxp x8, x1, [x0]
+; -O1:    adds x9, x8, x2
+; -O1:    stlxp w11, x9, x10, [x0]
     %r = atomicrmw add ptr %ptr, i128 %value release, align 16
     ret i128 %r
 }
@@ -1008,9 +1008,9 @@ define dso_local i128 @atomicrmw_add_i128_aligned_acq_rel(ptr %ptr, i128 %value)
 ; -O0:    subs x8, x8, #0
 ;
 ; -O1-LABEL: atomicrmw_add_i128_aligned_acq_rel:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    adds x9, x0, x2
-; -O1:    stlxp w11, x9, x10, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    adds x9, x8, x2
+; -O1:    stlxp w11, x9, x10, [x0]
     %r = atomicrmw add ptr %ptr, i128 %value acq_rel, align 16
     ret i128 %r
 }
@@ -1031,9 +1031,9 @@ define dso_local i128 @atomicrmw_add_i128_aligned_seq_cst(ptr %ptr, i128 %value)
 ; -O0:    subs x8, x8, #0
 ;
 ; -O1-LABEL: atomicrmw_add_i128_aligned_seq_cst:
-; -O1:    ldaxp x0, x1, [x8]
-; -O1:    adds x9, x0, x2
-; -O1:    stlxp w11, x9, x10, [x8]
+; -O1:    ldaxp x8, x1, [x0]
+; -O1:    adds x9, x8, x2
+; -O1:    stlxp w11, x9, x10, [x0]
     %r = atomicrmw add ptr %ptr, i128 %value seq_cst, align 16
     ret i128 %r
 }
@@ -1632,9 +1632,9 @@ define dso_local i64 @atomicrmw_sub_i64_aligned_monotonic(ptr %ptr, i64 %value)
 ; -O0:    subs x8, x9, x8
 ;
 ; -O1-LABEL: atomicrmw_sub_i64_aligned_monotonic:
-; -O1:    ldxr x0, [x8]
-; -O1:    sub x9, x0, x1
-; -O1:    stxr w10, x9, [x8]
+; -O1:    ldxr x8, [x0]
+; -O1:    sub x9, x8, x1
+; -O1:    stxr w10, x9, [x0]
     %r = atomicr...
[truncated]
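
For intuition, here is a minimal, self-contained C++ sketch of the renumbering scheme in the SlotIndexes.cpp hunk above. The Entry struct, hasInstr flag, and InstrDist constant are simplified stand-ins, not LLVM's real API: only entries that still have an instruction advance the counter, while entries for erased instructions are parked halfway into the following gap so they no longer stretch the apparent distance between live ranges.

#include <cstdio>
#include <vector>

// Simplified stand-in for a SlotIndexes list entry (hypothetical, not LLVM's API).
struct Entry {
  bool hasInstr; // false if the instruction was erased but its entry remains
  unsigned index;
};

constexpr unsigned InstrDist = 16; // spacing between consecutive real instructions

// Only live instructions advance the counter; stale entries get the midpoint
// of the gap after the previous real instruction.
void packIndexes(std::vector<Entry> &list) {
  unsigned index = 0;
  for (Entry &e : list) {
    if (e.hasInstr) {
      index += InstrDist;
      e.index = index;
    } else {
      e.index = index + InstrDist / 2;
    }
  }
}

int main() {
  // Two real instructions with two erased entries between them.
  std::vector<Entry> list = {{true, 0}, {false, 0}, {false, 0}, {true, 0}};
  packIndexes(list);
  for (const Entry &e : list)
    std::printf("%u%s ", e.index, e.hasInstr ? "" : "*");
  std::printf("\n"); // prints: 16 24* 24* 32
  // The two real instructions stay exactly InstrDist apart no matter how
  // many erased entries (*) sit between them.
  return 0;
}

Unlike the real patch, this sketch ignores the per-block dummy end entries; it is only meant to show why stale entries are given a midpoint index instead of a slot of their own.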

jayfoad requested review from perlfu, qcolombet and a team September 21, 2023 16:53
@jayfoad (Contributor, Author) commented Sep 21, 2023

TODO: update 33 failing tests with manual checks.

@asl (Collaborator) commented Sep 21, 2023

MSP430 test changes seem to be fine.

jayfoad marked this pull request as draft September 22, 2023 08:54

@qcolombet (Collaborator) left a comment

LGTM

llvm/lib/CodeGen/SlotIndexes.cpp: two review comments (outdated, resolved)
jayfoad added a commit that referenced this pull request Sep 28, 2023
This makes some tests robust against minor codegen differences
that will be caused by PR #67038.
@jayfoad (Contributor, Author) commented Sep 28, 2023

Rebased and fixed almost all lit tests. (Sorry for the force push but quite a few lit tests conflicted with other upstream changes since this PR was opened.)

I might need help with the last two lit test failures:

Failed Tests (2):
  LLVM :: CodeGen/Hexagon/reg-scavengebug-2.ll
  LLVM :: CodeGen/Hexagon/swp-epilog-phi7.ll

legrosbuffle pushed a commit to legrosbuffle/llvm-project that referenced this pull request Sep 29, 2023
This makes some tests robust against minor codegen differences
that will be caused by PR llvm#67038.
blackgeorge-boom added a commit to blackgeorge-boom/llvm-project that referenced this pull request Sep 29, 2023
Applies:
llvm#66334
llvm#67038

Packing the slot indexes before register allocation is useful for us
because it evens out the gaps between slots left by all the optimization
passes that run before `greedy`, which may have removed different numbers
of instructions on AArch64 and X86. Different slot gaps lead, in turn, to
slightly different register allocation in some cases.

We backport the above patches for our LLVM, with the main difference
being the absence of some convenient data structure iterators, which we
had to convert to be compatible with our ADT infrastructure.

We add the `-pack-indexes` flag to activate this.

Addresses: systems-nuts/unifico#291
@jayfoad (Contributor, Author) commented Oct 3, 2023

I might need help with the last two lit test failures:

Failed Tests (2):
  LLVM :: CodeGen/Hexagon/reg-scavengebug-2.ll
  LLVM :: CodeGen/Hexagon/swp-epilog-phi7.ll

Hi @kparzysz-quic, could you help with these please? The code changes quite a bit, so I'm not even sure if they are still testing what they were supposed to test.

@nikic (Contributor) commented Oct 4, 2023

@jayfoad I'd suggest marking those tests as XFAIL, to avoid further bitrot.

@jayfoad (Contributor, Author) commented Oct 4, 2023

@jayfoad I'd suggest marking those tests as XFAIL, to avoid further bitrot.

Done, thanks.
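
(For reference: lit marks a test as an expected failure with an XFAIL annotation next to its RUN lines. A minimal sketch, with a hypothetical RUN line:

; RUN: llc -mtriple=hexagon < %s | FileCheck %s
; XFAIL: *

The * means "expected to fail everywhere", so the bots report XFAIL rather than a failure until the checks can be regenerated.)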

@jayfoad (Contributor, Author) commented Oct 6, 2023

Any more comments? If not I will plan to merge this on Monday, based on @qcolombet's approval.

PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.
jayfoad marked this pull request as ready for review October 9, 2023 09:56
jayfoad merged commit 2501ae5 into llvm:main Oct 9, 2023
4 checks passed
jayfoad deleted the pack-slotindexes branch October 9, 2023 10:44
jayfoad added a commit that referenced this pull request Oct 9, 2023
…tion (#67038)"

This reverts commit 2501ae5.

Reverted due to various buildbot failures.
jayfoad added a commit to jayfoad/llvm-project that referenced this pull request Jan 29, 2024
Add a basic implementation of verifyAnalysis that just checks that the
analysis does not refer to any SlotIndexes for instructions that have
been deleted. This was useful for diagnosing some SlotIndexes-related
problems caused by llvm#67038.
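
As a rough sketch of that kind of check, with simplified stand-ins rather than LLVM's actual verifyAnalysis hook or SlotIndexes internals: walk the analysis's instruction-to-index map and assert that every mapped instruction is still present in the function.

#include <cassert>
#include <unordered_map>
#include <unordered_set>

struct Instr {}; // placeholder for a machine instruction

// Hypothetical mini-analysis state: instruction -> slot index.
using IndexMap = std::unordered_map<const Instr *, unsigned>;

// Fail if the analysis still holds an index for an instruction that has
// been deleted; 'live' is the set of instructions still in the function.
void verifyNoStaleIndexes(const IndexMap &indexes,
                          const std::unordered_set<const Instr *> &live) {
  for (const auto &entry : indexes)
    assert(live.count(entry.first) &&
           "slot index refers to a deleted instruction");
}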