Conversation

@rastogishubham
Contributor

@rastogishubham rastogishubham commented Nov 1, 2025

This patch adds the target hooks required by Instruction Referencing for the AArch64 target, as described in https://llvm.org/docs/InstrRefDebugInfo.html#target-hooks.

These hooks allow the instruction-referencing LiveDebugValues pass to track spill and restore instructions.
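
For context, here is a minimal sketch of how a debug-info pass can consume the two new hooks. This is simplified pseudocode in the spirit of the instruction-referencing LiveDebugValues implementation, not the actual pass logic; the TII and MI names are illustrative:

    // Sketch (assumption): spotting spills and restores via the PostFE hooks.
    int FrameIndex;
    if (Register Reg = TII->isStoreToStackSlotPostFE(MI, FrameIndex)) {
      // Reg was spilled to stack slot FrameIndex, so a variable value
      // tracked in Reg can now also be located in that slot.
    } else if (Register Reg = TII->isLoadFromStackSlotPostFE(MI, FrameIndex)) {
      // Reg was restored from stack slot FrameIndex, so values tracked
      // in the slot become available in Reg again.
    }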

With this patch, we can use llvm/utils/llvm-locstats/llvm-locstats.py to gather location-coverage statistics on a clang.dSYM built in RelWithDebInfo (a reproduction sketch follows the tables below):

coverage with dbg_value:

 =================================================
            Debug Location Statistics       
 =================================================
     cov%           samples         percentage(~)  
 -------------------------------------------------
   0%              5828021               38%
   (0%,10%)         127739                0%
   [10%,20%)        143344                0%
   [20%,30%)        172100                1%
   [30%,40%)        193173                1%
   [40%,50%)        127366                0%
   [50%,60%)        308350                2%
   [60%,70%)        257055                1%
   [70%,80%)        212410                1%
   [80%,90%)        295316                1%
   [90%,100%)       349280                2%
   100%            7313157               47%
 =================================================
 -the number of debug variables processed: 15327311
 -PC ranges covered: 67%
 -------------------------------------------------
 -total availability: 62%
 =================================================

coverage with InstrRef, without the target-hooks fix:

 =================================================
            Debug Location Statistics       
 =================================================
     cov%           samples         percentage(~)  
 -------------------------------------------------
   0%              6052807               39%
   (0%,10%)         127710                0%
   [10%,20%)        129999                0%
   [20%,30%)        155011                1%
   [30%,40%)        171206                1%
   [40%,50%)        102861                0%
   [50%,60%)        264734                1%
   [60%,70%)        212386                1%
   [70%,80%)        176872                1%
   [80%,90%)        242120                1%
   [90%,100%)       254465                1%
   100%            7437215               48%
 =================================================
 -the number of debug variables processed: 15327386
 -PC ranges covered: 67%
 -------------------------------------------------
 -total availability: 60%
 =================================================

coverage with InstrRef, with the target-hooks fix:

 =================================================
            Debug Location Statistics       
 =================================================
     cov%           samples         percentage(~)  
 -------------------------------------------------
   0%              5972267               39%
   (0%,10%)         118873                0%
   [10%,20%)        127138                0%
   [20%,30%)        153181                1%
   [30%,40%)        170102                1%
   [40%,50%)        102180                0%
   [50%,60%)        263672                1%
   [60%,70%)        212865                1%
   [70%,80%)        176633                1%
   [80%,90%)        242403                1%
   [90%,100%)       264441                1%
   100%            7494527               48%
 =================================================
 -the number of debug variables processed: 15298282
 -PC ranges covered: 71%
 -------------------------------------------------
 -total availability: 61%
 =================================================

Comparing the last two tables, the target hooks improve InstrRef's PC-range coverage from 67% to 71% and its total availability from 60% to 61%. I believe this is a good indication that Instruction Referencing should be turned on for AArch64.
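
Reproduction sketch for the statistics above (assumptions: the flag spelling is taken from the InstrRefDebugInfo docs, and the llvm-locstats invocation is illustrative; neither is verified against this PR):

    # Toggle instruction referencing when compiling:
    clang -O2 -g -mllvm -experimental-debug-variable-locations=true ...
    # Summarize variable-location coverage from the resulting DWARF:
    llvm/utils/llvm-locstats/llvm-locstats.py clang.dSYM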

@llvmbot
Member

llvmbot commented Nov 3, 2025

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-llvm-globalisel

Author: Shubham Sandeep Rastogi (rastogishubham)

Patch is 2.76 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/165953.diff

181 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+53-11)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.h (+9)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll (+36-36)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll (+676-676)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/atomic-anyextending-load-crash.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/byval-call.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/call-lowering-tail-call-fallback.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-stack-protector-windows.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/select-fp-anyext-crash.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/split-wide-shifts-multiway.ll (+177-177)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/stacksave-stackrestore.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/swifterror.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/aarch64-fastcc-stackup.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/aarch64-fixup-statepoint-regs-crash.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/aarch64-mops.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/addsub-constant-folding.ll (+36-36)
  • (modified) llvm/test/CodeGen/AArch64/alias_mask_scalable.ll (+72-72)
  • (modified) llvm/test/CodeGen/AArch64/alias_mask_scalable_nosve2.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/argument-blocks-array-of-struct.ll (+7-7)
  • (modified) llvm/test/CodeGen/AArch64/arm64-fp128.ll (+215-215)
  • (modified) llvm/test/CodeGen/AArch64/arm64-memset-inline.ll (+34-34)
  • (modified) llvm/test/CodeGen/AArch64/arm64-neon-mul-div.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/arm64-register-pairing.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/arm64-windows-calls.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/arm64ec-reservedregs.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/arm64ec-varargs.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/atomicrmw-O0.ll (+132-132)
  • (modified) llvm/test/CodeGen/AArch64/atomicrmw-fadd.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/atomicrmw-fsub.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/cmp-select-sign.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/combine-storetomstore.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/exception-handling-windows-elf.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/fadd-combines.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/fcmp-fp128.ll (+44-44)
  • (modified) llvm/test/CodeGen/AArch64/fcmp.ll (+61-61)
  • (modified) llvm/test/CodeGen/AArch64/fexplog.ll (+1500-1500)
  • (modified) llvm/test/CodeGen/AArch64/fold-int-pow2-with-fmul-or-fdiv.ll (+12-12)
  • (modified) llvm/test/CodeGen/AArch64/fp8-sme2-cvtn.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/fpclamptosat_vec.ll (+144-144)
  • (modified) llvm/test/CodeGen/AArch64/fpext.ll (+56-56)
  • (modified) llvm/test/CodeGen/AArch64/fpow.ll (+310-310)
  • (modified) llvm/test/CodeGen/AArch64/fpowi.ll (+260-260)
  • (modified) llvm/test/CodeGen/AArch64/fptoi.ll (+200-200)
  • (modified) llvm/test/CodeGen/AArch64/fptosi-sat-scalar.ll (+29-29)
  • (modified) llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll (+270-270)
  • (modified) llvm/test/CodeGen/AArch64/fptoui-sat-scalar.ll (+27-27)
  • (modified) llvm/test/CodeGen/AArch64/fptoui-sat-vector.ll (+176-176)
  • (modified) llvm/test/CodeGen/AArch64/fptrunc.ll (+39-39)
  • (modified) llvm/test/CodeGen/AArch64/framelayout-sve-calleesaves-fix.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/frem-power2.ll (+22-22)
  • (modified) llvm/test/CodeGen/AArch64/frem.ll (+310-310)
  • (modified) llvm/test/CodeGen/AArch64/fsincos.ll (+600-600)
  • (modified) llvm/test/CodeGen/AArch64/implicit-def-subreg-to-reg-regression.ll (+3-3)
  • (modified) llvm/test/CodeGen/AArch64/insertextract.ll (+30-30)
  • (modified) llvm/test/CodeGen/AArch64/intrinsic-vector-match-sve2.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/itofp.ll (+248-248)
  • (modified) llvm/test/CodeGen/AArch64/ldexp.ll (+10-10)
  • (modified) llvm/test/CodeGen/AArch64/ldst-paired-aliasing.ll (+3-3)
  • (modified) llvm/test/CodeGen/AArch64/llvm.exp10.ll (+134-134)
  • (modified) llvm/test/CodeGen/AArch64/llvm.frexp.ll (+167-167)
  • (modified) llvm/test/CodeGen/AArch64/llvm.modf.ll (+49-49)
  • (modified) llvm/test/CodeGen/AArch64/llvm.sincos.ll (+113-113)
  • (modified) llvm/test/CodeGen/AArch64/llvm.sincospi.ll (+21-21)
  • (modified) llvm/test/CodeGen/AArch64/luti-with-sme2.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/machine-combiner.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/machine-outliner-retaddr-sign-non-leaf.ll (+14-12)
  • (modified) llvm/test/CodeGen/AArch64/mingw-refptr.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/named-vector-shuffle-reverse-neon.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/named-vector-shuffle-reverse-sve.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/neon-dotreduce.ll (+34-34)
  • (modified) llvm/test/CodeGen/AArch64/nontemporal.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/outlining-with-streaming-mode-changes.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/perm-tb-with-sme2.ll (+14-14)
  • (modified) llvm/test/CodeGen/AArch64/pow.ll (+18-18)
  • (modified) llvm/test/CodeGen/AArch64/pr135821.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/pr142314.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/pr164181.ll (+48-48)
  • (modified) llvm/test/CodeGen/AArch64/pr48188.ll (+12-12)
  • (modified) llvm/test/CodeGen/AArch64/pr53315-returned-i128.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/pr58516.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/preserve_nonecc_call.ll (+10-10)
  • (modified) llvm/test/CodeGen/AArch64/preserve_nonecc_varargs_aapcs.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/preserve_nonecc_varargs_win64.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/ragreedy-csr.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/ragreedy-local-interval-cost.ll (+7-7)
  • (modified) llvm/test/CodeGen/AArch64/rem.ll (+152-152)
  • (modified) llvm/test/CodeGen/AArch64/settag-merge.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/settag.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/sibling-call.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/sincos-stack-slots.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/sls-stackprotector-outliner.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/sme-agnostic-za.ll (+12-12)
  • (modified) llvm/test/CodeGen/AArch64/sme-call-streaming-compatible-to-normal-fn-wihout-sme-attr.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sme-callee-save-restore-pairs.ll (+144-144)
  • (modified) llvm/test/CodeGen/AArch64/sme-darwin-sve-vg.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll (+48-48)
  • (modified) llvm/test/CodeGen/AArch64/sme-lazy-save-call.ll (+18-18)
  • (modified) llvm/test/CodeGen/AArch64/sme-lazy-save-windows.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sme-must-save-lr-for-vg.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/sme-new-za-function.ll (+21-21)
  • (modified) llvm/test/CodeGen/AArch64/sme-peephole-opts.ll (+53-53)
  • (modified) llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll (+120-120)
  • (modified) llvm/test/CodeGen/AArch64/sme-streaming-body.ll (+22-22)
  • (modified) llvm/test/CodeGen/AArch64/sme-streaming-checkvl.ll (+38-38)
  • (modified) llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll (+60-60)
  • (modified) llvm/test/CodeGen/AArch64/sme-streaming-interface.ll (+66-66)
  • (modified) llvm/test/CodeGen/AArch64/sme-streaming-mode-changes-unwindinfo.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/sme-streaming-mode-changing-call-disable-stackslot-scavenging.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/sme-vg-to-stack.ll (+71-71)
  • (modified) llvm/test/CodeGen/AArch64/sme-za-control-flow.ll (+10-10)
  • (modified) llvm/test/CodeGen/AArch64/sme-za-exceptions.ll (+18-18)
  • (modified) llvm/test/CodeGen/AArch64/sme-zt0-state.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/sme2-fp8-intrinsics-cvt.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/sme2-intrinsics-int-dots.ll (+36-36)
  • (modified) llvm/test/CodeGen/AArch64/sme2-intrinsics-ld1.ll (+64-64)
  • (modified) llvm/test/CodeGen/AArch64/sme2-intrinsics-ldnt1.ll (+64-64)
  • (modified) llvm/test/CodeGen/AArch64/sme2-intrinsics-qcvt.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sme2-intrinsics-qrshr.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/sme2-intrinsics-vdot.ll (+12-12)
  • (modified) llvm/test/CodeGen/AArch64/split-sve-stack-frame-layout.ll (+81-79)
  • (modified) llvm/test/CodeGen/AArch64/stack-hazard-defaults.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/stack-hazard-windows.ll (+20-20)
  • (modified) llvm/test/CodeGen/AArch64/stack-hazard.ll (+683-683)
  • (modified) llvm/test/CodeGen/AArch64/stack-probing-dynamic.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/stack-probing-sve.ll (+14-14)
  • (modified) llvm/test/CodeGen/AArch64/stack-probing.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/statepoint-call-lowering.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/streaming-compatible-memory-ops.ll (+10-10)
  • (modified) llvm/test/CodeGen/AArch64/sve-alloca.ll (+24-24)
  • (modified) llvm/test/CodeGen/AArch64/sve-callee-save-restore-pairs.ll (+120-120)
  • (modified) llvm/test/CodeGen/AArch64/sve-calling-convention-mixed.ll (+48-48)
  • (modified) llvm/test/CodeGen/AArch64/sve-extract-scalable-vector.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-ld2-alloca.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-length-fp128.ll (+20-20)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-length-frame-offests-crash.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-vector-llrint.ll (+70-70)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-vector-lrint.ll (+126-126)
  • (modified) llvm/test/CodeGen/AArch64/sve-fptosi-sat.ll (+12-12)
  • (modified) llvm/test/CodeGen/AArch64/sve-fptoui-sat.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/sve-insert-vector.ll (+32-32)
  • (modified) llvm/test/CodeGen/AArch64/sve-llrint.ll (+114-114)
  • (modified) llvm/test/CodeGen/AArch64/sve-lrint.ll (+114-114)
  • (modified) llvm/test/CodeGen/AArch64/sve-pred-arith.ll (+20-20)
  • (modified) llvm/test/CodeGen/AArch64/sve-stack-frame-layout.ll (+34-34)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-fma.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-to-int.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll (+12-12)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-mulh.ll (+56-56)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-to-fp.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-vselect.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-ld2-alloca.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-masked-load.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-trunc.ll (+258-258)
  • (modified) llvm/test/CodeGen/AArch64/sve-tailcall.ll (+48-48)
  • (modified) llvm/test/CodeGen/AArch64/sve-trunc.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/sve2p1-intrinsics-loads.ll (+144-144)
  • (modified) llvm/test/CodeGen/AArch64/sve2p1-intrinsics-predicate-as-counter.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/sve2p1-intrinsics-selx2.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/sve2p1-intrinsics-selx4.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/sve2p1-intrinsics-stores.ll (+64-64)
  • (modified) llvm/test/CodeGen/AArch64/swift-async-win.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/swifterror.ll (+232-232)
  • (modified) llvm/test/CodeGen/AArch64/trampoline.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/unwind-preserved.ll (+80-80)
  • (modified) llvm/test/CodeGen/AArch64/vec-libcalls.ll (+156-156)
  • (modified) llvm/test/CodeGen/AArch64/veclib-llvm.modf.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization-strict.ll (+12-12)
  • (modified) llvm/test/CodeGen/AArch64/vecreduce-fmax-legalization.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/vecreduce-fmin-legalization.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/vector-llrint.ll (+84-84)
  • (modified) llvm/test/CodeGen/AArch64/vector-lrint.ll (+172-172)
  • (modified) llvm/test/CodeGen/AArch64/win-sve.ll (+210-210)
  • (modified) llvm/test/CodeGen/AArch64/win64-fpowi.ll (+16-16)
  • (modified) llvm/test/CodeGen/AArch64/win64_vararg.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/win64_vararg2.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/win64_vararg_float.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/win64_vararg_float_cc.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/win64cc-backup-x18.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/wincfi-missing-seh-directives.ll (+3-3)
  • (added) llvm/test/DebugInfo/AArch64/instr-ref-target-hooks.ll (+61)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index ccc8eb8a9706d..17c2f8d07ea1c 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -2392,11 +2392,10 @@ bool AArch64InstrInfo::isFPRCopy(const MachineInstr &MI) {
   return false;
 }
 
-Register AArch64InstrInfo::isLoadFromStackSlot(const MachineInstr &MI,
-                                               int &FrameIndex) const {
-  switch (MI.getOpcode()) {
+static bool isFrameLoadOpcode(int Opcode) {
+  switch (Opcode) {
   default:
-    break;
+    return false;
   case AArch64::LDRWui:
   case AArch64::LDRXui:
   case AArch64::LDRBui:
@@ -2405,22 +2404,26 @@ Register AArch64InstrInfo::isLoadFromStackSlot(const MachineInstr &MI,
   case AArch64::LDRDui:
   case AArch64::LDRQui:
   case AArch64::LDR_PXI:
+    return true;
+  }
+}
+
+Register AArch64InstrInfo::isLoadFromStackSlot(const MachineInstr &MI,
+                                               int &FrameIndex) const {
+  if (isFrameLoadOpcode(MI.getOpcode())) {
     if (MI.getOperand(0).getSubReg() == 0 && MI.getOperand(1).isFI() &&
         MI.getOperand(2).isImm() && MI.getOperand(2).getImm() == 0) {
       FrameIndex = MI.getOperand(1).getIndex();
       return MI.getOperand(0).getReg();
     }
-    break;
   }
-
   return 0;
 }
 
-Register AArch64InstrInfo::isStoreToStackSlot(const MachineInstr &MI,
-                                              int &FrameIndex) const {
-  switch (MI.getOpcode()) {
+static bool isFrameStoreOpcode(int Opcode) {
+  switch (Opcode) {
   default:
-    break;
+    return false;
   case AArch64::STRWui:
   case AArch64::STRXui:
   case AArch64::STRBui:
@@ -2429,16 +2432,55 @@ Register AArch64InstrInfo::isStoreToStackSlot(const MachineInstr &MI,
   case AArch64::STRDui:
   case AArch64::STRQui:
   case AArch64::STR_PXI:
+    return true;
+  }
+}
+
+Register AArch64InstrInfo::isStoreToStackSlot(const MachineInstr &MI,
+                                              int &FrameIndex) const {
+  if (isFrameStoreOpcode(MI.getOpcode())) {
     if (MI.getOperand(0).getSubReg() == 0 && MI.getOperand(1).isFI() &&
         MI.getOperand(2).isImm() && MI.getOperand(2).getImm() == 0) {
       FrameIndex = MI.getOperand(1).getIndex();
       return MI.getOperand(0).getReg();
     }
-    break;
   }
   return 0;
 }
 
+Register AArch64InstrInfo::isStoreToStackSlotPostFE(const MachineInstr &MI,
+                                                    int &FrameIndex) const {
+  if (isFrameStoreOpcode(MI.getOpcode())) {
+    SmallVector<const MachineMemOperand *, 1> Accesses;
+    if (Register Reg = isStoreToStackSlot(MI, FrameIndex))
+      return Reg;
+
+    if (hasStoreToStackSlot(MI, Accesses)) {
+      FrameIndex =
+          cast<FixedStackPseudoSourceValue>(Accesses.front()->getPseudoValue())
+              ->getFrameIndex();
+      return MI.getOperand(0).getReg();
+    }
+  }
+  return Register();
+}
+
+Register AArch64InstrInfo::isLoadFromStackSlotPostFE(const MachineInstr &MI,
+                                                     int &FrameIndex) const {
+  if (isFrameLoadOpcode(MI.getOpcode())) {
+    if (Register Reg = isLoadFromStackSlot(MI, FrameIndex))
+      return Reg;
+    SmallVector<const MachineMemOperand *, 1> Accesses;
+    if (hasLoadFromStackSlot(MI, Accesses)) {
+      FrameIndex =
+          cast<FixedStackPseudoSourceValue>(Accesses.front()->getPseudoValue())
+              ->getFrameIndex();
+      return MI.getOperand(0).getReg();
+    }
+  }
+  return Register();
+}
+
 /// Check all MachineMemOperands for a hint to suppress pairing.
 bool AArch64InstrInfo::isLdStPairSuppressed(const MachineInstr &MI) {
   return llvm::any_of(MI.memoperands(), [](MachineMemOperand *MMO) {
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
index 179574a73aa01..44863eb2f6d95 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
@@ -205,6 +205,15 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo {
   Register isStoreToStackSlot(const MachineInstr &MI,
                               int &FrameIndex) const override;
 
+  /// isStoreToStackSlotPostFE - Check for post-frame ptr elimination
+  /// stack locations as well.  This uses a heuristic so it isn't
+  /// reliable for correctness.
+  Register isStoreToStackSlotPostFE(const MachineInstr &MI,
+                                    int &FrameIndex) const override;
+
+  Register isLoadFromStackSlotPostFE(const MachineInstr &MI,
+                                     int &FrameIndex) const override;
+
   /// Does this instruction set its full destination register to zero?
   static bool isGPRZero(const MachineInstr &MI);
 
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
index 1fe63c9be8c62..be51210882eaa 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
@@ -89,23 +89,23 @@ define void @val_compare_and_swap(ptr %p, i128 %oldval, i128 %newval) {
 ; CHECK-OUTLINE-LLSC-O0-LABEL: val_compare_and_swap:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x4
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, x5
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_acq
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x8, x0
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[0], x8
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[1], x1
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str q0, [x0]
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
@@ -113,9 +113,9 @@ define void @val_compare_and_swap(ptr %p, i128 %oldval, i128 %newval) {
 ; CHECK-CAS-O0:       // %bb.0:
 ; CHECK-CAS-O0-NEXT:    sub sp, sp, #16
 ; CHECK-CAS-O0-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Folded Spill
+; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
 ; CHECK-CAS-O0-NEXT:    mov x1, x5
-; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Folded Reload
+; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Reload
 ; CHECK-CAS-O0-NEXT:    // kill: def $x2 killed $x2 def $x2_x3
 ; CHECK-CAS-O0-NEXT:    mov x3, x5
 ; CHECK-CAS-O0-NEXT:    // kill: def $x4 killed $x4 def $x4_x5
@@ -216,23 +216,23 @@ define void @val_compare_and_swap_monotonic_seqcst(ptr %p, i128 %oldval, i128 %n
 ; CHECK-OUTLINE-LLSC-O0-LABEL: val_compare_and_swap_monotonic_seqcst:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x4
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, x5
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_acq_rel
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x8, x0
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[0], x8
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[1], x1
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str q0, [x0]
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
@@ -240,9 +240,9 @@ define void @val_compare_and_swap_monotonic_seqcst(ptr %p, i128 %oldval, i128 %n
 ; CHECK-CAS-O0:       // %bb.0:
 ; CHECK-CAS-O0-NEXT:    sub sp, sp, #16
 ; CHECK-CAS-O0-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Folded Spill
+; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
 ; CHECK-CAS-O0-NEXT:    mov x1, x5
-; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Folded Reload
+; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Reload
 ; CHECK-CAS-O0-NEXT:    // kill: def $x2 killed $x2 def $x2_x3
 ; CHECK-CAS-O0-NEXT:    mov x3, x5
 ; CHECK-CAS-O0-NEXT:    // kill: def $x4 killed $x4 def $x4_x5
@@ -343,23 +343,23 @@ define void @val_compare_and_swap_release_acquire(ptr %p, i128 %oldval, i128 %ne
 ; CHECK-OUTLINE-LLSC-O0-LABEL: val_compare_and_swap_release_acquire:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x4
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, x5
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_acq_rel
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x8, x0
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[0], x8
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[1], x1
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str q0, [x0]
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
@@ -367,9 +367,9 @@ define void @val_compare_and_swap_release_acquire(ptr %p, i128 %oldval, i128 %ne
 ; CHECK-CAS-O0:       // %bb.0:
 ; CHECK-CAS-O0-NEXT:    sub sp, sp, #16
 ; CHECK-CAS-O0-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Folded Spill
+; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
 ; CHECK-CAS-O0-NEXT:    mov x1, x5
-; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Folded Reload
+; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Reload
 ; CHECK-CAS-O0-NEXT:    // kill: def $x2 killed $x2 def $x2_x3
 ; CHECK-CAS-O0-NEXT:    mov x3, x5
 ; CHECK-CAS-O0-NEXT:    // kill: def $x4 killed $x4 def $x4_x5
@@ -470,23 +470,23 @@ define void @val_compare_and_swap_monotonic(ptr %p, i128 %oldval, i128 %newval)
 ; CHECK-OUTLINE-LLSC-O0-LABEL: val_compare_and_swap_monotonic:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x4
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, x5
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_acq_rel
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x8, x0
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[0], x8
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[1], x1
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str q0, [x0]
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
@@ -494,9 +494,9 @@ define void @val_compare_and_swap_monotonic(ptr %p, i128 %oldval, i128 %newval)
 ; CHECK-CAS-O0:       // %bb.0:
 ; CHECK-CAS-O0-NEXT:    sub sp, sp, #16
 ; CHECK-CAS-O0-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Folded Spill
+; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
 ; CHECK-CAS-O0-NEXT:    mov x1, x5
-; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Folded Reload
+; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Reload
 ; CHECK-CAS-O0-NEXT:    // kill: def $x2 killed $x2 def $x2_x3
 ; CHECK-CAS-O0-NEXT:    mov x3, x5
 ; CHECK-CAS-O0-NEXT:    // kill: def $x4 killed $x4 def $x4_x5
@@ -580,22 +580,22 @@ define void @atomic_load_relaxed(i64, i64, ptr %p, ptr %p2) {
 ; CHECK-OUTLINE-LLSC-O0-LABEL: atomic_load_relaxed:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x4, x2
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x3, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, xzr
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_relax
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x3, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x3, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[0], x0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[1], x1
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str q0, [x3]
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
@@ -690,17 +690,17 @@ define i128 @val_compare_and_swap_return(ptr %p, i128 %oldval, i128 %newval) {
 ; CHECK-OUTLINE-LLSC-O0-LABEL: val_compare_and_swap_return:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x4
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, x5
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_acq
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll
index e6bf3ab674717..3f51ec747182a 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll
@@ -56,10 +56,10 @@ define i32 @val_compare_and_swap(ptr %p, i32 %cmp, i32 %new) #0 {
 ; CHECK-OUTLINE-O0:       ; %bb.0:
 ; CHECK-OUTLINE-O0-NEXT:    sub sp, sp, #32
 ; CHECK-OUTLINE-O0-NEXT:    stp x29, x30, [sp, #16] ; 16-byte Folded Spill
-; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Folded Spill
+; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Spill
 ; CHECK-OUTLINE-O0-NEXT:    mov w0, w1
 ; CHECK-OUTLINE-O0-NEXT:    mov w1, w2
-; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Folded Reload
+; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Reload
 ; CHECK-OUTLINE-O0-NEXT:    bl ___aarch64_cas4_acq
 ; CHECK-OUTLINE-O0-NEXT:    ldp x29, x30, [sp, #16] ; 16-byte Folded Reload
 ; CHECK-OUTLINE-O0-NEXT:    add sp, sp, #32
@@ -133,10 +133,10 @@ define i32 @val_compare_and_swap_from_load(ptr %p, i32 %cmp, ptr %pnew) #0 {
 ; CHECK-OUTLINE-O0:       ; %bb.0:
 ; CHECK-OUTLINE-O0-NEXT:    sub sp, sp, #32
 ; CHECK-OUTLINE-O0-NEXT:    stp x29, x30, [sp, #16] ; 16-byte Folded Spill
-; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Folded Spill
+; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Spill
 ; CHECK-OUTLINE-O0-NEXT:    mov w0, w1
 ; CHECK-OUTLINE-O0-NEXT:    mov x8, x2
-; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Folded Reload
+; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Reload
 ; CHECK-OUTLINE-O0-NEXT:    ldr w1, [x8]
 ; CHECK-OUTLINE-O0-NEXT:    bl ___aarch64_cas4_acq
 ; CHECK-OUTLINE-O0-NEXT:    ldp x29, x30, [sp, #16] ; 16-byte Folded Reload
@@ -211,10 +211,10 @@ define i32 @val_compare_and_swap_rel(ptr %p, i32 %cmp, i32 %new) #0 {
 ; CHECK-OUTLINE-O0:       ; %bb.0:
 ; CHECK-OUTLINE-O0-NEXT:    sub sp, sp, #32
 ; CHECK-OUTLINE-O0-NEXT:    stp x29, x30, [sp, #16] ; 16-byte Folded Spill
-; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Folded Spill
+; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Spill
 ; CHECK-OUTLINE-O0-NEXT:    mov w0, w1
 ; CHECK-OUTLINE-O0-NEXT:    mov w1, w2
-; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Folded Reload
+; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Reload
 ; CHECK-OUTLINE-O0-NEXT:    bl ___aarch64_cas4_acq_rel
 ; CHECK-OUTLINE-O0-NEXT:    ldp x29, x30, [sp, #16] ; 16-byte Folded Reload
 ; CHECK-OUTLINE-O0-NEXT:    add sp, sp, #32
@@ -285,10 +285,10 @@ define i64 @val_compare_and_swap_64(ptr %p, i64 %cmp, i64 %new) #0 {
 ; CHECK-OUTLINE-O0:       ; %bb.0:
 ; CHECK-OUTLINE-O0-NEXT:    sub sp, sp, #32
 ; CHECK-OUTLINE-O0-NEXT:    stp x29, x30, [sp, #16] ; 16-byte Folded Spill
-; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Folded Spill
+; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Spill
 ; CHECK-OUTLINE-O0-NEXT:    mov x0, x1
 ; CHECK-OUTLINE-O0-NEXT:    mov x1, x2
-; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Folded Reload
+; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Reload
 ; CHECK-OUTLINE-O0-NEXT:    bl ___aarch64_cas8_relax
 ; CHECK-OUTLINE-O0-NEXT:    ldp x29, x30, [sp, #16] ; 16-...
[truncated]

@llvmbot
Member

llvmbot commented Nov 3, 2025

@llvm/pr-subscribers-backend-aarch64

+  if (isFrameLoadOpcode(MI.getOpcode())) {
     if (MI.getOperand(0).getSubReg() == 0 && MI.getOperand(1).isFI() &&
         MI.getOperand(2).isImm() && MI.getOperand(2).getImm() == 0) {
       FrameIndex = MI.getOperand(1).getIndex();
       return MI.getOperand(0).getReg();
     }
-    break;
   }
-
   return 0;
 }
 
-Register AArch64InstrInfo::isStoreToStackSlot(const MachineInstr &MI,
-                                              int &FrameIndex) const {
-  switch (MI.getOpcode()) {
+static bool isFrameStoreOpcode(int Opcode) {
+  switch (Opcode) {
   default:
-    break;
+    return false;
   case AArch64::STRWui:
   case AArch64::STRXui:
   case AArch64::STRBui:
@@ -2429,16 +2432,55 @@ Register AArch64InstrInfo::isStoreToStackSlot(const MachineInstr &MI,
   case AArch64::STRDui:
   case AArch64::STRQui:
   case AArch64::STR_PXI:
+    return true;
+  }
+}
+
+Register AArch64InstrInfo::isStoreToStackSlot(const MachineInstr &MI,
+                                              int &FrameIndex) const {
+  if (isFrameStoreOpcode(MI.getOpcode())) {
     if (MI.getOperand(0).getSubReg() == 0 && MI.getOperand(1).isFI() &&
         MI.getOperand(2).isImm() && MI.getOperand(2).getImm() == 0) {
       FrameIndex = MI.getOperand(1).getIndex();
       return MI.getOperand(0).getReg();
     }
-    break;
   }
   return 0;
 }
 
+Register AArch64InstrInfo::isStoreToStackSlotPostFE(const MachineInstr &MI,
+                                                    int &FrameIndex) const {
+  if (isFrameStoreOpcode(MI.getOpcode())) {
+    SmallVector<const MachineMemOperand *, 1> Accesses;
+    if (Register Reg = isStoreToStackSlot(MI, FrameIndex))
+      return Reg;
+
+    if (hasStoreToStackSlot(MI, Accesses)) {
+      FrameIndex =
+          cast<FixedStackPseudoSourceValue>(Accesses.front()->getPseudoValue())
+              ->getFrameIndex();
+      return MI.getOperand(0).getReg();
+    }
+  }
+  return Register();
+}
+
+Register AArch64InstrInfo::isLoadFromStackSlotPostFE(const MachineInstr &MI,
+                                                     int &FrameIndex) const {
+  if (isFrameLoadOpcode(MI.getOpcode())) {
+    if (Register Reg = isLoadFromStackSlot(MI, FrameIndex))
+      return Reg;
+    SmallVector<const MachineMemOperand *, 1> Accesses;
+    if (hasLoadFromStackSlot(MI, Accesses)) {
+      FrameIndex =
+          cast<FixedStackPseudoSourceValue>(Accesses.front()->getPseudoValue())
+              ->getFrameIndex();
+      return MI.getOperand(0).getReg();
+    }
+  }
+  return Register();
+}
+
 /// Check all MachineMemOperands for a hint to suppress pairing.
 bool AArch64InstrInfo::isLdStPairSuppressed(const MachineInstr &MI) {
   return llvm::any_of(MI.memoperands(), [](MachineMemOperand *MMO) {
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
index 179574a73aa01..44863eb2f6d95 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
@@ -205,6 +205,15 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo {
   Register isStoreToStackSlot(const MachineInstr &MI,
                               int &FrameIndex) const override;
 
+  /// isStoreToStackSlotPostFE - Check for post-frame ptr elimination
+  /// stack locations as well.  This uses a heuristic so it isn't
+  /// reliable for correctness.
+  Register isStoreToStackSlotPostFE(const MachineInstr &MI,
+                                    int &FrameIndex) const override;
+
+  Register isLoadFromStackSlotPostFE(const MachineInstr &MI,
+                                     int &FrameIndex) const override;
+
   /// Does this instruction set its full destination register to zero?
   static bool isGPRZero(const MachineInstr &MI);
 
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
index 1fe63c9be8c62..be51210882eaa 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic-128.ll
@@ -89,23 +89,23 @@ define void @val_compare_and_swap(ptr %p, i128 %oldval, i128 %newval) {
 ; CHECK-OUTLINE-LLSC-O0-LABEL: val_compare_and_swap:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x4
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, x5
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_acq
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x8, x0
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[0], x8
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[1], x1
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str q0, [x0]
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
@@ -113,9 +113,9 @@ define void @val_compare_and_swap(ptr %p, i128 %oldval, i128 %newval) {
 ; CHECK-CAS-O0:       // %bb.0:
 ; CHECK-CAS-O0-NEXT:    sub sp, sp, #16
 ; CHECK-CAS-O0-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Folded Spill
+; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
 ; CHECK-CAS-O0-NEXT:    mov x1, x5
-; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Folded Reload
+; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Reload
 ; CHECK-CAS-O0-NEXT:    // kill: def $x2 killed $x2 def $x2_x3
 ; CHECK-CAS-O0-NEXT:    mov x3, x5
 ; CHECK-CAS-O0-NEXT:    // kill: def $x4 killed $x4 def $x4_x5
@@ -216,23 +216,23 @@ define void @val_compare_and_swap_monotonic_seqcst(ptr %p, i128 %oldval, i128 %n
 ; CHECK-OUTLINE-LLSC-O0-LABEL: val_compare_and_swap_monotonic_seqcst:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x4
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, x5
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_acq_rel
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x8, x0
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[0], x8
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[1], x1
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str q0, [x0]
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
@@ -240,9 +240,9 @@ define void @val_compare_and_swap_monotonic_seqcst(ptr %p, i128 %oldval, i128 %n
 ; CHECK-CAS-O0:       // %bb.0:
 ; CHECK-CAS-O0-NEXT:    sub sp, sp, #16
 ; CHECK-CAS-O0-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Folded Spill
+; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
 ; CHECK-CAS-O0-NEXT:    mov x1, x5
-; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Folded Reload
+; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Reload
 ; CHECK-CAS-O0-NEXT:    // kill: def $x2 killed $x2 def $x2_x3
 ; CHECK-CAS-O0-NEXT:    mov x3, x5
 ; CHECK-CAS-O0-NEXT:    // kill: def $x4 killed $x4 def $x4_x5
@@ -343,23 +343,23 @@ define void @val_compare_and_swap_release_acquire(ptr %p, i128 %oldval, i128 %ne
 ; CHECK-OUTLINE-LLSC-O0-LABEL: val_compare_and_swap_release_acquire:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x4
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, x5
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_acq_rel
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x8, x0
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[0], x8
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[1], x1
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str q0, [x0]
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
@@ -367,9 +367,9 @@ define void @val_compare_and_swap_release_acquire(ptr %p, i128 %oldval, i128 %ne
 ; CHECK-CAS-O0:       // %bb.0:
 ; CHECK-CAS-O0-NEXT:    sub sp, sp, #16
 ; CHECK-CAS-O0-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Folded Spill
+; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
 ; CHECK-CAS-O0-NEXT:    mov x1, x5
-; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Folded Reload
+; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Reload
 ; CHECK-CAS-O0-NEXT:    // kill: def $x2 killed $x2 def $x2_x3
 ; CHECK-CAS-O0-NEXT:    mov x3, x5
 ; CHECK-CAS-O0-NEXT:    // kill: def $x4 killed $x4 def $x4_x5
@@ -470,23 +470,23 @@ define void @val_compare_and_swap_monotonic(ptr %p, i128 %oldval, i128 %newval)
 ; CHECK-OUTLINE-LLSC-O0-LABEL: val_compare_and_swap_monotonic:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x4
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, x5
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_acq_rel
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x8, x0
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x0, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[0], x8
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[1], x1
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str q0, [x0]
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
@@ -494,9 +494,9 @@ define void @val_compare_and_swap_monotonic(ptr %p, i128 %oldval, i128 %newval)
 ; CHECK-CAS-O0:       // %bb.0:
 ; CHECK-CAS-O0-NEXT:    sub sp, sp, #16
 ; CHECK-CAS-O0-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Folded Spill
+; CHECK-CAS-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
 ; CHECK-CAS-O0-NEXT:    mov x1, x5
-; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Folded Reload
+; CHECK-CAS-O0-NEXT:    ldr x5, [sp, #8] // 8-byte Reload
 ; CHECK-CAS-O0-NEXT:    // kill: def $x2 killed $x2 def $x2_x3
 ; CHECK-CAS-O0-NEXT:    mov x3, x5
 ; CHECK-CAS-O0-NEXT:    // kill: def $x4 killed $x4 def $x4_x5
@@ -580,22 +580,22 @@ define void @atomic_load_relaxed(i64, i64, ptr %p, ptr %p2) {
 ; CHECK-OUTLINE-LLSC-O0-LABEL: atomic_load_relaxed:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x4, x2
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x3, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x3, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, xzr
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_relax
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x3, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x3, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    // implicit-def: $q0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[0], x0
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov v0.d[1], x1
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    str q0, [x3]
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
@@ -690,17 +690,17 @@ define i128 @val_compare_and_swap_return(ptr %p, i128 %oldval, i128 %newval) {
 ; CHECK-OUTLINE-LLSC-O0-LABEL: val_compare_and_swap_return:
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_def_cfa_offset 32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    .cfi_offset w30, -16
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x0, [sp, #8] // 8-byte Spill
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x0, x2
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x1, x3
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x2, x4
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x4, [sp, #8] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    mov x3, x5
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    bl __aarch64_cas16_acq
-; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Folded Reload
+; CHECK-OUTLINE-LLSC-O0-NEXT:    ldr x30, [sp, #16] // 8-byte Reload
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    add sp, sp, #32
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    ret
 ;
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll
index e6bf3ab674717..3f51ec747182a 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll
@@ -56,10 +56,10 @@ define i32 @val_compare_and_swap(ptr %p, i32 %cmp, i32 %new) #0 {
 ; CHECK-OUTLINE-O0:       ; %bb.0:
 ; CHECK-OUTLINE-O0-NEXT:    sub sp, sp, #32
 ; CHECK-OUTLINE-O0-NEXT:    stp x29, x30, [sp, #16] ; 16-byte Folded Spill
-; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Folded Spill
+; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Spill
 ; CHECK-OUTLINE-O0-NEXT:    mov w0, w1
 ; CHECK-OUTLINE-O0-NEXT:    mov w1, w2
-; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Folded Reload
+; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Reload
 ; CHECK-OUTLINE-O0-NEXT:    bl ___aarch64_cas4_acq
 ; CHECK-OUTLINE-O0-NEXT:    ldp x29, x30, [sp, #16] ; 16-byte Folded Reload
 ; CHECK-OUTLINE-O0-NEXT:    add sp, sp, #32
@@ -133,10 +133,10 @@ define i32 @val_compare_and_swap_from_load(ptr %p, i32 %cmp, ptr %pnew) #0 {
 ; CHECK-OUTLINE-O0:       ; %bb.0:
 ; CHECK-OUTLINE-O0-NEXT:    sub sp, sp, #32
 ; CHECK-OUTLINE-O0-NEXT:    stp x29, x30, [sp, #16] ; 16-byte Folded Spill
-; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Folded Spill
+; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Spill
 ; CHECK-OUTLINE-O0-NEXT:    mov w0, w1
 ; CHECK-OUTLINE-O0-NEXT:    mov x8, x2
-; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Folded Reload
+; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Reload
 ; CHECK-OUTLINE-O0-NEXT:    ldr w1, [x8]
 ; CHECK-OUTLINE-O0-NEXT:    bl ___aarch64_cas4_acq
 ; CHECK-OUTLINE-O0-NEXT:    ldp x29, x30, [sp, #16] ; 16-byte Folded Reload
@@ -211,10 +211,10 @@ define i32 @val_compare_and_swap_rel(ptr %p, i32 %cmp, i32 %new) #0 {
 ; CHECK-OUTLINE-O0:       ; %bb.0:
 ; CHECK-OUTLINE-O0-NEXT:    sub sp, sp, #32
 ; CHECK-OUTLINE-O0-NEXT:    stp x29, x30, [sp, #16] ; 16-byte Folded Spill
-; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Folded Spill
+; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Spill
 ; CHECK-OUTLINE-O0-NEXT:    mov w0, w1
 ; CHECK-OUTLINE-O0-NEXT:    mov w1, w2
-; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Folded Reload
+; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Reload
 ; CHECK-OUTLINE-O0-NEXT:    bl ___aarch64_cas4_acq_rel
 ; CHECK-OUTLINE-O0-NEXT:    ldp x29, x30, [sp, #16] ; 16-byte Folded Reload
 ; CHECK-OUTLINE-O0-NEXT:    add sp, sp, #32
@@ -285,10 +285,10 @@ define i64 @val_compare_and_swap_64(ptr %p, i64 %cmp, i64 %new) #0 {
 ; CHECK-OUTLINE-O0:       ; %bb.0:
 ; CHECK-OUTLINE-O0-NEXT:    sub sp, sp, #32
 ; CHECK-OUTLINE-O0-NEXT:    stp x29, x30, [sp, #16] ; 16-byte Folded Spill
-; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Folded Spill
+; CHECK-OUTLINE-O0-NEXT:    str x0, [sp, #8] ; 8-byte Spill
 ; CHECK-OUTLINE-O0-NEXT:    mov x0, x1
 ; CHECK-OUTLINE-O0-NEXT:    mov x1, x2
-; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Folded Reload
+; CHECK-OUTLINE-O0-NEXT:    ldr x2, [sp, #8] ; 8-byte Reload
 ; CHECK-OUTLINE-O0-NEXT:    bl ___aarch64_cas8_relax
 ; CHECK-OUTLINE-O0-NEXT:    ldp x29, x30, [sp, #16] ; 16-...
[truncated]

@rastogishubham rastogishubham force-pushed the AddTargetHooks branch 2 times, most recently from acec82a to f1103fd on November 3, 2025 at 19:18
 ; CHECK-OUTLINE-LLSC-O0:       // %bb.0:
 ; CHECK-OUTLINE-LLSC-O0-NEXT:    sub sp, sp, #32
-; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Folded Spill
+; CHECK-OUTLINE-LLSC-O0-NEXT:    str x30, [sp, #16] // 8-byte Spill
Collaborator

It might be easier to land an NFC PR that just regenerates all the comments ahead of time; the follow-up patch then becomes much smaller.

Contributor Author

Thanks for the review! I am confused about this comment, however: how do I land a patch with all the comments changed without landing the code changes first? The code changes are the reason the comments are needed, right? I can put up a separate PR for the comments, but I still need the code changes to land first; otherwise, the tests will fail until the code changes are landed.

This is why I split the PR into two commits, to make reviewing easier.

Contributor

As far as I understand it, the massive comment updates are just a result of adding these functions, such that the assembly-printing passes have more information available to them? In that case there's no way to split the test changes out from the implementation. It would be possible to add the hooks in separate commits so that each individual commit stays small, but I think landing it all as one commit is better in this case: the main threat of changing so many files is that this commit has to be reverted after changes are placed on top of it, or that it similarly interferes with another revert; if that does happen, splitting this into multiple commits would make the problem worse. Better to get it all done in one push imo, ymmv etc.

Contributor Author

Yes, the comment updates are because some "Folded Spills" have become "Spills" and some "Folded Reloads" have become "Reloads".

We would not be able to land the test changes unless we upstream the code changes first, but I think I want to keep it as one commit so that everything goes together in the case of a revert. I will squash the patch before I submit the PR. Does everything look good to you otherwise, @SLTozer?

Contributor

Looks fine to me; ideally I'd want to be sure that @adrian-prantl is happy with the above explanation as to why this should all land in one patch - though since it was just a suggestion, maybe a final confirmation isn't necessary.

Contributor Author

Yep, I would also want a final comment from @adrian-prantl.

Thanks!

Collaborator

Works for me; the end result is desirable, and if the updates come from adding the functions, it makes sense to land it all in one!
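
As a rough sketch of the mechanism behind the comment change discussed in this thread (a paraphrase of how the assembly printer picks spill comments, not code from this patch): the printer tries the precise PostFE hook first and only falls back to the memory-operand heuristic, and it is the fallback path that prints "Folded". Before this patch, AArch64 did not implement the hook, so every spill took the fallback path.

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

// Approximate paraphrase of the spill-comment selection; the real logic
// lives in the AsmPrinter and differs in detail.
static void emitSpillComment(const MachineInstr &MI,
                             const TargetInstrInfo &TII, unsigned Size,
                             raw_ostream &OS) {
  int FI = 0;
  SmallVector<const MachineMemOperand *, 1> Accesses;
  if (TII.isStoreToStackSlotPostFE(MI, FI))
    OS << Size << "-byte Spill";        // precise hook: the new AArch64 path
  else if (TII.hasStoreToStackSlot(MI, Accesses))
    OS << Size << "-byte Folded Spill"; // heuristic fallback: the old path
}
```
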

+    if (Register Reg = isLoadFromStackSlot(MI, FrameIndex))
+      return Reg;
+    SmallVector<const MachineMemOperand *, 1> Accesses;
+    if (hasLoadFromStackSlot(MI, Accesses)) {
Collaborator

Is hasLoadFromStackSlot guaranteed to place an element in Accesses when it returns true?

Contributor Author

Yes, hasLoadFromStackSlot only returns true if Accesses.size() changes, and Accesses is only ever push_back()ed to; therefore, if hasLoadFromStackSlot returns true, it has definitely placed an element in Accesses.
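
For reference, here is a minimal model of that invariant, paraphrasing the generic TargetInstrInfo behavior (not code from this patch): the helper appends every fixed-stack load memory operand it finds and reports success only if the vector grew.

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineMemOperand.h"
#include "llvm/CodeGen/PseudoSourceValue.h"
#include "llvm/Support/Casting.h"
using namespace llvm;

// Paraphrase of the generic hasLoadFromStackSlot contract: Accesses only
// ever grows, and true is returned only when something was appended, so a
// true result guarantees Accesses.front() is valid.
static bool hasLoadFromStackSlotModel(
    const MachineInstr &MI,
    SmallVectorImpl<const MachineMemOperand *> &Accesses) {
  size_t StartSize = Accesses.size();
  for (const MachineMemOperand *MMO : MI.memoperands())
    if (MMO->isLoad() &&
        isa_and_nonnull<FixedStackPseudoSourceValue>(MMO->getPseudoValue()))
      Accesses.push_back(MMO);
  return Accesses.size() != StartSize;
}
```
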

Collaborator @adrian-prantl left a comment

Conceptually this LGTM. I would recommend landing the NFC test update that regenerates the comments first, though.

@@ -792,316 +792,5 @@ define void @zpr_and_ppr_local_stack_probing(<vscale x 16 x i1> %pred, <vscale x
   store volatile i64 %gpr, ptr %gpr_local
   ret void
 }
 
-; Only PPR callee-saves + a VLA
Collaborator

Was this removed accidentally?

Contributor Author

Yes, thank you very much for looking over all the tests; it was removed accidentally, and I have fixed that. I also looked over all the tests again, just to make sure they were okay.

Collaborator @davemgreen left a comment

Thanks for checking. LGTM.

Member @jmorse left a comment

LGTM, with some nits, of which I think firming up the location of the DBG_VALUE_LIST in the check lines is the only critical part.

@rastogishubham rastogishubham force-pushed the AddTargetHooks branch 8 times, most recently from 42c25d4 to 8e8f903 on November 14, 2025 at 04:44
@rastogishubham rastogishubham enabled auto-merge (squash) November 14, 2025 05:30
@rastogishubham rastogishubham force-pushed the AddTargetHooks branch 4 times, most recently from f069bc6 to a0a8b6c on November 14, 2025 at 17:02
According to llvm/docs/InstrRefDebugInfo.md, to support proper
instruction referencing on any platform, the target-specific
`TargetInstrInfo::isLoadFromStackSlotPostFE` and
`TargetInstrInfo::isStoreToStackSlotPostFE` functions need to be
implemented so that the Instruction Reference-based LiveDebugValues
pass can identify spill and restore instructions.

It also fixes up all tests that were broken by adding the target hooks.

The target hooks cause a lot of the spill and reload comments to go
from "X-byte Folded Spill" to "X-byte Spill" and from "Y-byte Folded
Reload" to "Y-byte Reload".

Most tests were updated using llvm/utils/update_llc_test_checks.py;
some had to be changed manually.

This is a separate commit for reviewability's sake, and it will be squashed.

I have also added two tests:

1. llvm/test/DebugInfo/AArch64/instr-ref-target-hooks.ll
2. llvm/test/DebugInfo/AArch64/instr-ref-target-hooks-sp-clobber.ll

This patch attempts to reland
llvm#162327
@rastogishubham rastogishubham merged commit 44b94a4 into llvm:main Nov 14, 2025
10 checks passed
@rastogishubham rastogishubham deleted the AddTargetHooks branch November 14, 2025 18:36