Conversation

@lukel97 lukel97 commented Nov 12, 2025

This fixes an assert when compiling llvm-test-suite with -march=rva23u64 -O3 that started appearing sometime this week.

We get "Cannot overlap two segments with differing ValID's" because we try to coalescse these two vsetvlis:

    %x:gprnox0 = COPY $x8
    dead $x0 = PseudoVSETIVLI 1, 208, implicit-def $vl, implicit-def $vtype
    %y:gprnox0 = COPY %x
    %v:vr = COPY $v8, implicit $vtype
    %x = PseudoVSETVLI %x, 208, implicit-def $vl, implicit-def $vtype

    -->

    %x:gprnox0 = COPY $x8
    %x = PseudoVSETVLI %x, 208, implicit-def $vl, implicit-def $vtype
    %y:gprnox0 = COPY %x
    %v:vr = COPY $v8, implicit $vtype

However, doing so would extend the segment of the new value of %x up past the first segment, so the two segments would overlap.
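
With slot indexes, %x ends up with two segments carrying different value numbers (the same breakdown lukel97 walks through in the review discussion below):

    [16r,80r:1)  from %x:gprnox0 = COPY $x8
    [80r,80d:0)  from %x = PseudoVSETVLI %x, ...

Moving the PseudoVSETVLI definition up to the first vsetvli would grow [80r,80d:0) backwards into [16r,80r:1), which is exactly the overlap the assertion rejects.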

This fixes it by checking that it's safe to extend the segment, simply by making sure the interval isn't live at the first vsetvli.
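
Concretely, this is the check added to RISCVInsertVSETVLI::canMutatePriorConfig (excerpted from the diff below; the comment's direction was later corrected to "from MI to PrevMI" in 488edca, see the review thread):

    // If we define VL and need to move the definition up, check we can extend
    // the live interval upwards from MI to PrevMI.
    Register VL = MI.getOperand(0).getReg();
    if (VL.isVirtual() && LIS &&
        LIS->getInterval(VL).overlaps(LIS->getInstructionIndex(PrevMI),
                                      LIS->getInstructionIndex(MI)))
      return false;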

This unfortunately causes a regression in the existing coalesce_vl_avl_same_reg test: even though we could coalesce the vsetvlis there, we now bail. I couldn't think of an easy way to handle this safely, but I don't think it's an important case to handle: testing this patch on SPEC CPU 2017 showed no codegen changes.


llvmbot commented Nov 12, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)


Full diff: https://github.com/llvm/llvm-project/pull/167712.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp (+8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll (+43)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir (+27)
diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index bf9de0a4b5604..06268b947ab31 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -1755,6 +1755,14 @@ bool RISCVInsertVSETVLI::canMutatePriorConfig(
       if (!VNI || !PrevVNI || VNI != PrevVNI)
         return false;
     }
+
+    // If we define VL and need to move the definition up, check we can extend
+    // the live interval upwards from PrevMI to MI.
+    Register VL = MI.getOperand(0).getReg();
+    if (VL.isVirtual() && LIS &&
+        LIS->getInterval(VL).overlaps(LIS->getInstructionIndex(PrevMI),
+                                      LIS->getInstructionIndex(MI)))
+      return false;
   }
 
   assert(PrevMI.getOperand(2).isImm() && MI.getOperand(2).isImm());
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll
index b6e29cf76cd48..e8d89d4066e43 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll
@@ -879,3 +879,46 @@ entry:
   %3 = tail call { <vscale x 8 x i8>, i64 } @llvm.riscv.vleff(<vscale x 8 x i8> poison, ptr %p, i64 %2)
   ret void
 }
+
+; This will create a live interval in such a way we can't coalesce two vsetvlis,
+; see the corresponding .mir test for more details. Make sure we check for this
+; and don't crash.
+define void @coalesce_vl_clobber(ptr %p) {
+; CHECK-LABEL: coalesce_vl_clobber:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    li a2, 0
+; CHECK-NEXT:    li a1, 0
+; CHECK-NEXT:    vsetivli zero, 0, e8, mf2, ta, ma
+; CHECK-NEXT:    vmclr.m v8
+; CHECK-NEXT:    vmv.v.i v9, 0
+; CHECK-NEXT:    vmv1r.v v0, v8
+; CHECK-NEXT:    vmerge.vim v9, v9, 1, v0
+; CHECK-NEXT:  .LBB43_1: # %vector.body
+; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
+; CHECK-NEXT:    vmv1r.v v0, v8
+; CHECK-NEXT:    slli a3, a1, 32
+; CHECK-NEXT:    vsetvli a1, a2, e8, mf8, ta, ma
+; CHECK-NEXT:    vsetivli zero, 0, e8, mf2, ta, mu
+; CHECK-NEXT:    vmv.v.i v10, 0
+; CHECK-NEXT:    srli a3, a3, 32
+; CHECK-NEXT:    vmerge.vim v10, v10, 1, v0
+; CHECK-NEXT:    vslideup.vx v10, v9, a3, v0.t
+; CHECK-NEXT:    vsetvli zero, zero, e8, mf2, ta, ma
+; CHECK-NEXT:    vmsne.vi v0, v10, 0, v0.t
+; CHECK-NEXT:    vsetvli zero, a1, e32, m2, ta, ma
+; CHECK-NEXT:    vmv.v.i v10, 0
+; CHECK-NEXT:    vse32.v v10, (a0), v0.t
+; CHECK-NEXT:    li a2, 1
+; CHECK-NEXT:    j .LBB43_1
+entry:
+  br label %vector.body
+
+vector.body:
+  %avl = phi i64 [ 0, %entry ], [ 1, %vector.body ]
+  %prev.evl = phi i32 [ 0, %entry ], [ %0, %vector.body ]
+  %0 = tail call i32 @llvm.experimental.get.vector.length(i64 %avl, i32 1, i1 true)
+  %1 = tail call <vscale x 4 x i1> @llvm.experimental.vp.splice(<vscale x 4 x i1> zeroinitializer, <vscale x 4 x i1> zeroinitializer, i32 0, <vscale x 4 x i1> zeroinitializer, i32 %prev.evl, i32 0)
+  tail call void @llvm.vp.store(<vscale x 4 x float> zeroinitializer, ptr %p, <vscale x 4 x i1> %1, i32 %0)
+  br label %vector.body
+}
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir
index f9929c9caf712..396ca517e4017 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.mir
@@ -100,6 +100,10 @@
     ret void
   }
 
+  define void @coalesce_vl_clobber() {
+    ret void
+  }
+
   define void @vsetvli_vleff() {
     ret void
   }
@@ -624,10 +628,33 @@ body: |
     ; CHECK: liveins: $x8, $v8
     ; CHECK-NEXT: {{  $}}
     ; CHECK-NEXT: %x:gprnox0 = COPY $x8
+    ; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 1, 208 /* e32, m1, ta, ma */, implicit-def $vl, implicit-def $vtype
+    ; CHECK-NEXT: dead %v:vr = COPY $v8, implicit $vtype
     ; CHECK-NEXT: dead %x:gprnox0 = PseudoVSETVLI %x, 208 /* e32, m1, ta, ma */, implicit-def $vl, implicit-def $vtype
+    %x:gprnox0 = COPY $x8
+    dead $x0 = PseudoVSETIVLI 1, 208, implicit-def $vl, implicit-def $vtype
+    %v:vr = COPY $v8, implicit $vtype
+    %x = PseudoVSETVLI %x, 208, implicit-def $vl, implicit-def $vtype
+...
+---
+# Because of the %y:gprnox0 = COPY %x, we can't extend the live range of %x from
+# the second vsetvli to the first vsetvli when coalescing.
+name: coalesce_vl_clobber
+tracksRegLiveness: true
+body: |
+  bb.0:
+    liveins: $x8, $v8
+    ; CHECK-LABEL: name: coalesce_vl_clobber
+    ; CHECK: liveins: $x8, $v8
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: %x:gprnox0 = COPY $x8
+    ; CHECK-NEXT: dead $x0 = PseudoVSETIVLI 1, 208 /* e32, m1, ta, ma */, implicit-def $vl, implicit-def $vtype
+    ; CHECK-NEXT: dead %y:gprnox0 = COPY %x
     ; CHECK-NEXT: dead %v:vr = COPY $v8, implicit $vtype
+    ; CHECK-NEXT: dead %x:gprnox0 = PseudoVSETVLI %x, 208 /* e32, m1, ta, ma */, implicit-def $vl, implicit-def $vtype
     %x:gprnox0 = COPY $x8
     dead $x0 = PseudoVSETIVLI 1, 208, implicit-def $vl, implicit-def $vtype
+    %y:gprnox0 = COPY %x
     %v:vr = COPY $v8, implicit $vtype
     %x = PseudoVSETVLI %x, 208, implicit-def $vl, implicit-def $vtype
 ...


@topperc topperc left a comment


LGTM

@@ -1755,6 +1755,14 @@ bool RISCVInsertVSETVLI::canMutatePriorConfig(
          if (!VNI || !PrevVNI || VNI != PrevVNI)
            return false;
        }

    // If we define VL and need to move the definition up, check we can extend
    // the live interval upwards from PrevMI to MI.
Member


from MI to PrevMI?

Contributor Author


Oh whoops, thanks. Fixed in 488edca

    ; CHECK-NEXT: dead %v:vr = COPY $v8, implicit $vtype
    ; CHECK-NEXT: dead %x:gprnox0 = PseudoVSETVLI %x, 208 /* e32, m1, ta, ma */, implicit-def $vl, implicit-def $vtype
    %x:gprnox0 = COPY $x8
Member


I think it bails out in this case because we're not in SSA form at this point, so the interval of %x covers both the range from %x:gprnox0 = COPY $x8 to %x = PseudoVSETVLI %x and the range from %x = PseudoVSETVLI %x onward.
In other words, is the segment from %x:gprnox0 = COPY $x8 to %x = PseudoVSETVLI %x a false positive?


@lukel97 lukel97 Nov 13, 2025


I don't think the segment between %x = COPY $x8 and %x = PseudoVSETVLI %x is a false positive, because the vsetvli uses the same valno of %x.

It's more that we have two segments for %x, [16r,80r:1)[80r,80d:0):

    16B  %x:gprnox0 = COPY $x8
    32B  dead $x0 = PseudoVSETIVLI 1, 208, implicit-def $vl, implicit-def $vtype
    48B  %y:gprnox0 = COPY %x:gprnox0
    64B  %v:vr = COPY $v8, implicit $vtype
    80B  %x:gprnox0 = PseudoVSETVLI %x:gprnox0, 208, implicit-def $vl, implicit-def $vtype

Because the two segments have different value numbers from the defs at 16r and 80r, when we go to coalesce the vsetvlis we can't move the 80r definition of %x upwards, since it overlaps with [16r,80r:1). That overlap is what the assertion was complaining about.

The LIS->getInterval(VL).overlaps check added in this PR detects this case and causes it to bail now.

We could try to be really smart here and figure out that, because we're going to move the use of the 16r def up to 32r, the segment could be shrunk to 48r). That would fix the regression in coalesce_vl_avl_same_reg, but it seems like a lot of effort for something that doesn't seem to show up in practice.
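
For reference, a minimal sketch of what that smarter approach might look like, assuming the AVL operand of the moved vsetvli has already been rewritten; LiveIntervals::shrinkToUses is an existing API, but this sequence is untested here and only illustrates the idea:

    // Hypothetical sketch: after rewriting the moved vsetvli's AVL operand,
    // recompute the old value's segment from its remaining uses so it no
    // longer reaches down to (and overlaps) the moved definition.
    LiveInterval &LI = LIS->getInterval(VL);
    LIS->shrinkToUses(&LI);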

@lukel97 lukel97 enabled auto-merge (squash) November 13, 2025 08:38
@lukel97 lukel97 merged commit f2ed002 into llvm:main Nov 13, 2025
9 of 10 checks passed
    ; CHECK-NEXT: vmerge.vim v9, v9, 1, v0
    ; CHECK-NEXT: .LBB43_1: # %vector.body
    ; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
    ; CHECK-NEXT: vsetivli zero, 1, e8, m1, ta, ma

@topperc topperc Nov 13, 2025


This vsetivli seems unnecessary. It only exists because of the vmv1r.v right below it, but vtype is valid on all paths that can reach here, which is all the vmv1r.v should care about.

    %avl = phi i64 [ 0, %entry ], [ 1, %vector.body ]
    %prev.evl = phi i32 [ 0, %entry ], [ %0, %vector.body ]
    %0 = tail call i32 @llvm.experimental.get.vector.length(i64 %avl, i32 1, i1 true)
    %1 = tail call <vscale x 4 x i1> @llvm.experimental.vp.splice(<vscale x 4 x i1> zeroinitializer, <vscale x 4 x i1> zeroinitializer, i32 0, <vscale x 4 x i1> zeroinitializer, i32 %prev.evl, i32 0)
Collaborator


This test case looks fragile. Adding constant folding to experimental.vp.splice in SelectionDAG would break this.

lukel97 added a commit to lukel97/llvm-project that referenced this pull request Nov 14, 2025
…lvm#167712)
