-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[MemCpyOpt][profcheck] Set unknown branch weights for certain selects
#167597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MemCpyOpt][profcheck] Set unknown branch weights for certain selects
#167597
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
|
@llvm/pr-subscribers-llvm-transforms Author: Mircea Trofin (mtrofin) ChangesFull diff: https://github.com/llvm/llvm-project/pull/167597.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp b/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
index 08be5df9872b7..5d11286e2cd7f 100644
--- a/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
+++ b/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
@@ -47,6 +47,7 @@
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"
@@ -1366,6 +1367,8 @@ bool MemCpyOptPass::processMemSetMemCpyDependence(MemCpyInst *MemCpy,
Value *SizeDiff = Builder.CreateSub(DestSize, SrcSize);
Value *MemsetLen = Builder.CreateSelect(
Ule, ConstantInt::getNullValue(DestSize->getType()), SizeDiff);
+ if (auto *SI = dyn_cast<SelectInst>(MemsetLen))
+ setExplicitlyUnknownBranchWeightsIfProfiled(*SI, DEBUG_TYPE);
Instruction *NewMemSet =
Builder.CreateMemSet(Builder.CreatePtrAdd(Dest, SrcSize),
MemSet->getOperand(1), MemsetLen, Alignment);
diff --git a/llvm/test/Transforms/MemCpyOpt/memset-memcpy-dbgloc.ll b/llvm/test/Transforms/MemCpyOpt/memset-memcpy-dbgloc.ll
index 3f577f09ada1e..edd64e5efe1aa 100644
--- a/llvm/test/Transforms/MemCpyOpt/memset-memcpy-dbgloc.ll
+++ b/llvm/test/Transforms/MemCpyOpt/memset-memcpy-dbgloc.ll
@@ -7,14 +7,14 @@ target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
declare void @llvm.memset.p0.i64(ptr nocapture, i8, i64, i1)
declare void @llvm.memcpy.p0.p0.i64(ptr nocapture, ptr nocapture readonly, i64, i1)
-define void @test_constant(i64 %src_size, ptr %dst, i64 %dst_size, i8 %c) !dbg !5 {
+define void @test_constant(i64 %src_size, ptr %dst, i64 %dst_size, i8 %c) !dbg !5 !prof !14 {
; CHECK-LABEL: define void @test_constant(
-; CHECK-SAME: i64 [[SRC_SIZE:%.*]], ptr [[DST:%.*]], i64 [[DST_SIZE:%.*]], i8 [[C:%.*]]) !dbg [[DBG5:![0-9]+]] {
+; CHECK-SAME: i64 [[SRC_SIZE:%.*]], ptr [[DST:%.*]], i64 [[DST_SIZE:%.*]], i8 [[C:%.*]]) !dbg [[DBG5:![0-9]+]] !prof {{.*}} {
; CHECK-NEXT: [[NON_ZERO:%.*]] = icmp ne i64 [[SRC_SIZE]], 0
; CHECK-NEXT: call void @llvm.assume(i1 [[NON_ZERO]])
; CHECK-NEXT: [[TMP1:%.*]] = icmp ule i64 [[DST_SIZE]], [[SRC_SIZE]], !dbg [[DBG11:![0-9]+]]
; CHECK-NEXT: [[TMP2:%.*]] = sub i64 [[DST_SIZE]], [[SRC_SIZE]], !dbg [[DBG11]]
-; CHECK-NEXT: [[TMP3:%.*]] = select i1 [[TMP1]], i64 0, i64 [[TMP2]], !dbg [[DBG11]]
+; CHECK-NEXT: [[TMP3:%.*]] = select i1 [[TMP1]], i64 0, i64 [[TMP2]], !dbg [[DBG11]], !prof [[SELPROF:![0-9]+]]
; CHECK-NEXT: [[TMP4:%.*]] = getelementptr i8, ptr [[DST]], i64 [[SRC_SIZE]], !dbg [[DBG11]]
; CHECK-NEXT: call void @llvm.memset.p0.i64(ptr align 1 [[TMP4]], i8 [[C]], i64 [[TMP3]], i1 false), !dbg [[DBG11]]
; CHECK-NEXT: call void @llvm.memcpy.p0.p0.i64(ptr [[DST]], ptr @C, i64 [[SRC_SIZE]], i1 false), !dbg [[DBG12:![0-9]+]]
@@ -29,6 +29,7 @@ define void @test_constant(i64 %src_size, ptr %dst, i64 %dst_size, i8 %c) !dbg !
; Validate that the memset is mapped to DILocation for the original memset.
; CHECK: [[DBG11]] = !DILocation(line: 1,
+; CHECK: [[SELPROF]] = !{!"unknown", !"memcpyopt"}
; CHECK: [[DBG12]] = !DILocation(line: 2,
; CHECK: [[DBG13]] = !DILocation(line: 3,
@@ -50,3 +51,4 @@ define void @test_constant(i64 %src_size, ptr %dst, i64 %dst_size, i8 %c) !dbg !
!11 = !DILocation(line: 1, column: 1, scope: !5)
!12 = !DILocation(line: 2, column: 1, scope: !5)
!13 = !DILocation(line: 3, column: 1, scope: !5)
+!14 = !{!"function_entry_count", i32 10}
\ No newline at end of file
|
9821ae7 to
562be61
Compare
e5f8a76 to
32b01bd
Compare
c9da492 to
28594e2
Compare
32b01bd to
cd79889
Compare
cd79889 to
c1e4550
Compare
28594e2 to
99a9fce
Compare
teresajohnson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm with one question below
| Value *SizeDiff = Builder.CreateSub(DestSize, SrcSize); | ||
| Value *MemsetLen = Builder.CreateSelect( | ||
| Ule, ConstantInt::getNullValue(DestSize->getType()), SizeDiff); | ||
| if (auto *SI = dyn_cast<SelectInst>(MemsetLen)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this have a TODO or FIXME? I.e. is there a hope that we would be able to put something other than unknown here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chatted offline with @xur-llvm, actually we do record a histogram of frequency of various kinds of sizes, so some kind of analysis would be possible. Let's see how many of these selects make it in CodeGenPrepare (we have the DEBUG_TYPE to help with that). My suspicion is that we (1) don't do a good VP propagation for memops in the first place, and (2) some of these memops are synthetic (from intrinsics). Currently the profcheck work is about branches (br, switch, select), but we should get back to this.
99a9fce to
b85a685
Compare
c1e4550 to
7e5054e
Compare
b85a685 to
52e98af
Compare
7e5054e to
1f12bc6
Compare
52e98af to
10f16a8
Compare
1f12bc6 to
22d3e3b
Compare
10f16a8 to
acee864
Compare
acee864 to
48c758d
Compare

Issue #147390