Commit ab7e61c

AMDGPU/SILoadStoreOptimizer: Optimize scanning for mergeable instructions
Summary:
This adds a pre-pass to this optimization that scans through the basic block and generates lists of mergeable instructions with one list per unique address.

In the optimization phase instead of scanning through the basic block for mergeable instructions, we now iterate over the lists generated by the pre-pass.

The decision to re-optimize a block is now made per list, so if we fail to merge any instructions with the same address, then we do not attempt to optimize them in future passes over the block. This will help to reduce the time this pass spends re-optimizing instructions.

In one pathological test case, this change reduces the time spent in the SILoadStoreOptimizer from 0.2s to 0.03s.

This restructuring will also make it possible to implement further solutions in this pass, because we can now add less expensive checks to the pre-pass and filter instructions out early which will avoid the need to do the expensive scanning during the optimization pass. For example, checking for adjacent offsets is an inexpensive test we can move to the pre-pass.

Reviewers: arsenm, pendingchaos, rampitec, nhaehnle, vpykhtin

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65961

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373630 91177308-0d34-0410-b5e6-96231b3b80d8
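The two-phase structure described in the summary can be illustrated with a small standalone sketch. This is not the code from the patch: `MemAccess`, `collectMergeableLists`, `mergeOnePair`, and `optimizeLists` are hypothetical names, an integer `base` stands in for the "unique address", and the only merge rule modeled is byte-adjacent offsets. The real pass works on machine instructions and applies further legality checks that are omitted here.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>
#include <map>
#include <vector>

// Hypothetical stand-in for a load/store instruction (not an LLVM type).
struct MemAccess {
  int base;    // key standing in for the instruction's address
  int offset;  // byte offset from the base
  int width;   // access width in bytes
};

using MergeList = std::vector<MemAccess>;

// Pre-pass: one scan over the block, building one list per unique base.
// Lists with a single entry can never merge, so they are filtered out early.
std::list<MergeList> collectMergeableLists(const std::vector<MemAccess>& block) {
  std::list<MergeList> lists;
  std::map<int, std::list<MergeList>::iterator> byBase;
  for (const MemAccess& a : block) {
    auto it = byBase.find(a.base);
    if (it == byBase.end()) {
      lists.push_back(MergeList{a});
      byBase[a.base] = std::prev(lists.end());
    } else {
      it->second->push_back(a);
    }
  }
  lists.remove_if([](const MergeList& l) { return l.size() < 2; });
  return lists;
}

// Merge the first pair of byte-adjacent accesses in a list, if there is one.
bool mergeOnePair(MergeList& list) {
  for (std::size_t i = 0; i < list.size(); ++i) {
    for (std::size_t j = i + 1; j < list.size(); ++j) {
      const MemAccess a = list[i];
      const MemAccess b = list[j];
      const bool adjacent = a.offset + a.width == b.offset ||
                            b.offset + b.width == a.offset;
      if (!adjacent)
        continue;
      list[i] = MemAccess{a.base, std::min(a.offset, b.offset),
                          a.width + b.width};
      list.erase(list.begin() + static_cast<std::ptrdiff_t>(j));
      return true;
    }
  }
  return false;
}

// Optimization phase: iterate over the lists instead of rescanning the block.
// The retry decision is made per list: a list that made no progress is
// retired into `done` and is not visited again on later passes.
int optimizeLists(std::list<MergeList>& lists, std::vector<MemAccess>& done) {
  int merges = 0;
  while (!lists.empty()) {
    for (auto it = lists.begin(); it != lists.end();) {
      if (it->size() >= 2 && mergeOnePair(*it)) {
        ++merges;
        ++it;  // progress was made, so keep this list for the next pass
      } else {
        done.insert(done.end(), it->begin(), it->end());
        it = lists.erase(it);
      }
    }
  }
  return merges;
}
```

In this sketch a caller would run `collectMergeableLists` once per block and then hand the result to `optimizeLists`. Cheap filters, such as the adjacent-offset check mentioned above, could be moved into the pre-pass so that lists with no possible merge are never created.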
1 parent 9f29eb7 commit ab7e61c

1 file changed: +185 -82 lines changed
