Commit ab7e61c

AMDGPU/SILoadStoreOptimizer: Optimize scanning for mergeable instructions

Summary:
This adds a pre-pass to this optimization that scans through the basic block and generates lists of mergeable instructions with one list per unique address.

In the optimization phase instead of scanning through the basic block for mergeable instructions, we now iterate over the lists generated by the pre-pass.

The decision to re-optimize a block is now made per list, so if we fail to merge any instructions with the same address, then we do not attempt to optimize them in future passes over the block. This will help to reduce the time this pass spends re-optimizing instructions.

In one pathological test case, this change reduces the time spent in the SILoadStoreOptimizer from 0.2s to 0.03s.

This restructuring will also make it possible to implement further solutions in this pass, because we can now add less expensive checks to the pre-pass and filter instructions out early which will avoid the need to do the expensive scanning during the optimization pass. For example, checking for adjacent offsets is an inexpensive test we can move to the pre-pass.

Reviewers: arsenm, pendingchaos, rampitec, nhaehnle, vpykhtin

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65961

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373630 91177308-0d34-0410-b5e6-96231b3b80d8
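The two-phase structure described in the summary can be sketched as a small standalone model. This is not the code from SILoadStoreOptimizer.cpp: `Access`, `collectMergeable`, `mergeOnePair` and `optimizeBlock` are hypothetical names chosen for illustration, a string key stands in for the address operands, and "mergeable" is reduced to "same base with adjacent offsets". The legality checks a real pass needs (for example, intervening memory operations) are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a load/store instruction.
struct Access {
  std::string Base; // assumed key standing in for the address operands
  int Offset;       // offset in elements
  int Width;        // number of elements accessed
};

using AccessList = std::vector<Access>;

// Pre-pass: one scan over the block, producing one list per unique base.
std::list<AccessList> collectMergeable(const std::vector<Access> &Block) {
  std::list<AccessList> Lists;
  for (const Access &A : Block) {
    bool Found = false;
    for (AccessList &L : Lists) {
      if (L.front().Base == A.Base) {
        L.push_back(A);
        Found = true;
        break;
      }
    }
    if (!Found)
      Lists.push_back({A});
  }
  return Lists;
}

// Merge at most one pair of accesses whose ranges touch. Returns true on success.
bool mergeOnePair(AccessList &L) {
  for (std::size_t I = 0; I < L.size(); ++I) {
    for (std::size_t J = I + 1; J < L.size(); ++J) {
      Access &A = L[I];
      Access &B = L[J];
      if (A.Offset + A.Width == B.Offset || B.Offset + B.Width == A.Offset) {
        A.Offset = std::min(A.Offset, B.Offset);
        A.Width += B.Width;
        L.erase(L.begin() + J);
        return true;
      }
    }
  }
  return false;
}

// Optimization phase: iterate over the lists, not the block. A list that
// fails to merge anything is retired and never revisited; a list that did
// merge stays in the work set for the next round.
int optimizeBlock(std::list<AccessList> &Lists) {
  int Merges = 0;
  std::list<AccessList> Done;
  while (!Lists.empty()) {
    for (auto It = Lists.begin(); It != Lists.end();) {
      assert(!It->empty() && "the pre-pass never creates empty lists");
      if (It->size() > 1 && mergeOnePair(*It)) {
        ++Merges;
        ++It;
      } else {
        Done.splice(Done.end(), Lists, It++); // retire this list
      }
    }
  }
  Lists = std::move(Done);
  return Merges;
}
```

In this model, a block with three adjacent accesses to one base, a lone access to a second base, and two non-adjacent accesses to a third base yields three lists. Only the first list is revisited after the first round, which is the per-list re-optimization decision the summary describes.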

1 parent 9f29eb7 commit ab7e61c

1 file changed, +185 -82 lines changed

lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
