Remove the jumpabsolute member from mstatement_s. #202

Merged
merged 1 commit into from
Oct 7, 2024

@divVerent divVerent commented Sep 23, 2024

This shrinks the struct from 20 to 16 bytes per statement, and thus may save some RAM. It may also improve CPU cache behavior, since more QC code fits in the L1 and L2 caches, and possibly instruction processing inside the CPU as well, because all statements are then aligned the same way.
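
As a rough illustration of the size change, the sketch below models a statement with and without the extra member. The type names, field types, and the helper `straddles_line` are hypothetical, chosen only to reproduce the 20-byte and 16-byte sizes; the real `mstatement_s` definition may differ, and the 64-byte cache line used in the note afterwards is an assumption about typical hardware.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration; the real mstatement_s may differ. */
typedef struct stmt_before_s {
    uint32_t op;           /* opcode */
    int32_t  operands[3];  /* up to three operands */
    int32_t  jumpabsolute; /* separate jump target (the removed member) */
} stmt_before_t;

typedef struct stmt_after_s {
    uint32_t op;
    int32_t  operands[3];  /* assumption: a jump target now lives in an operand slot */
} stmt_after_t;

static_assert(sizeof(stmt_before_t) == 20, "five 4-byte fields");
static_assert(sizeof(stmt_after_t) == 16, "four 4-byte fields");

/* Returns 1 if element `index` of an array that starts on a cache-line
 * boundary crosses from one cache line into the next, else 0. */
static int straddles_line(size_t elem_size, size_t index, size_t line_size)
{
    size_t first = index * elem_size;     /* offset of the element's first byte */
    size_t last  = first + elem_size - 1; /* offset of the element's last byte */
    return first / line_size != last / line_size;
}
```

With 64-byte lines, every 16-byte statement in a line-aligned array stays within one line, while the 20-byte layout crosses a boundary at index 3 (bytes 60 to 79).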

On srv04, this speeds up Xonotic's serverbench from 58.92s real, 57.72s user to 57.72s real, 56.59s user (median of 25).
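
For scale, the timings above work out to a reduction of roughly 2% in both real and user time. A minimal sketch of that arithmetic follows; the helper name `percent_saved` is made up for this example.

```c
#include <assert.h>

/* Percentage of run time saved when going from `before` to `after` seconds. */
static double percent_saved(double before, double after)
{
    assert(before > 0.0); /* avoid dividing by zero */
    return (before - after) / before * 100.0;
}
```

Calling `percent_saved(58.92, 57.72)` gives about 2.04 for real time, and `percent_saved(57.72, 56.59)` gives about 1.96 for user time.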

Change suggested by @uis246.

@divVerent
Contributor Author

NOTE: I have not yet tested the disassembler changes. Will test those later.

@divVerent
Contributor Author

Xonotic prvm_printfunction server StartFrame

before:

s4: :213: IFNOT      _StartFrame_init (=1), statement 8
s5: :219: IFNOT      _StartFrame (=_StartFrame()), statement 7
s6: :219: CALL0      _StartFrame (=_StartFrame())
s7: :212: RETURN      (=void)
s8: :214: STORE_F    GLOBAL669, _StartFrame_init (=1)
s9: :215: STORE_F    time (=5.57509995), GLOBAL49309
s10: :215: STORE_F    GLOBAL25876, time (=5.57509995)
s11: :216: STORE_F    GLOBAL25878, __spawnfunc_expecting (=0)
s12: :216: FIELD_FNC  world (=entity 0), .__spawnfunc_constructor (=.__spawnfunc_constructor), GLOBAL49310
s13: :216: STORE_ENT  world (=entity 0), GLOBAL4
s14: :216: CALL1      GLOBAL49310
s15: :217: STORE_F    GLOBAL49309, time (=5.57509995)
s16: :213: GOTO       , statement 5
s17: :213: DONE        (=void)

after:

s4: :213: IFNOT      _StartFrame_init (=1), statement 8
s5: :219: IFNOT      _StartFrame (=_StartFrame()), statement 7
s6: :219: CALL0      _StartFrame (=_StartFrame())
s7: :212: RETURN      (=void)
s8: :214: STORE_F    GLOBAL669, _StartFrame_init (=1)
s9: :215: STORE_F    time (=6.01259995), GLOBAL49309
s10: :215: STORE_F    GLOBAL25876, time (=6.01259995)
s11: :216: STORE_F    GLOBAL25878, __spawnfunc_expecting (=2)
s12: :216: FIELD_FNC  world (=entity 0), .__spawnfunc_constructor (=.__spawnfunc_constructor), GLOBAL49310
s13: :216: STORE_ENT  world (=entity 0), GLOBAL4
s14: :216: CALL1      GLOBAL49310
s15: :217: STORE_F    GLOBAL49309, time (=6.01259995)
s16: :213: GOTO       statement 5
s17: :213: DONE        (=void)

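The disassembly looks good too. The only formatting difference is a slight fix for GOTO, which no longer prints a stray comma before the jump target.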

@uis246
Collaborator

uis246 commented Sep 23, 2024

Also, it makes the struct fit into one cache line. Bonus points (and a possible performance boost) for aligned_alloc and alignas(16) or __attribute__((aligned(16))).

@divVerent
Contributor Author

> Also, it makes the struct fit into one cache line. Bonus points (and a possible performance boost) for aligned_alloc and alignas(16) or __attribute__((aligned(16))).

Let's not go there yet; that should probably be a different PR, mainly because the syntax for this is compiler-dependent, so we would likely need to design a wrapper in quakedef.h first.
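
For reference, such a wrapper might look roughly like the sketch below. This is only an illustration of the idea discussed here, not code from the PR: the macro name `DP_ALIGNAS`, the `stmt_t` type, and `stmt_alloc` are all hypothetical, and the MSVC branch is untested. Note that `aligned_alloc` is C11 and is not provided by MSVC's runtime (which offers `_aligned_malloc` instead), so the allocation side would need its own wrapper too.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper macro; not an existing DarkPlaces definition. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#  define DP_ALIGNAS(n) _Alignas(n)                  /* C11 and later */
#elif defined(__GNUC__)
#  define DP_ALIGNAS(n) __attribute__((aligned(n)))  /* GCC, Clang */
#elif defined(_MSC_VER)
#  define DP_ALIGNAS(n) __declspec(align(n))         /* MSVC, untested here */
#else
#  define DP_ALIGNAS(n)                              /* no alignment control */
#endif

/* Hypothetical 16-byte statement. Aligning the first member raises the
 * alignment of the whole struct to 16 bytes. */
typedef struct stmt_s {
    DP_ALIGNAS(16) uint32_t op;
    int32_t operands[3];
} stmt_t;

/* Allocate `count` statements on a 16-byte boundary; release with free().
 * The size is count * 16, a multiple of the alignment as C11 requires. */
static stmt_t *stmt_alloc(size_t count)
{
    return (stmt_t *)aligned_alloc(16, count * sizeof(stmt_t));
}
```

The specifier is placed on the first member because C11 does not allow `_Alignas` on a struct declaration itself, whereas member placement works for both the C11 and the GNU attribute spelling.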

@divVerent divVerent merged commit 59e7840 into master Oct 7, 2024
1 of 2 checks passed