[Tutorial] Compound indexing and query optimization #100

Closed
wants to merge 4 commits into from

HeinrichvonStein

This PR adds a compound indexing and query optimization tutorial.

## Conclusion

By following these best practices and leveraging PowerSync's schema capabilities, developers can achieve significant performance gains and ensure scalability for their applications.
The findings above can be integrated directly into PowerSync's documentation to assist other developers in navigating these challenges effectively.
Collaborator

is this last note meant to be here?

Author

Whoops, copy pasta error. Thanks for catching


## Introduction

This tutorial outlines findings and recommendations based on extensive testing of compound indexes, query execution, and table performance using the [PowerSync Web SDK](https://github.com/powersync-ja/powersync-js).
Collaborator

I'd suggest linking to the SDK reference in the docs, since it's the higher-level starting point (and also includes a link to the source code): https://docs.powersync.com/client-sdk-references/javascript-web

@rkistner
Contributor

rkistner commented Feb 6, 2025

I think this tutorial should be reworked a bit.

The guide focuses on compound indexes, but does not explain (1) what a compound index is, or (2) when to use a compound index versus a single-column index.

The performance benchmarks show the difference between using the index vs a table scan, but do not give any details of how the benchmark was performed (e.g. number of table rows, number of results, the index used). The difference displayed is also quite small (206ms vs 556ms) - indexes can make a much bigger difference in some cases. And once again it doesn't show the difference compared to a single-column index, which could be very small in this case.

The guide mentions that the order of indexed columns affects query performance, and that skipping leading columns affects performance, but it doesn't clarify what this means, or how it affects performance. The examples don't clarify much here.

Overall, I'd recommend:

  1. Start with a guide focusing on single-column indexes, maybe just mentioning compound indexes as a footnote.
  2. Add detail on when the index can be used or not, perhaps also mentioning operators other than =. But keep it simple here - see the point below on the SQLite docs.
  3. Add a guide on testing query performance and viewing the query plan (the EXPLAIN QUERY PLAN bit). The performance timeline may also be useful here.
  4. SQLite already has some great and thorough docs on how the query planner and indexes work here and here. Maybe it's better to just include some basic examples in the doc here, specifically noting how indexes are defined in PowerSync, then link to those for the rest.
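
For illustration, a minimal sketch of points 2 and 3 using plain SQLite through Python's built-in `sqlite3` module, not the PowerSync SDK. The `todos` table, its columns, and the `todos_list_status` index are hypothetical names made up for this example, and how the index would be declared in a PowerSync schema is not shown. It creates a compound index on `(list_id, status)` and asks SQLite for the query plan of three queries: one filtering on both columns, one on the leading column only, and one that skips the leading column.

```python
import sqlite3

# Hypothetical table and index, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE todos (id TEXT PRIMARY KEY, list_id TEXT, status TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO todos VALUES (?, ?, ?, ?)",
    [
        (str(i), f"list-{i % 50}", "done" if i % 2 else "open", f"item {i}")
        for i in range(5000)
    ],
)
# Compound index: list_id is the leading column, status the second.
conn.execute("CREATE INDEX todos_list_status ON todos (list_id, status)")


def plan(sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Both indexed columns constrained with '='.
both = plan("SELECT * FROM todos WHERE list_id = ? AND status = ?", ("list-1", "open"))
# Leading column only: the index prefix is still usable.
leading = plan("SELECT * FROM todos WHERE list_id = ?", ("list-1",))
# Leading column skipped: no usable index prefix.
trailing = plan("SELECT * FROM todos WHERE status = ?", ("open",))

for label, detail in (("both", both), ("leading", leading), ("trailing", trailing)):
    print(f"{label:9s} {detail}")
```

The first two plans should report a SEARCH using `todos_list_status`, while the third should fall back to a SCAN of the table. The exact wording of the plan text varies between SQLite versions, and after running `ANALYZE` the planner may choose differently (for example a skip-scan), so treat this as a starting point for inspecting plans rather than a benchmark.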

@michaelbarnes
Contributor

Thanks for the review, @rkistner. We're going to rework this guide.

@HeinrichvonStein
Author

This needs to be revisited. Closing

4 participants