An attempt at an open-source version of the Logseq Sync service, intended for individual, self-hosted use.
It's vaguely functional (see What Works? below), but decidedly pre-alpha software. Definitely don't point a real, populated Logseq client at it; I have no idea what will happen.
Right now, the repo contains (in cmd/server) a mostly implemented version of the Logseq API, including credentialed blob uploads, signed blob downloads, and a SQLite database for persistence. Most of the API surface is at least partially implemented.
Currently, running any of this requires a modified version of the Logseq codebase (here), and the @logseq/rsapi package (here).
On that note, many thanks to the Logseq Team for open-sourcing rsapi recently; it made this project significantly easier to work with.
With a modified Logseq, you can use the local server to:
- Create a graph
- Upload (passphrase-encrypted) encryption keys
- Get temporary AWS credentials to upload your encrypted files to your private S3 bucket
- Upload your encrypted files
And that's basically the full end-to-end flow! The big remaining things are:
- Figuring out the WebSockets protocol
  - I think this is for sending "hey, there's an update" notifications to clients, but I've only been testing with a single client so far.
- Figuring out how/when to increment the transaction (tx) counter
There's some documentation for the API in docs/API.md. This is the area where I could benefit the most from more information or help; see Contributing below.
The real Logseq Sync API gets temporary S3 credentials and uploads files directly to S3. I haven't looked closely enough to see if we can swap this out for something S3-compatible like s3proxy or MinIO; see #2 for a bit more discussion.
Currently, amazonaws.com is hardcoded in the client, so that'll be part of a larger discussion on how to make all of this configurable in the long run.
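As a rough sketch of what "configurable" could mean here, the snippet below builds an object URL from an optional endpoint override instead of a hardcoded amazonaws.com host. It's written in Go purely for illustration; the real hardcoding lives in the Logseq client, and the ObjectURL function, the S3_ENDPOINT variable, and the bucket and key names are all hypothetical. With no override it produces an AWS virtual-hosted-style URL, and with one it falls back to the path-style URLs that S3-compatible servers like MinIO typically accept.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// ObjectURL (hypothetical) returns the URL for an object in a bucket.
// If endpoint is empty, it uses AWS virtual-hosted-style addressing;
// otherwise it uses path-style addressing against the given endpoint.
func ObjectURL(endpoint, region, bucket, key string) string {
	if endpoint == "" {
		return fmt.Sprintf("https://%s.s3.%s.amazonaws.com/%s", bucket, region, key)
	}
	// Trim trailing slashes so "http://host:9000/" and "http://host:9000"
	// produce the same URL.
	return fmt.Sprintf("%s/%s/%s", strings.TrimRight(endpoint, "/"), bucket, key)
}

func main() {
	// Assumption: the override comes from an environment variable,
	// e.g. S3_ENDPOINT=http://localhost:9000 for a local MinIO.
	endpoint := os.Getenv("S3_ENDPOINT")
	fmt.Println(ObjectURL(endpoint, "us-east-1", "my-logseq-bucket", "graph-1/pages/todo.md"))
}
```

A real change would also have to cover request signing and credentials, which this sketch ignores.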
Being able to connect to a self-hosted sync server requires some changes to Logseq as well, namely to specify where your sync server can be accessed. Those changes are in a rough, non-functional state here: https://github.com/logseq/logseq/compare/master...bcspragu:logseq:brandon/settings-hack
The self-hosted sync backend has rudimentary support for persistence in a SQLite database. We use sqlc to do Go codegen for SQL queries, and Atlas to manage generating diffs.
The process for changing the database schema looks like:
- Update db/sqlite/schema.sql with your desired changes
- Run ./scripts/add_migration.sh <name of migration> to generate the relevant migration
- Run ./scripts/apply_migrations.sh to apply the migrations to your SQLite database
With this workflow, the db/sqlite/migrations/ directory is more or less unused by both sqlc and the actual server program. The reason it's structured this way is to keep a more reviewable audit log of the changes to a database, which a single schema.sql doesn't give you.
If you're interested in contributing, thanks! I sincerely appreciate it. That said, at this stage in the project, there's not a lot of work that can be parallelized; I need to clean some stuff up before people can start implementing features, fixing bugs, etc.
One area where I would love help is specifying the official API more accurately. My API docs are based on a dataset of one: my own account. So there are areas that are underspecified, unknown, or where I just don't understand the flow. Any help there would be great!