[Experiment] WASM IPLD Codecs and ADLs #9016
Conflicts: core/corehttp/gateway_handler.go, go.mod, go.sum
It'll be interesting to see how this setup performs when we end up with a wasm ADL of something like UnixFS, where we have multiple recursive iterations of reification as we traverse down a directory tree. I'm worried that jumping in and out of wasm for each ADL evaluation may add up pretty quickly, but I bet in practice this setup works fine for a lot of IPFS use cases.
Very excited for this future! 🚀
if cl, ok := link.(cidlink.Link); !ok {
	return nil, fmt.Errorf("cannot process link: %v", link)
} else {
	block, err := api.blocks.GetBlock(linkContext.Ctx, cl.Cid)
I think there's regret around the structure we ended up with here: links in practice need to be cidlink.Link, but we do this check every time to pull out the Cid.
Instead, I think the pattern ipld-prime has been hoping to transition to is cid.Cast(link.Binary())
Makes sense; I thought this pattern was what we have at the moment. Also, wouldn't cid.Cast do the parsing again?
@@ -0,0 +1,209 @@
---
title: "WAC Specification"
what is 'wac'?
This will get extracted (maybe with a better name) and made into a spec PR to the IPLD repo, but basically I created a new IPLD codec called WAC (WebAssembly codec). The name isn't great; it's just named after the first use case I had in mind.
The idea is basically to have a simple codec that fits the IPLD data model, which both:
a) Gives us something to discuss about what the IPLD data model is
b) Means I can get data in/out of WASM losslessly without requiring tons of calls to extract each integer, string, byte array, etc. manually. I can copy entire nested maps of nested maps of lists and it's fine
registry = &wasmRegistry{}

for _, c := range cfg.Codecs {
	wasm, err := ioutil.ReadFile(c.WasmPath)
how much more annoying would it be to reference the wasm as a cid in the block store rather than an on-disk path?
Wouldn't that lead to some weird attacks, where you could create a format that shows up differently based on some unrepeatable side effects?
(I guess we could configure the WASM interpreter to be deterministic and repeatable: no randomness, no I/O available, and so on.)
> how much more annoying would it be to reference the wasm as a cid in the block store rather than an on-disk path?
Not that much more, I'd like to be able to do that. Some things I think we'd want to do to enable this include:
- A way to protect the WASM blocks from getting cleared by GC (should be pretty easy)
- At least for now, assert a 2MiB limit on WASM blobs and require that they be Raw blocks
- Let the plugin get access to a blockservice
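The size and codec constraints in the list above could be checked before a module is handed to the WASM runtime. This is only a sketch under assumptions: validateWasmBlock and maxWasmBlobSize are hypothetical names that do not appear in this PR, and the caller is assumed to already have the block's multicodec code and bytes in hand. The only external fact relied on is that 0x55 is the multicodec code for raw blocks.

```go
package main

import "fmt"

const (
	// maxWasmBlobSize is the 2 MiB cap suggested in the comment above.
	maxWasmBlobSize = 2 << 20
	// rawCodec is the multicodec code for raw binary blocks.
	rawCodec uint64 = 0x55
)

// validateWasmBlock is a hypothetical pre-load check: it rejects any block
// that is not a raw block or that exceeds the size cap.
func validateWasmBlock(codec uint64, data []byte) error {
	if codec != rawCodec {
		return fmt.Errorf("wasm module must be a raw block, got codec 0x%x", codec)
	}
	if len(data) > maxWasmBlobSize {
		return fmt.Errorf("wasm module is %d bytes, limit is %d", len(data), maxWasmBlobSize)
	}
	return nil
}

func main() {
	// A small raw block passes; a dag-cbor (0x71) block and an oversized
	// raw block are both rejected.
	fmt.Println(validateWasmBlock(rawCodec, make([]byte, 1024)))
	fmt.Println(validateWasmBlock(0x71, make([]byte, 1024)))
	fmt.Println(validateWasmBlock(rawCodec, make([]byte, maxWasmBlobSize+1)))
}
```

GC protection and blockservice access are not covered here, since they depend on go-ipfs internals that this thread does not spell out.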
return nil
}

const fuelPerOp = 10_000_000
maybe make configurable?
ya, makes sense. Just left it there for now, not sure what makes sense for limits. Could certainly allow limits to be per-module or something if people want to mess with things. Could also remove the limits by default 🤷
None of our codecs or ADLs have any sort of limits like this at the moment. It's worth considering what this means in terms of helping users make their code predictably deployable. Wouldn't want people to be surprised that some of their data loaded with implementation A, but not B due to limits here. Obviously people can do whatever they want as with block size limits, but we probably want some reasonably large safe boundaries to let people work with.
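One way the "configurable, possibly per-module" idea could look is sketched below. Everything except the 10_000_000 default is an assumption made for illustration: LimitsConfig, its fields, and FuelFor are hypothetical names, not part of this PR or of the go-ipfs config.

```go
package main

import "fmt"

// defaultFuelPerOp mirrors the constant currently hard-coded in the PR.
const defaultFuelPerOp uint64 = 10_000_000

// LimitsConfig is a hypothetical config section for fuel limits.
type LimitsConfig struct {
	// DefaultFuelPerOp overrides the built-in default when non-zero.
	DefaultFuelPerOp uint64
	// PerModule holds per-module overrides, keyed by module name.
	PerModule map[string]uint64
}

// FuelFor resolves the fuel budget for one module: a per-module override
// wins, then the configured default, then the built-in default.
func (c LimitsConfig) FuelFor(module string) uint64 {
	if fuel, ok := c.PerModule[module]; ok {
		return fuel
	}
	if c.DefaultFuelPerOp != 0 {
		return c.DefaultFuelPerOp
	}
	return defaultFuelPerOp
}

func main() {
	cfg := LimitsConfig{
		PerModule: map[string]uint64{"bittorrent-adl": 50_000_000},
	}
	fmt.Println(cfg.FuelFor("bittorrent-adl"))
	fmt.Println(cfg.FuelFor("some-other-codec"))
}
```

Keeping a generous built-in default, as the comment above suggests, means data that loads on one implementation is less likely to fail on another just because of a local override.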
var fnBuildSel func(comps []string) (builder.SelectorSpec, error)
fnBuildSel = func(comps []string) (builder.SelectorSpec, error) {
- var fnBuildSel func(comps []string) (builder.SelectorSpec, error)
- fnBuildSel = func(comps []string) (builder.SelectorSpec, error) {
+ fnBuildSel := func(comps []string) (builder.SelectorSpec, error) {
Pretty sure the code won't compile if you do that since the function is recursive
let me check
Ok you are right, it's cursed to have a recursive virtual call to a function pointer if you aren't doing DP.
I would move this out as a private function on the global scope.
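For reference, the two options discussed in this thread look roughly like this. With `:=`, the name is not in scope inside the function literal until the declaration completes, so a recursive closure has to be declared with `var` first; a package-level function avoids the issue entirely. The sketch uses strings instead of builder.SelectorSpec so that it runs with only the standard library, and nest and nestViaClosure are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// nest is the package-level form: it wraps each path component around the
// result for the remaining components.
func nest(comps []string) (string, error) {
	if len(comps) == 0 {
		return "match", nil
	}
	if comps[0] == "" {
		return "", errors.New("empty path component")
	}
	inner, err := nest(comps[1:])
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("field(%s, %s)", comps[0], inner), nil
}

// nestViaClosure does the same work with a recursive closure. The variable
// must be declared before the literal is assigned so the body can refer to it.
func nestViaClosure(comps []string) (string, error) {
	var build func(rest []string) (string, error)
	build = func(rest []string) (string, error) {
		if len(rest) == 0 {
			return "match", nil
		}
		if rest[0] == "" {
			return "", errors.New("empty path component")
		}
		inner, err := build(rest[1:])
		if err != nil {
			return "", err
		}
		return fmt.Sprintf("field(%s, %s)", rest[0], inner), nil
	}
	return build(comps)
}

func main() {
	fmt.Println(nest([]string{"a", "b"}))
	fmt.Println(nestViaClosure([]string{"a", "b"}))
}
```

Both forms produce the same result; the package-level function is easier to read and to test on its own, which is the direction suggested above.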
While @aschmahmann is out, I'm not imagining anyone will be driving it forward. I think relevant parties are aware of the code in case it is relevant in the short term.
This PR is an experiment exploring how go-ipfs could leverage IPLD Codecs and ADLs written in WebAssembly.
Some pieces here have nothing to do with WASM and could be extracted:
This allows loading codecs and ADLs via the config file, e.g.
The work describing how to make WASM code that's compatible is being explored in https://github.com/aschmahmann/wasm-ipld. ipld/wasm-ipld#2 is the latest draft.
You can just do something like
cargo build --target wasm32-unknown-unknown --release
in the wasmlib folder to generate the wasm blobs for inclusion and play around with it.
If you want to see custom codecs/ADLs in action, it can be a bit of a pain, since you have to actually write the data somewhere and then import it into go-ipfs (e.g. ipfs block put / ipfs dag put / ipfs dag import). If you want to see a koala picture inside a BitTorrent folder load over a go-ipfs HTTP gateway then:
http://localhost:8080/ipfs/f01631114d55f4390b4e6f5c980ff06340beda9bddd6ff926?selector=bafyqanvbmf7keyj6ufqwnilcmy7kc2kln5qwyyjonjygpilbf2qgeyltozrgs5dun5zhezloor3dcllenfzgky3un5zhs
(run ipfs dag get on that selector to see what it looks like in dag-json)