Skip to content

Creating infallible queries #94

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
thehiddenwaffle opened this issue May 8, 2025 · 14 comments
Open

Creating infallible queries #94

thehiddenwaffle opened this issue May 8, 2025 · 14 comments

Comments

@thehiddenwaffle
Copy link
Contributor

thehiddenwaffle commented May 8, 2025

Let me start by saying this crate is awesome!

Use case similar to https://github.com//issues/92

I am working on a project that will involve using json paths, and those paths will be coming from a huge set of static strings. Because I have this set of static strings I would like to be able to perform compile time checks on them to ensure valid json path queries so that I don't have to deal with fallibility in the downstream code. To this end my goal is to simply make a macro that converts them to JpQuery structs, throwing a compiler error if any would fail to be parsed by parse_json_path.

I started looking in the source code and found js_path_process, which I could use in a new trait roughly as follows:

pub trait JsonPathCompiled: JsonPath {
    fn query_with_path_checked(&self, q: &JpQuery) -> Vec<QueryRef<Self>> {
        js_path_process(q, self).expect("When can this fail??")
    }

    fn query_only_path_checked(&self, q: &JpQuery) -> Vec<QueryPath> {
        js_path_process(q, self).expect("When can this fail??").into_iter()
            .map(|r| r.path())
            .collect::<Vec<_>>()
    }

    fn query_checked(&self, q: &JpQuery) -> Vec<&Self> {
        js_path_process(q, self).expect("When can this fail??").into_iter()
            .map(|r| r.val())
            .collect::<Vec<_>>()
    }
}
impl JsonPathCompiled for serde_json::Value {}

To my understanding the only notion success or failure in the original RFC for a valid path is either [some_stuff] for a match or the empty array [] for no matches found.

After reading #92, my real question becomes: is the Err match arm reachable code and under what circumstances would that arm be taken? I tried to dig into the source code but I'm still a bit new to rust and even a simple example would help a lot, if needed then maybe I can just "ban" that edge case with macros somehow.

@thehiddenwaffle
Copy link
Contributor Author

I should add that normally I wouldn't bother the creator of a crate with something that's so off topic to the original concept of the crate but I was thinking of creating a downstream crate off of these concepts, or if this is something that you would like to integrate into this crate I would be more than happy to do a pull request.

@thehiddenwaffle
Copy link
Contributor Author

#92 is very similar, though I think my use case and many others would benefit from the ability to run pre-parsed queries without the possibility of failure.

@besok
Copy link
Owner

besok commented May 9, 2025

Thank you for good words!

You can achive the compilation check using procedural macros.
The idea is interesting and definitely can be benefitial. If you want to tackle it and provide a pr it would be great and very welcomed.

Regarding examples of the error cases you can take a look into rfc9535 folder (it is a test suite for js path rfc doc) and especially here so all cases with the invalid_selector = true will be examples of how to handle it.

@thehiddenwaffle
Copy link
Contributor Author

Excellent, despite being relatively new to Rust I have been quite obsessed with how much proc macros can do and have written quite a few, though I can't promise it will be the most clean code on the planet.

My thought was to craft it to look something like

let q: JpQuery = json_query!($[?$.event.userId == $.context.id]);

My rationale for not using quotes is that the token tree used by JsonPath to me appears to be a strict subset of the rust one(and also doesn't seem to require any whitespace?). One of the benefits of not using quotes would be allow for the macro to have spanned errors at some point in the future that point straight to the invalid token.

@thehiddenwaffle
Copy link
Contributor Author

I guess I'm still curious as to what situation could cause the actual running of the query to fail like (slightly later in that test)[https://github.com/besok/jsonpath-rust/blob/244188496a5752be7e1bc19b2a9dd98736660ed3/rfc9535/src/suite.rs#L74]. If we could define that situation I could create a second macro that would not allow the "bad" situation and the crate could feature a query_unchecked(q: SafeJpQuery) -> Vec<QueryRef<'a, T>> that can safely guarantee a value

@thehiddenwaffle
Copy link
Contributor Author

Also the standard for most crates that I've seen seems to be adding the proc macros as {crate_name}_impl or {crate_name}_derive at the crate root, is there something you would prefer over that?

@besok
Copy link
Owner

besok commented May 9, 2025

My thought was to craft it to look something like ...

It looks a bit challenging but feel free to try. The main pitfall here is you need to reproduce the parsing process on the level where macro operates.
Anyway, don't hesitate to use this issue thread for that.

Also the standard for most crates that I've seen seems to be adding the proc macros as {crate_name}_impl or {crate_name}_derive at the crate root, is there something you would prefer over that?

yeah, it looks good.

I guess I'm still curious as to what situation could cause the actual running of the query to fail like

I think, this is just a safe guard in case if the new tests will be added to the suite and the parser does not cover it. I used to rely on it when I move the lib to comply to the standard recently.

But you can print the errors from invalid case like that:
Replace this if else to this:

       if jspath.is_ok() {
            Err(TestFailure::invalid(case))
        } else {
           if let Err(e) = jspath {
                println!("reason: {}", e);
                println!("selector: {}", case.selector);
            }
            Ok(())
        }

and run main.rs in rfc9535

it will print the reason from parse_json_path and the js path like that:

reason: Invalid json path: Invalid function expression `count   (@.*)`
selector: $[?count      (@.*)==1]

reason: Invalid json path: Invalid number of arguments for the function `match`: got 1
selector: $[?match(@.a)==1]

reason: Invalid json path: Invalid argument for the function `count`: expected a node, got a literal
selector: $[?count('string')>2]

reason: Invalid json path: Invalid number of arguments for the function `match`: got 3
selector: $[?match(@.a,@.b,@.c)==1]

selector: $[::-0]
reason: Failed to parse rule:  --> 1:5
  |
1 | $[::-01]
  |     ^---
  |
  = expected int

@thehiddenwaffle
Copy link
Contributor Author

thehiddenwaffle commented May 9, 2025

Alright so I'm starting on this and I have run into an issue of circular dependencies because in order for downstream crates to access the macros the main crate will need to have the proc macro crate as a dependency, however in order for the proc macro crate to use the functions such as parse_json_path() it will need to import the main source.

Is there a common way of dealing with this? I haven't had a similar situation come up in any macros I've written before because the parsing logic is usually not attached to the crate it's parsing for.

I think I have 2 possible solutions for this that I've thought of so far:

1. Pull the types and methods required by both(parse_json_path(), JsonPathError, JSPathParser, next_down(), jp_query()) out into a common crate that both of the other crates can depend on.
2. Use something like https://crates.io/crates/pest-ast to confirm that there won't be test errors then replicate some of the additional checks done in the main crate like the 9007199254740991 index check and such. Feels like this would be too much code replication and would not perfectly match up with the crate's current implementation.

EDIT: the more that I look at it I'm not sure if either of these solutions is good enough, they would either result in a lot of refactoring or a lot of new code

@besok
Copy link
Owner

besok commented May 9, 2025

The function-like macro can be created in the same crate. You can simply include it pub into this module

Speaking of the parsing problem, it is what i pointed out in my previous comment namely, you basically need to reproduce parsing logic in macro itself. It is possible but challenging.

You can probably experiment with proc-macro (for custom types) like

#[json_path]
struct SomeStruct(String);

or

#[derive(JsonPath)]
struct SomeStruct{
   #[json_path]
    query:String

}

And here you can use the default parser

@thehiddenwaffle
Copy link
Contributor Author

Honestly I may not have enough skill with "normal" declarative macros(if it's even possible?), feels like being able to use syn::Parse(which of course requires a separate proc-macro package) is the only way I personally will be able to make this happen. To that end I wonder about parsing either input tokens or input strings with syn or pest-ast respectively, then provide infallible conversions from the more explicit syn structs to JPQuery. This would allow the package to stay almost completely the same with some additional impl From<SelectionAST> for Selection{} that could be locked behind a feature by default. Also in order to minimize validation logic which pest-ast doesn't play nice with I have opened #95

@besok
Copy link
Owner

besok commented May 13, 2025

it is possible in compile time, for instance, in the following construction having a proc-macro:

#[json_path]
struct JsPath(String);
impl JsPath{
   fn new<T:Into<String>>( v:T){
        JsPath(v.into()) 
   }
}

fn some_fn(){
    let js = JsPath("invalid js path"); // here the error will be already while typing in editor
}

You can fiddle around with syn::Parse and pest-ast and maybe it will help. On the other hand it boils down to inability to reuse the parser (maybe with pest-ast it is possible, idk tbh)

@thehiddenwaffle
Copy link
Contributor Author

So I have a solution that seems to be working but I need to test it much more extensively.

There are some details:

  1. So far It's over 1200 lines of parser code. It should be noted that this AST could potentially completely replace the current datatypes as it is a strict superset. It doesn't have to be this way though, as I said I can just provide impls for impl From<JpQueryAST> for JpQuery{...} or the macro can directly translate to JpQuery::new(Segments::new(......so much more......)) using ToTokens.
  2. Using pest-ast/from-pest it is able to both parse token streams at compile time and raw strings at runtime, but those are two separate processes, granted that detail is not a huge worry to me because the test suite for that is as simple as taking all the RFC strings and parsing them with pest and then syn and checking that the results are strictly equal.

Basically I guess I'm asking, do you want some amount of all of this in the same crate? I think there are plenty of things that this crate could leverage by getting an AST for free, however I would understand if you feel that it's too much to pay for what many would consider scope creep. One option would be to just have feature locked dependency to this proc macro crate and have them be similar but separate entities.

I'm in this for the long haul in terms of maintenance because this crate is the bread and butter for an application I'm designing but I also understand that these are just the words of some random person on the internet from your perspective.

@besok
Copy link
Owner

besok commented May 15, 2025

Basically I guess I'm asking, do you want some amount of all of this in the same crate? I think there are plenty of things that this crate could leverage by getting an AST for free

I think it would be beneficial. You can roll out the pr, and we can analyze it together.

One small thing is to ensure that all tests pass and tests in rfc folder (running main) remain the same state :)

@thehiddenwaffle
Copy link
Contributor Author

Absolutely, I'll probably have a separate test suite for the macro(at least while the macro is still new, maybe this suite could be less comprehensive once it's stable) and will run all suites before creating a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants