Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Respect structured field types (YAML/JSON) in search #5490

Open
GiantCrocodile opened this issue Aug 17, 2023 · 10 comments
Open

Respect structured field types (YAML/JSON) in search #5490

GiantCrocodile opened this issue Aug 17, 2023 · 10 comments
Labels
type: enhancement ✨ Suggests an enhancement; improves Kirby

Comments

@GiantCrocodile
Copy link

GiantCrocodile commented Aug 17, 2023

Description

I have a search function to search articles in my blog. This works in general. I just noticed that a specific 2 word search term results in 2 results where I would expect just 1 result.

Search function:

return function ($site) {

  $query   = get('q');
  $results = page('blog')->index()->listed()->search($query, 'title|text|tags');
  $results = $results->paginate(20);

  return [
    'query'      => $query,
    'results'    => $results,
    'pagination' => $results->pagination()
  ];

};

I checked the article.de.txt file that gets found and I see that the metadata field id of the blocks field text contains the search string by chance. From a technical perspective it makes sense that the search function finds this. From an user perspective it doesn’t make sense because the article doesn’t contain the 2 chars long search phrase so this can’t be what the user actually searches for.

Expected behavior
The metadata is being ignored and thus the search results are what the user will expect.

Your setup

v3.9 and v4.0 alpha

Additional context
see https://forum.getkirby.com/t/page-search-returns-unexpected-result/29317/2?u=warg

@distantnative distantnative added needs: delay ⏳️ Requires more time, on hold type: bug 🐛 Is a bug; fixes a bug labels Aug 17, 2023
@distantnative
Copy link
Member

Summary for the team

I don't think there is a good way currently to resolve layout and blocks field content first before search, so that their json markup doesn't accidentally is matched. For that, we would need to load the blueprints and check which field is what type. Currently, this would be quite the performance drag.

We could revisit this once our blueprints are refactored and thus not as performance heavy.

Of if anyone has another idea, please add it.

@distantnative distantnative changed the title Page search returns irrelevant pages due to id metadata field of block field containing the searched term by chance Search matches block/layout field metadata Aug 17, 2023
@afbora
Copy link
Member

afbora commented Aug 21, 2023

I've an idea 💡

Actually, the right thing is to read the blueprint as @distantnative said, but this will cause serious performance problems. Instead, I think we can easily solve this issue by specifying the data type in the search as we use ->toStructure(), ->toBlocks() for field output. Here rough example:

 $results = $site->search($query, 'title|adresses:yaml|text:json|tags');

What do you think? We can decide the syntax and data type names later.

@lukasbestle
Copy link
Member

I like the idea. It could even be useful in the long run (parsing the blueprints will still be slower than explicit configuration, even after the refactoring). The syntax is great IMO.

@GiantCrocodile
Copy link
Author

Sounds good and sounds more stable than auto-detecting something. So I think the "needs: delay" tag can be removed.

@lukasbestle lukasbestle added type: enhancement ✨ Suggests an enhancement; improves Kirby and removed needs: delay ⏳️ Requires more time, on hold type: bug 🐛 Is a bug; fixes a bug labels Sep 16, 2023
@lukasbestle lukasbestle changed the title Search matches block/layout field metadata Respect structured field types (YAML/JSON) in search Sep 16, 2023
@distantnative
Copy link
Member

$results = $site->search($query, 'title|adresses:yaml|text:json|tags');

Following the example, I got the $field for text and now my search components sees the type specified as json, what does it do now to retrieve a string it can use for searching?

@afbora
Copy link
Member

afbora commented Sep 16, 2023

I see what you mean. Yes, it can be difficult to extract the relevant field data from JSON. For that may need ->toString() method for each data handler. When the search is called, I think the following code would work: Data::decode($field->value())->toString()

Or it may be better to specify the field type directly, not sure: 'title|adresses:structure|text:blocks|tags'

@lukasbestle
Copy link
Member

We could convert the values into a cleaned string, maybe like so:

  1. First parse the field depending on the data type (Data::decode($value, $type)).
  2. Walk over the array structure, remove typical metadata fields and concatenate all remaining values recursively, ignoring the keys.
  3. Search in that cleaned string

Rough code for 2:

function convert(array $array): string {
    unset($array['id']);
    unset($array['type']);
    // ...

    foreach ($array as $key => $value) {
        if (is_array($value) === true) {
            // recurse
            $array[$key] = convert($value);
        }
    }

    return implode(' ', array_values($array));
}

Or it may be better to specify the field type directly, not sure: 'title|adresses:structure|text:blocks|tags'

I like that. It would allow us to fine-tuned the ignored keys per field type.

@distantnative
Copy link
Member

@lukasbestle I think this is the right direction. However, I would suggest to rather have dedicated search methods in the respective classes (Block, StructureObject) which are called from the component when specified. That way those methods can much better decide what fields can be thrown out for their own datatype.

@lukasbestle
Copy link
Member

Makes sense 👍

@GiantCrocodile
Copy link
Author

GiantCrocodile commented Jan 27, 2024

Is this something that could be part of v4.2 or v4.3?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement ✨ Suggests an enhancement; improves Kirby
Projects
None yet
Development

No branches or pull requests

4 participants