Searching for compound words doesn't appear to work #484

Open
vr8hub opened this issue Mar 13, 2025 · 3 comments

@vr8hub

vr8hub commented Mar 13, 2025

I just tried searching for Haycraft-Queen on the Books page, and it returned a bunch of books that had "Queen" somewhere, but none of the Haycraft-Queen series books.

I then tried it with quotes around it, "Haycraft-Queen", and it returned no books.

I then tried just Haycraft, and that didn't return anything, either.

The combination of those doesn't make sense to me: it seems that it's splitting the two words and searching for them individually, but it's not finding either of them in the "Haycraft-Queen" series name?

I then tried Haycraft-Queen Cornerstones, and it did return the Haycraft-Queen items (presumably because of Cornerstones), but it also returned anything with just queen in it (and, I'm assuming, anything with just cornerstones in it, but there were 18 pages and I didn't go through them all :) ). Putting quotes around the whole thing, "Haycraft-Queen Cornerstones", again returned nothing.

Pulling up one of the series' books and clicking on the collection name does return all the books in the collection, but surely we should be able to search for the collection name to find them?

I would think the interface should search for a compound word as a whole before it searches for its component parts?

Also, shouldn't we be able to put quotes around something to have it search for the entire term rather than individual pieces of it? In this case I don't want the "queen"s, I just want the "Haycraft-Queens."

@acabal
Member

acabal commented Mar 13, 2025

@colagrosso

@colagrosso
Collaborator

Thanks for the report, Vince, and sorry for the trouble.

I'll take a closer look, but I believe the problem is that:

  1. We remove hyphens from the user's search query here
  2. We also remove hyphens from the collection title in the index here

I'll test this out to see if there are other problems. The fix will require code changes to the lines above and re-indexing the IndexableCollections column.
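To make the hyphen discussion concrete, here is a minimal sketch of two ways a normalizer could treat hyphens. It is not the site's actual code: `strip_hyphens`, `hyphens_to_spaces`, and `words` are hypothetical helpers, and whole-word matching is an assumption made only for illustration.

```python
import re

def strip_hyphens(text: str) -> str:
    """Hypothetical normalizer: delete hyphens, fusing the two words."""
    return text.replace("-", "").lower()

def hyphens_to_spaces(text: str) -> str:
    """Hypothetical alternative: turn hyphens into spaces, keeping both words."""
    return text.replace("-", " ").lower()

def words(text: str) -> set[str]:
    """Split normalized text into a set of whole words."""
    return set(re.findall(r"\w+", text))

title = "Haycraft-Queen Cornerstones"

fused = words(strip_hyphens(title))      # hyphen deleted
split = words(hyphens_to_spaces(title))  # hyphen replaced by a space

print(sorted(fused))  # ['cornerstones', 'haycraftqueen']
print(sorted(split))  # ['cornerstones', 'haycraft', 'queen']
```

Under these assumptions, a whole-word lookup for `haycraft` or `queen` finds nothing in the fused form, while the space-separated form keeps both words searchable. Whether the real index behaves this way still needs the testing mentioned above.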

Regarding your question about quotes, we recently (Feb 2025) enabled support for searching with quotes. For example, searching for "war and peace" works:

https://standardebooks.org/ebooks?query=%22war+and+peace%22

So after we fix the problem with hyphens, your "Haycraft-Queen" example should work, too.
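As a rough illustration of what quoted search implies, the sketch below decodes the `query` parameter from the URL above and separates quoted phrases from bare terms. The functions `parse_query` and `matches` are hypothetical and only show the general idea; the site's real implementation may differ.

```python
import re
from urllib.parse import urlsplit, parse_qs

def parse_query(query: str) -> tuple[list[str], list[str]]:
    """Split a search string into quoted phrases and unquoted terms."""
    phrases: list[str] = []
    terms: list[str] = []
    # Each match is either a double-quoted run or a run of non-space characters.
    for quoted, bare in re.findall(r'"([^"]+)"|(\S+)', query):
        if quoted:
            phrases.append(quoted.lower())
        else:
            terms.append(bare.lower())
    return phrases, terms

def matches(title: str, query: str) -> bool:
    """True if every phrase is a substring of the title and every term is a word in it."""
    phrases, terms = parse_query(query)
    lowered = title.lower()
    title_words = set(re.findall(r"[\w-]+", lowered))
    return all(p in lowered for p in phrases) and all(t in title_words for t in terms)

url = "https://standardebooks.org/ebooks?query=%22war+and+peace%22"
raw = parse_qs(urlsplit(url).query)["query"][0]  # '"war and peace"'

print(matches("War and Peace", raw))                                  # True
print(matches("Haycraft-Queen Cornerstones", '"Haycraft-Queen"'))     # True
print(matches("The Queen of Spades", '"Haycraft-Queen"'))             # False
```

In this sketch the hyphen is kept inside a quoted phrase, so `"Haycraft-Queen"` matches only titles containing the full compound and not titles that merely contain "Queen".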

@vr8hub
Author

vr8hub commented Mar 14, 2025

Ignoring whether we should eliminate hyphens for a second, if they're eliminated both from the query and from the index, why doesn't the query work? IOW, if the first is turning my query into "HaycraftQueen" and the second is turning the index into "HaycraftQueen," then why isn't the former finding the latter?

Back to the hyphens, and this might be a bigger can of worms that shouldn't be opened. :)
Searching for "queen" finds anything that has queen in it wherever we're searching (I don't know all the places we're searching, but that's not pertinent to the discussion). My expectation, which again may be unreasonable, is that searching for queen would find the queen in Haycraft-Queen, because it's a "word" within that phrase. Same for Haycraft.

Thus, if we had an author with a compound name Alpha-Beta, I would also expect to find him when searching for "Alpha" or "Beta". Or Alpha-Beta.
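The expectation above can be sketched as an index that stores each hyphenated compound both whole and as its component words. This is only an illustration of the requested behavior; `index_tokens` and `search` are hypothetical names, and the sample names are made up.

```python
import re

def index_tokens(text: str) -> set[str]:
    """Index each hyphenated compound as a whole and as its component words."""
    tokens: set[str] = set()
    for word in re.findall(r"\w+(?:-\w+)*", text.lower()):
        tokens.add(word)                # whole form, e.g. "alpha-beta"
        tokens.update(word.split("-"))  # parts, e.g. "alpha" and "beta"
    return tokens

def search(query: str, names: list[str]) -> list[str]:
    """Return the names whose token set contains the lowercased query."""
    q = query.lower()
    return [name for name in names if q in index_tokens(name)]

names = ["Alpha-Beta", "Queen Victoria", "Haycraft-Queen Cornerstones"]

print(search("Alpha", names))           # ['Alpha-Beta']
print(search("Beta", names))            # ['Alpha-Beta']
print(search("Alpha-Beta", names))      # ['Alpha-Beta']
print(search("queen", names))           # ['Queen Victoria', 'Haycraft-Queen Cornerstones']
print(search("Haycraft-Queen", names))  # ['Haycraft-Queen Cornerstones']
```

With this scheme a search for the compound returns only the compound, while a search for either part still finds it, at the cost of a slightly larger index.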

Again, maybe that's unreasonable. Or not in scope. Consider this the ramblings of a mad man. :)
