Lower automatic suggestions threshold in case of no results #13908

Open
strepon opened this issue Feb 17, 2025 · 5 comments

@strepon

strepon commented Feb 17, 2025

Describe the problem

In automatic suggestions, the threshold for showing them (75, as written here) is sometimes too high, so useful suggestions are omitted. As a workaround, the user can enter the string into the "Translation memory" field and see the search results, where the threshold is lower (10).

Describe the solution you would like

For me, it would make sense to lower the threshold automatically until the first result is found.
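
The idea can be sketched roughly as follows. This is only an illustration, not Weblate's implementation: the names `similarity` and `suggest`, the step size, and the use of `difflib` as the metric are all assumptions, and the cutoffs here are plain similarity percentages (the 97 and 62 defaults are the similarities quoted below for this example string), not Weblate's threshold values.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in percent; difflib is a stand-in, not Weblate's metric."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def suggest(source, memory, start=97.0, floor=62.0, step=5.0):
    """Lower the cutoff step by step until at least one entry matches.

    Returns the cutoff that was finally used and the matching entries,
    best match first. All parameters are hypothetical.
    """
    scored = sorted(
        ((similarity(source, entry), entry) for entry in memory),
        reverse=True,
    )
    cutoff = start
    while True:
        hits = [(score, entry) for score, entry in scored if score >= cutoff]
        # Stop at the first non-empty result, or once the floor is reached.
        if hits or cutoff <= floor:
            return cutoff, hits
        cutoff = max(floor, cutoff - step)


original = ("This setting enables you to export the document as a .pdf file "
            "containing two file formats: PDF and ODF.")
newer = ("This setting enables you to export the document as a .pdf file "
         "containing two file formats: PDF and ODF as an attachment.")

cutoff, hits = suggest(newer, [original, "Export as PDF"])
print(cutoff, [entry for _, entry in hits])
```

The scores are computed once and only the cutoff is relaxed, so lowering it costs nothing extra in this sketch; with a database-backed memory, each lower cutoff would probably mean another query.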

As an alternative, a more convenient and helpful UI could be introduced - e.g. a button which would perform the search without the need to copy and paste the string manually (or a button which would copy the string to the "Translation memory" field).

Describe alternatives you have considered

No response

Screenshots

An example from The Document Foundation instance of Weblate:

There are strings:

  • original: "This setting enables you to export the document as a .pdf file containing two file formats: PDF and ODF."
  • and newer "This setting enables you to export the document as a .pdf file containing two file formats: PDF and ODF as an attachment."

differing only in the last "as an attachment" text.

However, the original string is not shown in the suggestions for the newer one (but translators would expect it):

Image

If using the search, the string is there:

Image

(This is consistent with the thresholds: for these strings, thresholds 75 and 10 correspond to similarities of 97 % and 62 %, and the suggestion has 85 %, so it passes only the lower one.)

Additional context

No response

@tomkolp
Contributor

tomkolp commented Feb 17, 2025

We have this problem too, especially with very long strings. A manual search button with a lower threshold would be ideal.

@nijel
Member

nijel commented Feb 18, 2025

#13841 should make the matching better for longer strings, even though it was more targeted at strings containing markup. I see TDF is still on 5.9, so the improvement might be there once the server is upgraded to 5.10.

@tomkolp did you observe any change in this recently? You are probably already running it.

@strepon
Author

strepon commented Feb 18, 2025

OK, let's wait to see this in 5.10 (I didn't realize that #13841 affects all strings with sentences; it would be great if it helps).

@nijel
Member

nijel commented Feb 18, 2025

It does, as it removes all punctuation and spaces from the calculation. This will always be a trade-off between result quality and performance. In the long term, I think we will have to migrate the translation memory outside the database and use OpenSearch for it (or something entirely different; I haven't really investigated other options).
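
As a rough illustration of that effect (not the actual code from #13841; the `normalize` helper and the `difflib` metric are assumptions made for this sketch), stripping punctuation and spaces before comparing could look like this:

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Drop punctuation, whitespace and underscores; keep letters and digits."""
    return re.sub(r"[\W_]+", "", text)


def similarity(a: str, b: str) -> float:
    """Similarity in percent; difflib is a stand-in metric."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


a = "two file formats: PDF and ODF."
b = "two file formats - PDF, and ODF"

print(similarity(a, b))                        # raw strings
print(similarity(normalize(a), normalize(b)))  # punctuation and spaces ignored
```

Strings that differ only in punctuation or spacing become identical after normalization, so they compare as a full match.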

@strepon
Author

strepon commented Feb 23, 2025

After testing it in 5.10, I am confused: it seems that suggestions are not provided even for high similarities, e.g. in this case, which I think worked previously:

Image
