Removal/Sanitization of HTML from translations #24402

Open
lunny opened this issue Apr 28, 2023 · 9 comments
Labels
modifies/translation topic/security Something leaks user information or is otherwise vulnerable. Should be fixed!

Comments

@lunny
Member

lunny commented Apr 28, 2023

Is it possible to inject <script>alert('xss')</script> via a translation string, or is there sanitization to prevent this? It's not a new issue, but I suspect all translations may be vulnerable to attacks like this. Of course, given Crowdin's review process, something like this is unlikely to pass review, but it's good to have defense in depth.

Originally posted by @silverwind in #24397 (comment)

@lunny lunny changed the title from "Is it possible to inject <script>alert('xss')</script> via translation string or is there sanitization to prevent this? It's not a new issues but I suspect all translations may be vulnerable to attacks like this. Of course, given Crowdin's review process, something like this is unlikely to pass review, but it's good to have defense in depth." to "Is it possible to inject <script>alert('xss')</script> via translation string or is there sanitization to prevent this?" Apr 28, 2023
@lunny lunny added the topic/security Something leaks user information or is otherwise vulnerable. Should be fixed! label Apr 28, 2023
@silverwind silverwind changed the title from "Is it possible to inject <script>alert('xss')</script> via translation string or is there sanitization to prevent this?" to "Sanitization of HTML from translations" Apr 28, 2023
@wxiaoguang
Contributor

Also related to #23863

We need our own INI package first; then we can either pre-check or reject malicious translations.

Without our own INI package, we can do the sanitizing when loading the locale assets.
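To make the load-time check concrete, here is a minimal Go sketch of flagging suspicious values while reading locale entries. It is not Gitea's loader: the function name `findSuspiciousKeys`, the simplified `key = value` parsing (section names are ignored), and the pattern list are all assumptions for illustration, and the pattern is far from a complete XSS filter.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// suspicious matches a few obviously dangerous constructs: script-like tags,
// inline event handler attributes, and javascript: URLs.
// This list is an illustrative assumption, not a complete filter.
var suspicious = regexp.MustCompile(`(?i)<\s*(script|iframe|object|embed)\b|\bon[a-z]+\s*=|javascript:`)

// findSuspiciousKeys scans simple "key = value" lines and returns the keys
// whose values match the pattern. Comments, blank lines, and section headers
// are skipped; section names are not tracked in this sketch.
func findSuspiciousKeys(ini string) []string {
	var bad []string
	sc := bufio.NewScanner(strings.NewReader(ini))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, ";") ||
			strings.HasPrefix(line, "#") || strings.HasPrefix(line, "[") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue
		}
		if suspicious.MatchString(value) {
			bad = append(bad, strings.TrimSpace(key))
		}
	}
	return bad
}

func main() {
	locale := "[repo]\n" +
		"ok = Hello <b>world</b>\n" +
		"bad1 = <script>alert('xss')</script>\n" +
		"bad2 = <a href=\"javascript:alert(1)\">x</a>\n"
	fmt.Println(findSuspiciousKeys(locale))
}
```

A loader could reject the whole file or drop only the flagged entries; which is preferable depends on how fallbacks to the default locale are handled.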

@silverwind
Member

silverwind commented Apr 28, 2023

I think we could eliminate all HTML from translations, after which this sanitization could just strip all HTML tags from the translated string.
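For illustration, stripping could look roughly like the following Go sketch. The helper name `stripTags` is hypothetical, and the regular expression is a simplification: a real implementation would more likely use an HTML tokenizer or a sanitizer library. The sketch also assumes translations are plain text, so any `&amp;`-style entities already present would be escaped a second time.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
)

// tagPattern matches anything that looks like an HTML tag.
// A regexp is a rough approximation; it does not understand comments,
// CDATA, or a '>' inside attribute values.
var tagPattern = regexp.MustCompile(`<[^>]*>`)

// stripTags removes tag-like spans and then escapes what is left, so a
// stray '<' or '&' cannot be interpreted as markup when rendered.
func stripTags(s string) string {
	return html.EscapeString(tagPattern.ReplaceAllString(s, ""))
}

func main() {
	fmt.Println(stripTags(`Click <a href="/x">here</a>`))
	fmt.Println(stripTags(`a < b`))
}
```

Escaping after stripping is the defense-in-depth part: even if the pattern misses a malformed tag, the remaining angle brackets are rendered as text.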

@wxiaoguang
Contributor

wxiaoguang commented Apr 28, 2023

It's also a lot of work.

Just like proposing "dropping jQuery" or "dropping Fomantic UI": without a feasible plan and enough time spent on it, I do not think it would end well.

At least we can review more strictly from now on and do our best to keep more HTML from appearing in translations.

@silverwind
Member

HTML removal isn't so hard, see #24397. This can be started now. The only issue is that there are quite a lot of them:

$ rg '<' options/locale/locale_en-US.ini | wc -l
202

@lunny
Member Author

lunny commented Apr 28, 2023

I used a new key so that the old system will not be broken.

@silverwind
Member

silverwind commented Apr 28, 2023

> I used a new key so that the old system will not be broken.

If it's a single-use key, we must use the same key; otherwise the old key will no longer be used and will remain as a "dead" translation entry.

@wxiaoguang
Contributor

wxiaoguang commented Apr 28, 2023

> > I used a new key so that the old system will not be broken.

> If it's a single-use key, we must use the same key; otherwise the old key will no longer be used and will remain as a "dead" translation entry.

Sometimes we backport the locales from main to 1.19. The old key can be removed safely; the removal doesn't affect backporting.

@silverwind silverwind changed the title from "Sanitization of HTML from translations" to "Removal or sanitization of HTML from translations" Apr 28, 2023
@silverwind
Member

Maybe it would also be useful to have a script to detect dead translation entries. Could even run it as a verification on CI.
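A rough Go sketch of what such a detection script might do: given the locale keys and the contents of source and template files, report the keys that are never referenced. The function name `findDeadKeys` and the rule that a key counts as used only when it appears as a double-quoted literal are assumptions, so keys built dynamically would be reported as false positives.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// findDeadKeys returns the keys that never appear as a double-quoted literal
// in any of the given source texts. Keys that are assembled at runtime are
// not detected as used by this simple check.
func findDeadKeys(keys []string, sources []string) []string {
	var dead []string
	for _, key := range keys {
		quoted := `"` + key + `"`
		used := false
		for _, src := range sources {
			if strings.Contains(src, quoted) {
				used = true
				break
			}
		}
		if !used {
			dead = append(dead, key)
		}
	}
	sort.Strings(dead)
	return dead
}

func main() {
	keys := []string{"repo.star", "repo.unstar", "repo.old_notice"}
	sources := []string{
		`{{.locale.Tr "repo.star"}}`,
		`ctx.Tr("repo.unstar")`,
	}
	fmt.Println(findDeadKeys(keys, sources))
}
```

Run on CI, a non-empty result could fail the build, provided there is a way to exempt dynamically constructed keys.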

@wxiaoguang
Contributor

Actually, I also have such a plan. The problem is that some keys are constructed dynamically, so we need some "hint" syntax to tell the linter that "the next line would use the following keys: ...".
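One possible shape for such a hint, sketched in Go. The marker `locale-keys:` and the function `hintedKeys` are invented here for illustration; no such syntax exists in the thread or, as far as this sketch assumes, in the codebase. A linter would add the hinted keys to its set of used keys before reporting dead entries.

```go
package main

import (
	"fmt"
	"strings"
)

// hintMarker is a hypothetical comment marker; everything after it on the
// same line is read as a comma-separated list of locale keys.
const hintMarker = "locale-keys:"

// hintedKeys collects the keys declared via hint comments in a source text.
func hintedKeys(src string) []string {
	var keys []string
	for _, line := range strings.Split(src, "\n") {
		_, rest, ok := strings.Cut(line, hintMarker)
		if !ok {
			continue
		}
		for _, k := range strings.Split(rest, ",") {
			if k = strings.TrimSpace(k); k != "" {
				keys = append(keys, k)
			}
		}
	}
	return keys
}

func main() {
	src := "// locale-keys: action.create_repo, action.rename_repo\n" +
		"msg := ctx.Tr(\"action.\" + opName)\n"
	fmt.Println(hintedKeys(src))
}
```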

@silverwind silverwind changed the title from "Removal or sanitization of HTML from translations" to "Remova/Sanitization of HTML from translations" Apr 28, 2023
@silverwind silverwind changed the title from "Remova/Sanitization of HTML from translations" to "Removal/Sanitization of HTML from translations" Apr 28, 2023
lunny pushed a commit that referenced this issue Sep 2, 2024
Part of #27700

Removes all URLs from translation strings to make them easier to change in
the future and to prevent people from injecting malicious URLs through
translations. First measure as long as #24402 is out of scope.