update to latest
MoserMichael committed Mar 12, 2022
1 parent e69e07e commit efa9ae8
Showing 5 changed files with 23,658 additions and 11,826 deletions.
14 changes: 14 additions & 0 deletions README.md
@@ -61,6 +61,8 @@ Projects like this usually end up with a number of processing steps, and it's be
- Note:
- a special run is required after we have finished amending the description_cache.json files

----

- Command: ./build_cats.py –cache –timeout TIMEOUT_SEC
- Input file: description_cache.json
- Output file: description_cache.json
@@ -72,24 +74,35 @@ Projects like this usually end up with a number of processing steps, and it's be
- Purpose: like the previous command, but uses the Firefox browser via the selenium package.
- Some hosts can’t be scanned by the regular scanner (example: Cloudflare and other DDoS protection mechanisms involve several HTTP redirects, where JavaScript code is run to determine the next step; the HTML is therefore fetched by automating the browser)
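
A minimal sketch of the browser-based fetch described above, assuming the `selenium` package and a local Firefox/geckodriver install. The function names `make_firefox_driver` and `fetch_html` are hypothetical and not taken from the repository; only the general approach (load the page in a real browser, then read the resulting HTML) comes from the README.

```python
def make_firefox_driver(timeout_sec):
    """Create a headless Firefox driver (needs selenium and geckodriver)."""
    # Imported lazily so fetch_html can be used and tested without selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")
    driver = webdriver.Firefox(options=options)
    # Give up on pages that take longer than timeout_sec to load.
    driver.set_page_load_timeout(timeout_sec)
    return driver


def fetch_html(url, driver):
    """Load url in the browser and return the HTML the browser currently holds."""
    driver.get(url)
    return driver.page_source
```

Typical use would be `driver = make_firefox_driver(30)`, then `fetch_html(url, driver)` per host, and `driver.quit()` at the end. Pages that finish their JavaScript checks only after the initial load may additionally need an explicit wait before `page_source` is read.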

----

- Command: build_geoip.py
- Input file: description_cache.json
- Output file: description_cache.json
- Purpose: set geoip_lan attribute based on geo-ip lookup of host name
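
As a rough illustration of this step, the sketch below resolves the host name of an entry and stores a country code on it. It assumes entries are plain dicts and that `lookup_country` is a caller-supplied function wrapping some geo-IP database; both are assumptions made for the example, not details from build_geoip.py.

```python
import socket
from urllib.parse import urlparse


def host_of(base_url):
    """Extract the host name from a URL such as 'https://example.com/path'."""
    return urlparse(base_url).hostname


def set_geoip(entry, base_url, lookup_country, resolve=socket.gethostbyname):
    """Resolve the host and store the looked-up country code on the entry.

    lookup_country is a hypothetical callable (IP string -> country code).
    Returns True if the attribute was set, False if the host did not resolve.
    """
    try:
        ip_addr = resolve(host_of(base_url))
    except OSError:
        # socket.gaierror is a subclass of OSError; leave the entry untouched.
        return False
    entry["geoip_lan"] = lookup_country(ip_addr)
    return True
```

Passing the resolver and the lookup as parameters keeps the function testable without network access or a geo-IP database.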

----

- Command: build_lang.py
- Input file: ui_text_string.txt
- Output file: description_cache.json
- Purpose: detects the language of the description and sets the language_description attribute (needed for automatic translation, but maybe I would be better off with auto...)
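
The language-detection pass could look roughly like the following. This is a sketch only: the dict-based entry layout and the `detect` callable (text in, language code out) are placeholders, since the README does not say which detection library is used.

```python
def identify_languages(entries, detect):
    """Set language_description on entries that have a description but no language yet.

    entries: dict mapping base_url -> dict (hypothetical layout)
    detect:  hypothetical callable, text -> language code such as 'en'
    Returns the number of entries that were updated.
    """
    num_set = 0
    for base_url, entry in entries.items():
        descr = entry.get("description")
        # Skip hosts without a description and hosts that were already classified.
        if not descr or entry.get("language_description"):
            continue
        entry["language_description"] = detect(descr)
        num_set += 1
    return num_set
```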

----

- Command: build_translate.py –descr
- Input file: description_cache.json
- Output file: description_cache.json
- Purpose: builds automatic translations of the description into the supported set of languages (fills the translations field in each host entry)
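
To make the description-translation step concrete, here is a small sketch under stated assumptions: entries are dicts, and `translate` is a stand-in callable with the same argument order as the `process(from_lang_name, to_lang_name, text)` method visible in build_translate.py. The language list mirrors `supported_languages` in globs.py after this commit.

```python
SUPPORTED_LANGUAGES = ["it", "en", "de", "fr", "ru", "es", "ja", "zh", "uk"]


def build_translations(entry, translate, languages=SUPPORTED_LANGUAGES):
    """Fill entry['translations'] with one translation per missing target language.

    translate is a hypothetical callable: (from_lang, to_lang, text) -> text.
    Languages that are already translated, or that equal the source language,
    are skipped so that reruns do not repeat work.
    """
    source_lang = entry["language_description"]
    text = entry["description"]
    translations = entry.setdefault("translations", {})
    for lang in languages:
        if lang == source_lang or lang in translations:
            continue
        translations[lang] = translate(source_lang, lang, text)
    return translations
```

Skipping existing translations means that adding a new language (such as 'it' in this commit) only triggers translation work for that one language on the next run.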

----

- Command: build_translate.py –uitext
- Input file: ui_text_string.txt
- Output file: ui_text_translated.json
- Purpose: builds automatic translations of each user interface string that appears on the page (not the site descriptions)

----

- Command: build_translate.py –translate
- Input files:
@@ -114,3 +127,4 @@ Now github actions are disabled after a while, when they don't see any action in

Also I am using expect to automate pushing stuff into the repo by the build script (see build subdirectory in this project); sometimes that's a useful tool to know.

Interesting that modern systems have this tendency to evolve into Rube Goldberg Devices; microservices are there by definition, data processing stuff is also there... All by virtue of being divided into small self-contained but interdependent parts...
4 changes: 1 addition & 3 deletions build_lang.py
@@ -34,8 +34,6 @@ def run_identify_language():
             entry_obj.language_description = descr
             cache.cache_set(base_url, entry_obj)
             num_set += 1
-
-    if cache.write_description_cache():
-        print(f"*** description cache changed, number of items set: {num_set}")
+    cache.write_description_cache()
 
 run_identify_language()
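
The removed lines suggest that `write_description_cache()` reports whether anything was written. One way such a cache could behave is sketched below; the class body is an assumption for illustration (only the method names `cache_set` and `write_description_cache` appear in the diff), not the project's actual implementation.

```python
import json


class DescriptionCache:
    """Hypothetical cache that only rewrites its JSON file when entries changed."""

    def __init__(self, path):
        self.path = path
        self.entries = {}
        self.changed = False

    def cache_set(self, base_url, entry):
        # Mark the cache dirty only when the stored value actually differs.
        if self.entries.get(base_url) != entry:
            self.entries[base_url] = entry
            self.changed = True

    def write_description_cache(self):
        """Write the file if needed; return True when a write happened."""
        if not self.changed:
            return False
        with open(self.path, "w", encoding="utf-8") as out:
            json.dump(self.entries, out, indent=2, ensure_ascii=False)
        self.changed = False
        return True
```

With a return value like this, a caller can either log the change (as the old code did) or simply call the method and ignore the result (as the new code does).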
2 changes: 1 addition & 1 deletion build_translate.py
@@ -68,7 +68,7 @@ def process(self, from_lang_name, to_lang_name, text):
 # print('------End--------')
 #
 # return None
-
+#
 class TranslateHelpText:
     def __init__(self, list_of_target_langs):
         self.list_of_target_langs = list_of_target_langs
2 changes: 1 addition & 1 deletion globs.py
@@ -6,7 +6,7 @@ class Globals:
     ui_text_strings = "ui_text_strings.txt"
     ui_text_translated = "ui_text_translated.json"
 
-    supported_languages = [ 'en', 'de', 'fr', 'ru', 'es', 'ja', 'zh', 'uk' ]
+    supported_languages = [ 'it', 'en', 'de', 'fr', 'ru', 'es', 'ja', 'zh', 'uk' ]
     failed_lookups = "failed_lookups.txt"
 
     flag_list = "flag_list.txt"