phelluy/trsltx

trsltx

Tools for automatic translation of texts written with LaTeX.

You first need to obtain a valid API key from https://textsynth.com/ and store it either in a file named api_key.txt in the working directory or in an environment variable:

export TEXTSYNTH_API_KEY=<the_api_key>
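
The key lookup described above could be sketched in Rust as follows. This is only an illustration: the function names `pick_api_key` and `read_api_key` are hypothetical, and the precedence (environment variable first, then api_key.txt) is an assumption, not necessarily what trsltx does.

```rust
use std::{env, fs, path::Path};

/// Hypothetical helper: returns the first non-empty candidate, trimmed of
/// surrounding whitespace. Trying the environment value first is an assumption.
pub fn pick_api_key(env_value: Option<&str>, file_content: Option<&str>) -> Option<String> {
    env_value
        .into_iter()
        .chain(file_content)
        .map(str::trim)
        .find(|k| !k.is_empty())
        .map(str::to_string)
}

/// Hypothetical helper: reads TEXTSYNTH_API_KEY and `api_key.txt` in `dir`,
/// then delegates the choice to `pick_api_key`.
pub fn read_api_key(dir: &Path) -> Option<String> {
    let from_env = env::var("TEXTSYNTH_API_KEY").ok();
    let from_file = fs::read_to_string(dir.join("api_key.txt")).ok();
    pick_api_key(from_env.as_deref(), from_file.as_deref())
}
```

Keeping the choice in a pure function makes it easy to check without touching the real environment or the file system.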

Usage: go to the trsltx directory and run (a working Rust installation is required)

cargo run

By default, the French LaTeX file test/simple.tex is translated into English in test/simple_en.tex.

The languages are specified in the filename by the _xy mark, where xy is the abbreviated language name. Currently, the available languages are: en, fr, es, de, it, pt, ru.
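
As an illustration of the _xy naming convention, here is a minimal Rust sketch that derives test/simple_en.tex from test/simple.tex. The function names `is_lang` and `tagged_name` are hypothetical, and replacing an existing language mark (for example _fr) is an assumption about the intended behavior.

```rust
use std::path::{Path, PathBuf};

/// Language codes listed in this README.
pub const LANGS: [&str; 7] = ["en", "fr", "es", "de", "it", "pt", "ru"];

/// True if `code` is one of the supported language codes.
pub fn is_lang(code: &str) -> bool {
    LANGS.iter().any(|l| *l == code)
}

/// Hypothetical helper: builds `<stem>_<lang>.tex` next to `input`.
/// If the stem already ends with a known `_xy` mark, that mark is replaced
/// (an assumption, not confirmed trsltx behavior).
pub fn tagged_name(input: &Path, lang: &str) -> Option<PathBuf> {
    if !is_lang(lang) {
        return None;
    }
    let stem = input.file_stem()?.to_str()?;
    // Strip an existing language mark such as `_fr`, if any.
    let base = match stem.rsplit_once('_') {
        Some((head, tail)) if is_lang(tail) && !head.is_empty() => head,
        _ => stem,
    };
    Some(input.with_file_name(format!("{}_{}.tex", base, lang)))
}
```

An unsupported code such as `xx` yields `None` rather than a file name.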

To change the default behavior, run for instance

cargo run -- -i fr -o de -f test/simple.tex

Or, to install trsltx in your user account

cargo install --path .
trsltx --help
trsltx -i fr -o de -f test/simple.tex

cargo install is the recommended method: it takes into account bug fixes both in the parser ltxprs and in the translator trsltx.

The translation is performed by a Large Language Model (LLM) available on the Textsynth server. It may contain some LaTeX errors. Therefore, it is essential to review and manually correct the translated code as necessary.

trsltx uses a unique feature of the Textsynth API, which allows a formal BNF grammar to constrain the generated output. See https://textsynth.com/documentation.html#grammar.

The original LaTeX file is split into reasonably short chunks by placing %trsltx-split markers, each alone on its line, in the .tex file. trsltx will complain if a chunk is too long. It is possible to specify a split length with the -l option of trsltx. In the process, an intermediate file test/simple_fr.tex is generated with split markers. For now, the automatic split is not very powerful. It is recommended to adjust the position of the markers manually if the translation is not satisfactory.
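
The marker-based splitting can be sketched in Rust as below. This is a simplified illustration, not the trsltx implementation: the function names are hypothetical, empty chunks are dropped by choice, and the length limit is counted in characters, which is an assumption about the unit used by the -l option.

```rust
/// Marker taken from this README; it must sit alone on its line.
pub const SPLIT_MARKER: &str = "%trsltx-split";

/// Splits `source` into chunks at marker lines. Marker lines themselves are
/// dropped, and chunks containing only whitespace are discarded.
pub fn split_chunks(source: &str) -> Vec<String> {
    let mut chunks = vec![String::new()];
    for line in source.lines() {
        if line.trim() == SPLIT_MARKER {
            chunks.push(String::new());
        } else {
            let current = chunks.last_mut().expect("chunks is never empty");
            current.push_str(line);
            current.push('\n');
        }
    }
    chunks.retain(|c| !c.trim().is_empty());
    chunks
}

/// Returns the indices of chunks longer than `max_len` characters.
/// `max_len` is a hypothetical limit chosen by the caller.
pub fn too_long(chunks: &[String], max_len: usize) -> Vec<usize> {
    chunks
        .iter()
        .enumerate()
        .filter(|(_, c)| c.chars().count() > max_len)
        .map(|(i, _)| i)
        .collect()
}
```

Reporting the indices of oversized chunks, rather than failing on the first one, makes it easier to see where extra markers are needed.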

Each chunk is analyzed using a lightweight parser for a subset of the LaTeX syntax (see ltxprs). A special grammar is generated for each fragment, which encourages the LLM to stick to the original text. This discourages invented labels, references or citations. In addition, LaTeX commands that are not in the original text are less likely to be generated.
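
To make the goal concrete, the following Rust sketch collects the keys used by \label, \ref and \cite in a chunk and reports keys that appear in a translation but not in the original. It is not the grammar mechanism itself, which constrains the LLM during generation; it is only an after-the-fact check, and the function names `reference_keys` and `invented_keys` are hypothetical.

```rust
use std::collections::BTreeSet;

/// Collects the brace arguments of \label, \ref and \cite found in `chunk`.
/// Optional arguments such as \cite[p. 3]{key} are not handled in this sketch.
pub fn reference_keys(chunk: &str) -> BTreeSet<String> {
    let mut keys = BTreeSet::new();
    for cmd in ["\\label{", "\\ref{", "\\cite{"] {
        let mut rest = chunk;
        while let Some(start) = rest.find(cmd) {
            rest = &rest[start + cmd.len()..];
            if let Some(end) = rest.find('}') {
                // \cite may hold several comma-separated keys.
                for key in rest[..end].split(',') {
                    keys.insert(key.trim().to_string());
                }
                rest = &rest[end..];
            }
        }
    }
    keys
}

/// Returns the keys present in `translated` that do not occur in `original`.
pub fn invented_keys(original: &str, translated: &str) -> Vec<String> {
    let allowed = reference_keys(original);
    reference_keys(translated)
        .into_iter()
        .filter(|k| !allowed.contains(k))
        .collect()
}
```

A non-empty result from `invented_keys` points at a label, reference or citation that should be reviewed by hand.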

The grammar feature is disabled for a chunk if the lightweight parser fails on it. If the server returns an error, the chunk is only partially translated. In this case, the translation must be corrected manually.

It is also possible to mark a region that should not be translated with the markers %trsltx-begin-ignore and %trsltx-end-ignore on single lines. Ignored regions should not contain %trsltx-split markers. See the file test/simple.tex for an example.
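
The ignore markers can be handled with a small line-based state machine. The Rust sketch below is an illustration under assumptions: the `Segment` type and the `segment` function are hypothetical, and treating nested or unmatched markers as errors is a design choice of this sketch rather than documented trsltx behavior.

```rust
pub const BEGIN_IGNORE: &str = "%trsltx-begin-ignore";
pub const END_IGNORE: &str = "%trsltx-end-ignore";

/// A piece of the source: either sent to the translator or copied verbatim.
#[derive(Debug, PartialEq)]
pub enum Segment {
    Translate(String),
    Keep(String),
}

/// Separates `source` into translatable and ignored segments.
/// Marker lines are consumed; split markers inside ignored regions are rejected.
pub fn segment(source: &str) -> Result<Vec<Segment>, String> {
    let mut out = Vec::new();
    let mut buf = String::new();
    let mut ignoring = false;
    for (n, line) in source.lines().enumerate() {
        match line.trim() {
            BEGIN_IGNORE if !ignoring => {
                if !buf.is_empty() {
                    out.push(Segment::Translate(std::mem::take(&mut buf)));
                }
                ignoring = true;
            }
            END_IGNORE if ignoring => {
                out.push(Segment::Keep(std::mem::take(&mut buf)));
                ignoring = false;
            }
            BEGIN_IGNORE | END_IGNORE => {
                return Err(format!("line {}: unexpected marker", n + 1));
            }
            "%trsltx-split" if ignoring => {
                return Err(format!("line {}: split marker in ignored region", n + 1));
            }
            _ => {
                buf.push_str(line);
                buf.push('\n');
            }
        }
    }
    if ignoring {
        return Err("unclosed ignored region".to_string());
    }
    if !buf.is_empty() {
        out.push(Segment::Translate(buf));
    }
    Ok(out)
}
```

Split markers outside ignored regions are left in the `Translate` text so that a later splitting step can process them.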

Here are a few tips for improved results:

  • Your initial .tex file must compile without errors, of course. Be careful: the LaTeX compiler sometimes tolerates unpaired braces {...}, which trsltx will not accept.
  • If a part of your initial .tex file is not recognized by the parser, comment it out, remove the temporary file, and restart trsltx.
  • You can define fancy LaTeX macros, but only in the preamble, before \begin{document}.
  • Give meaningful names to your macros to help the translator (e.g. don't name a macro that displays the energy \foo; \energy is a better choice).
  • Don't use alternatives to the following commands: \cite, \label, \ref. Otherwise, the labels, refs and citations may be lost in translation.
  • Avoid using %trsltx-split in the middle of math formulas, {...} groups or \begin ... \end environments.
  • The parser has other limitations (such as verbatim environments). See ltxprs for limitations and possible workarounds.
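
Since unpaired braces are a common cause of rejection, it can help to check them before running trsltx. The Rust sketch below is a rough pre-check, not the ltxprs parser: the function name `check_braces` is hypothetical, and it skips escaped characters and % comments but knows nothing about verbatim environments.

```rust
/// Checks that `{` and `}` are balanced in `source`.
/// On failure, returns the 1-based line number of an unmatched brace.
pub fn check_braces(source: &str) -> Result<(), usize> {
    // Line numbers of the braces that are currently open.
    let mut open_lines: Vec<usize> = Vec::new();
    for (n, line) in source.lines().enumerate() {
        let mut escaped = false;
        for c in line.chars() {
            match c {
                // The character after a backslash is literal (\{, \}, \%, \\).
                _ if escaped => escaped = false,
                '\\' => escaped = true,
                // The rest of the line is a comment.
                '%' => break,
                '{' => open_lines.push(n + 1),
                '}' => {
                    if open_lines.pop().is_none() {
                        return Err(n + 1);
                    }
                }
                _ => {}
            }
        }
    }
    // Any brace still open is reported at the line where it was opened.
    match open_lines.pop() {
        Some(line) => Err(line),
        None => Ok(()),
    }
}
```

Running such a check on each chunk also helps to spot a %trsltx-split marker placed in the middle of a {...} group.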
