📜 the Great Automatic Nomenclator — The Next Million Names for Archaea and Bacteria


telatin/gan


seqfu logo

GAN: The Great Automatic Nomenclator

The Next Million Names for Archaea and Bacteria

Principle

To generate a large number of new names, we apply a combinatorial approach: starting from two or three sets of curated roots, we produce every possible combination of them while keeping track of each root's grammatical metadata, so that a valid etymology can be drafted for every generated name.
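As a rough illustration of the principle above, the sketch below combines two small sets of roots with `itertools.product` and carries each root's metadata along to draft an etymology string. The root records, the field names (`stem`, `meaning`) and the `combine` helper are toy assumptions made for this example; they are not the repository's actual data model or code.

```python
from itertools import product

# Toy root sets. Field names and values are illustrative assumptions only.
first_roots = [
    {"stem": "Aqui", "meaning": "water"},
    {"stem": "Terri", "meaning": "earth"},
]
second_roots = [
    {"stem": "bacter", "meaning": "rod"},
    {"stem": "monas", "meaning": "unit"},
]


def combine(*root_sets, connector=""):
    """Return every combination of one root per set, with a draft etymology."""
    names = []
    for combo in product(*root_sets):
        # Join the stems, then normalise to a capitalised genus-like name.
        name = connector.join(root["stem"] for root in combo)
        name = name[0].upper() + name[1:].lower()
        # Keep the metadata of every root so the etymology can be drafted.
        etymology = " + ".join(
            f'{root["stem"]} ({root["meaning"]})' for root in combo
        )
        names.append({"name": name, "etymology": etymology})
    return names


candidates = combine(first_roots, second_roots)
for candidate in candidates:
    print(candidate["name"], "<-", candidate["etymology"])
```

With two sets of two roots this yields four candidates; passing a third set to `combine` multiplies the count by the size of that set.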

Gan flowchart

Dependencies

The scripts in this repository require Python (at least 3.6) and these modules:

  • itertools (ships with Python)
  • pandas (>1.0)
  • xlrd (1.2.0)

Genera generator

Two (or three) Excel tables, formatted as shown below, are used to generate the list of combinations in JSON, HTML and LaTeX formats.

Excel input format

Synopsis:

usage: gan-genus.py [-h] -1 FIRST -2 SECOND [-3 THIRD] -o OUTDIR [-p PREFIX] [-c CONNECTOR] [-v]
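The synopsis above marks `-1`, `-2` and `-o` as required and the remaining options as optional. A minimal sketch of how a command line with this shape could be declared with Python's `argparse` follows. The long option names, the help texts, the defaults, and the reading of `-v` as a verbosity switch are all assumptions made for illustration, not taken from the script itself.

```python
import argparse

# Hypothetical parser mirroring the synopsis. Long names, help texts and
# defaults are assumptions, not the script's documented behaviour.
parser = argparse.ArgumentParser(prog="gan-genus.py")
parser.add_argument("-1", "--first", required=True, help="first Excel table of roots")
parser.add_argument("-2", "--second", required=True, help="second Excel table of roots")
parser.add_argument("-3", "--third", help="optional third Excel table of roots")
parser.add_argument("-o", "--outdir", required=True, help="output directory")
parser.add_argument("-p", "--prefix", help="prefix for output file names")
parser.add_argument("-c", "--connector", default="", help="string placed between roots")
parser.add_argument("-v", "--verbose", action="store_true", help="print progress messages")

# Example invocation with placeholder file names.
args = parser.parse_args(["-1", "first.xlsx", "-2", "second.xlsx", "-o", "out"])
print(args.first, args.second, args.third, args.outdir)
```

Because `-1`, `-2` and `-3` are registered as options, `argparse` treats them as flags rather than as negative numbers.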

For full usage and installation instructions, please check the documentation.

Etymology

"The Great Automatic Nomenclator" is a reference to "The Great Automatic Grammatizator", a short story by the British author Roald Dahl [link].