These are the possible values assigned to token.pos:
POS | Explanation |
---|---|
ADJ | adjective |
ADP | adposition |
ADV | adverb |
AUX | auxiliary verb |
CCONJ | coordinating conjunction |
INTJ | interjection |
NOUN | noun |
NUM | numeral |
PRON | pronoun |
PROPN | proper noun |
PUNCT | punctuation |
SCONJ | subordinating conjunction |
SPACE | space |
SYM | symbol |
VERB | verb |
X | other, e.g. foreign |
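
As a minimal sketch, the table above can be kept as a lookup and applied to any objects that expose `text` and `pos_` attributes, which is how spaCy tokens expose the string form of the tag. The helper names (`POS_EXPLANATIONS`, `describe`, `pos_counts`) are hypothetical, and the sample tokens are hand-tagged stand-ins, not model output.

```python
from collections import Counter
from types import SimpleNamespace

# Explanations copied from the POS table above.
POS_EXPLANATIONS = {
    "ADJ": "adjective",
    "ADP": "adposition",
    "ADV": "adverb",
    "AUX": "auxiliary verb",
    "CCONJ": "coordinating conjunction",
    "INTJ": "interjection",
    "NOUN": "noun",
    "NUM": "numeral",
    "PRON": "pronoun",
    "PROPN": "proper noun",
    "PUNCT": "punctuation",
    "SCONJ": "subordinating conjunction",
    "SPACE": "space",
    "SYM": "symbol",
    "VERB": "verb",
    "X": "other, e.g. foreign",
}


def describe(tokens):
    """Return (text, tag, explanation) for each token-like object."""
    return [
        (t.text, t.pos_, POS_EXPLANATIONS.get(t.pos_, "unknown tag"))
        for t in tokens
    ]


def pos_counts(tokens):
    """Count POS tags, ignoring whitespace tokens."""
    return Counter(t.pos_ for t in tokens if t.pos_ != "SPACE")


# Hand-tagged stand-ins for spaCy tokens (illustrative only).
sample = [
    SimpleNamespace(text="Helsinki", pos_="PROPN"),
    SimpleNamespace(text="on", pos_="AUX"),
    SimpleNamespace(text="kaunis", pos_="ADJ"),
    SimpleNamespace(text=".", pos_="PUNCT"),
]

for text, tag, explanation in describe(sample):
    print(f"{text}\t{tag}\t{explanation}")
```

With a loaded pipeline, the same helpers should accept a parsed document directly, since iterating over it yields tokens with these attributes.
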
These are the possible values for token.dep:
The morphology labels (token.morph) follow the Universal Dependencies (UD) specification for Finnish.
The recognized named entities (token.ent_type) follow the OntoNotes scheme:
ent_type | Explanation |
---|---|
CARDINAL | Numerals that do not fall under another type |
DATE | Absolute or relative dates or periods |
EVENT | Named hurricanes, battles, wars, sports events, etc. |
FAC | Buildings, airports, highways, bridges, etc. |
GPE | Geo-political entity: Countries, cities, states |
LANGUAGE | Any named language |
LAW | Named documents made into laws |
LOC | Non-GPE locations, mountain ranges, bodies of water |
MONEY | Monetary values, including unit |
NORP | Nationalities or religious or political groups |
ORDINAL | Ordinal numbers: ensimmäinen, toinen, etc. |
ORG | Companies, agencies, institutions, etc. |
PERCENT | Percentage (including “%”) |
PERSON | People, including fictional |
PRODUCT | Vehicles, weapons, foods, etc. (Not services) |
QUANTITY | Measurements, as of weight or distance |
TIME | Times smaller than a day |
WORK_OF_ART | Titles of books, songs, etc. |
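
Because the entity type is stored per token, multi-word entities have to be reassembled from consecutive tokens. The sketch below does this for any objects exposing `text`, `ent_type_`, and `ent_iob_` (the string attributes spaCy tokens provide, with `ent_iob_` being `"B"`, `"I"`, or `"O"`). The function name `entity_spans` is hypothetical and the sample tokens are hand-labeled stand-ins, not model output.

```python
from types import SimpleNamespace


def entity_spans(tokens):
    """Group token-like objects into (entity text, label) pairs via IOB tags."""
    spans = []
    current = None  # the [label, words] entry being extended, if any
    for tok in tokens:
        starts_entity = tok.ent_iob_ == "B" or (
            tok.ent_iob_ == "I" and current is None
        )
        if starts_entity:
            current = [tok.ent_type_, [tok.text]]
            spans.append(current)
        elif tok.ent_iob_ == "I":
            current[1].append(tok.text)
        else:
            # "O" or an empty tag: the token is outside any entity.
            current = None
    return [(" ".join(words), label) for label, words in spans]


# Hand-labeled stand-ins for spaCy tokens (illustrative only).
sample = [
    SimpleNamespace(text="Sanna", ent_type_="PERSON", ent_iob_="B"),
    SimpleNamespace(text="Marin", ent_type_="PERSON", ent_iob_="I"),
    SimpleNamespace(text="vieraili", ent_type_="", ent_iob_="O"),
    SimpleNamespace(text="Tukholmassa", ent_type_="GPE", ent_iob_="B"),
]

print(entity_spans(sample))
```

In practice spaCy already exposes the grouped entities as `doc.ents`; the manual grouping here only illustrates how the per-token labels relate to whole entities.
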