tbarnier/spark-nlp-models

 
 

Spark NLP Models

This repository hosts the releases of pre-trained pipelines and models for the Spark NLP library. For more information, see our releases.

Project's website

See the official Spark NLP page, http://nlp.johnsnowlabs.com/, for user documentation and examples.

Slack community channel

Join Slack

Pretrained Models

Public Models

Load these models with the pretrained(name, lang) function.
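As a minimal sketch of how the pretrained(name, lang) call might look from Python: the snippet below loads a model given its annotator class and model name. It assumes the Spark NLP Python package is installed and a Spark session is running (for example via sparknlp.start()). The helper name load_pretrained is hypothetical, and the language code "en" for the English tables is an assumption that this README does not state.

```python
def load_pretrained(annotator: str, name: str, lang: str = "en"):
    """Fetch a pretrained model by annotator class name and model name.

    annotator: class name from the "Model" column, e.g. "PerceptronModel".
    name:      value from the "Name" column, e.g. "pos_anc".
    lang:      language code; "en" is assumed here for the English tables.
    """
    # Imported lazily so this helper can be defined without Spark NLP installed.
    import sparknlp.annotator as annotators

    model_class = getattr(annotators, annotator)
    return model_class.pretrained(name, lang)


# Example calls (these need a running Spark session and network access):
#   pos = load_pretrained("PerceptronModel", "pos_anc")
#   ner = load_pretrained("NerDLModel", "ner_dl")
```

Calling the class directly, as in PerceptronModel.pretrained("pos_anc", "en"), does the same thing; the helper only makes the mapping from table columns to arguments explicit.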

English - Models

Model Name Build Description Notes Offline
LemmatizerModel (Lemmatizer) lemma_antbnc 2.0.2-2019.04.28 Download
PerceptronModel (POS) pos_anc 2.0.2-2019.04.30 Download
NerCrfModel (NER with GloVe) ner_crf 2.0.2-2019.04.30 Download
NerDLModel (NER with GloVe) ner_dl 2.0.2-2019.05.25 Download
NerDLModel (NER with GloVe) ner_dl_contrib 2.0.2-2019.04.29 Download
NerDLModel (NER with BERT) ner_dl_bert_base_cased 2.2.2-2019.09.07 Download
NerDLModel (OntoNotes with GloVe 100d) onto_100 2.1.0-2019.07.27 Download
NerDLModel (OntoNotes with GloVe 300d) onto_300 2.1.0-2019.07.27 Download
WordEmbeddings (GloVe) glove_100d 2.0.2-2019.04.29 Download
BertEmbeddings (base_uncased) bert_base_uncased 2.2.0-2019.08.24 Download
BertEmbeddings (base_cased) bert_base_cased 2.2.0-2019.08.24 Download
BertEmbeddings (large_uncased) bert_large_uncased 2.2.0-2019.08.24 Download
BertEmbeddings (large_cased) bert_large_cased 2.2.0-2019.08.24 Download
DeepSentenceDetector ner_dl_sentence 2.0.2-2019.04.30 Download
SymmetricDeleteModel (Spell Checker) spellcheck_sd 2.0.2-2019.04.30 Download
NorvigSweetingModel (Spell Checker) spellcheck_norvig 2.0.2-2019.04.30 Download
ViveknSentimentModel (Sentiment) sentiment_vivekn 2.0.2-2019.04.30 Download
DependencyParser (Dependency) dependency_conllu 2.0.8-2019.06.25 Download
TypedDependencyParser (Dependency) dependency_typed_conllu 2.0.8-2019.06.25 Download

French - Models

Model Name Build Notes Description Offline
LemmatizerModel (Lemmatizer) lemma 2.0.2-2019.04.29 Download
PerceptronModel (POS UD) pos_ud_gsd 2.0.2-2019.04.29 Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.1.0-2019.07.13 Download

Feature Description
Lemma Trained with the Lemmatizer annotator on the lemmatization-lists by Michal Měchura
POS Trained with the PerceptronApproach annotator on Universal Dependencies data
NER Trained with the NerDLApproach annotator (Char CNNs - BiLSTM - CRF with GloVe embeddings) on the WikiNER corpus; identifies PER, LOC, ORG and MISC entities

German - Models

Model Name Build Notes Description Offline
LemmatizerModel (Lemmatizer) lemma 2.0.8-2019.06.23 Download
PerceptronModel (POS UD) pos_ud_hdt 2.0.8-2019.06.22 Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.1.0-2019.07.13 Download

Feature Description
Lemma Trained with the Lemmatizer annotator on the lemmatization-lists by Michal Měchura
POS Trained with the PerceptronApproach annotator on Universal Dependencies data
NER Trained with the NerDLApproach annotator (Char CNNs - BiLSTM - CRF with GloVe embeddings) on the WikiNER corpus; identifies PER, LOC, ORG and MISC entities

Italian - Models

Model Name Build Notes Description Offline
LemmatizerModel (Lemmatizer) lemma_dxc 2.0.2-2019.04.29 Download
SentimentDetector (Sentiment) sentiment_dxc 2.0.2-2019.04.29 Download
PerceptronModel (POS UD) pos_ud_isdt 2.0.8-2019.06.10 Download
NerDLModel (glove_840B_300) wikiner_840B_300 2.1.0-2019.07.14 Download

Feature Description
Lemma Trained with the Lemmatizer annotator on the DXC Technology dataset
POS Trained with the PerceptronApproach annotator on Universal Dependencies data
NER Trained with the NerDLApproach annotator (Char CNNs - BiLSTM - CRF with GloVe embeddings) on the WikiNER corpus; identifies PER, LOC, ORG and MISC entities

Multi-language

Model Name Build Notes Description Offline
WordEmbeddings (GloVe) glove_840B_300 2.0.2-2019.05.23 Download
WordEmbeddings (GloVe) glove_6B_300 2.0.2-2019.05.28 Download
BertEmbeddings (multi_cased) bert_multi_cased 2.2.0-2019.08.24 Download

Pretrained Pipelines

English - Pipelines

NOTE: Pipelines whose names end in _noncontrib are compatible with Windows operating systems.

Pipeline Name Build Notes Description Offline
Explain Document ML explain_document_ml 2.1.0-2019.07.15 Download
Explain Document DL explain_document_dl 2.1.0-2019.08.02 Download
Explain Document DL Win explain_document_dl_noncontrib 2.1.0-2019.08.02 Download
Explain Document DL Fast explain_document_dl_fast 2.1.0-2019.07.12 Download
Explain Document DL Fast Win explain_document_dl_fast_noncontrib 2.1.0-2019.07.12 Download
Recognize Entities DL recognize_entities_dl 2.1.0-2019.07.12 Download
Recognize Entities DL Win recognize_entities_dl_noncontrib 2.1.0-2019.07.12 Download
OntoNotes Entities Small onto_recognize_entities_sm 2.1.0-2019.07.28 Download
OntoNotes Entities Large onto_recognize_entities_lg 2.1.0-2019.07.28 Download
Match Datetime match_datetime 2.1.0-2019.07.12 Download
Match Pattern match_pattern 2.1.0-2019.07.12 Download
Match Chunk match_chunks 2.2.0-2019.09.10 Download
Match Phrases match_phrases 2.1.0-2019.07.12 Download
Clean Stop clean_stop 2.1.0-2019.07.12 Download
Clean Pattern clean_pattern 2.1.0-2019.07.12 Download
Clean Slang clean_slang 2.1.0-2019.07.12 Download
Check Spelling check_spelling 2.1.0-2019.07.12 Download
Analyze Sentiment analyze_sentiment 2.1.0-2019.07.15 Download
Dependency Parse dependency_parse 2.1.0-2019.07.15 Download
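The pipelines in the table above can be loaded by name. The following is a hedged sketch, not an official recipe: it assumes the Spark NLP Python package and its PretrainedPipeline class, and it assumes "en" is the language code for English. The helper names pipeline_for_platform and annotate are hypothetical. The platform helper picks the _noncontrib variant on Windows for the pipelines that list one, and it assumes pipelines without such a variant need no substitution.

```python
import platform

# Pipelines that the English table lists with a Windows ("_noncontrib") variant.
HAS_NONCONTRIB = frozenset({
    "explain_document_dl",
    "explain_document_dl_fast",
    "recognize_entities_dl",
})


def pipeline_for_platform(name: str, system: str = "") -> str:
    """Return the "_noncontrib" variant on Windows when the table lists one."""
    system = system or platform.system()
    if system == "Windows" and name in HAS_NONCONTRIB:
        return name + "_noncontrib"
    return name


def annotate(text: str, name: str = "explain_document_dl", lang: str = "en") -> dict:
    """Run a pretrained pipeline on one string and return its annotations."""
    # Imported lazily so the helper above works without Spark NLP installed.
    import sparknlp
    from sparknlp.pretrained import PretrainedPipeline

    sparknlp.start()  # creates or reuses a local Spark session
    pipeline = PretrainedPipeline(pipeline_for_platform(name), lang=lang)
    return pipeline.annotate(text)
```

For example, annotate("Spark NLP ships pretrained pipelines.") would download explain_document_dl on first use and return a dictionary keyed by annotation type.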

French - Pipelines

Pipeline Name Build Notes Description Offline
Explain Document Large explain_document_lg 2.1.0-2019.07.15 Download
Explain Document Medium explain_document_md 2.1.0-2019.07.15 Download
Entity Recognizer Large entity_recognizer_lg 2.1.0-2019.07.15 Download
Entity Recognizer Medium entity_recognizer_md 2.1.0-2019.07.15 Download

Italian - Pipelines

Pipeline Name Build Notes Description Offline
Explain Document Large explain_document_lg 2.1.0-2019.07.15 Download
Explain Document Medium explain_document_md 2.1.0-2019.07.15 Download
Entity Recognizer Large entity_recognizer_lg 2.1.0-2019.07.15 Download
Entity Recognizer Medium entity_recognizer_md 2.1.0-2019.07.15 Download

Licensed Enterprise

Load these models with the pretrained(name, lang) function.

Pretrained Models - Spark NLP For Healthcare

English

These models require a third argument, the location, in the pretrained(name, lang, loc) call. Use the value from the location column for each model.

Model Name Build Notes Description location
NerDLModel ner_clinical 2.0.2-2019.04.30 clinical/models
NerDLModel ner_clinical_noncontrib 2.3.0-2019.11.14 clinical/models
NerDLModel ner_bionlp 2.3.4-2019.11.27 link clinical/models
NerDLModel ner_bionlp_noncontrib 2.3.4-2019.11.27 link clinical/models
NerDLModel deidentify_dl 2.0.2-2019.06.04 clinical/models
AssertionDLModel assertion_dl 2.3.4-2019.11.27 clinical/models
AssertionLogRegModel assertion_ml 2.3.4-2019.11.27 clinical/models
DeIdentificationModel deidentify_rb 2.0.2-2019.06.04 clinical/models
WordEmbeddingsModel embeddings_clinical 2.0.2-2019.05.21 clinical/models
WordEmbeddingsModel embeddings_icdoem 2.3.2-2019.11.12 link clinical/models
BertEmbeddingsModel biobert_pubmed_cased 2.3.1-2019.11.23 link clinical/models
BertEmbeddingsModel biobert_pmc_cased 2.3.1-2019.11.23 link clinical/models
BertEmbeddingsModel biobert_pubmed_pmc_cased 2.3.1-2019.11.23 link clinical/models
BertEmbeddingsModel biobert_clinical_cased 2.3.1-2019.11.23 link clinical/models
BertEmbeddingsModel biobert_discharge_cased 2.3.1-2019.11.23 link clinical/models
PerceptronModel pos_clinical 2.0.2-2019.04.30 clinical/models
EntityResolverModel resolve_icd10 2.0.2-2019.06.05 clinical/models
EntityResolverModel resolve_icd10cm_cl_em 2.0.8-2019.06.28 clinical/models
EntityResolverModel resolve_icd10pcs_cl_em 2.0.8-2019.06.28 clinical/models
EntityResolverModel resolve_cpt_cl_em 2.0.8-2019.06.28 clinical/models
EntityResolverModel resolve_icd10cm_icdem 2.2.0-2019.10.03 link clinical/models
EntityResolverModel resolve_icd10cm_icdoem 2.3.2-2019.11.13 link clinical/models
EntityResolverModel resolve_cpt_icdoem 2.3.2-2019.11.13 link clinical/models
EntityResolverModel resolve_icdo_icdoem 2.3.2-2019.11.14 clinical/models
ContextSpellCheckerModel spellcheck_dl 2.2.2-2019.11.12 clinical/models
TextMatcherModel textmatch_icdo_ner_n2c4 2.3.3-2019.11.22 link clinical/models
TextMatcherModel textmatch_cpt_token_n2c1 2.3.3-2019.11.25 link clinical/models
Disambiguator people_disambiguator 2.3.4-2019.11.27 clinical/models
ChunkEntityResolverModel chunkresolve_icdo_icdoem 2.3.3-2019.11.25 clinical/models
ChunkEntityResolverModel chunkresolve_cpt_icdoem 2.3.3-2019.11.25 clinical/models
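To illustrate the three-argument form, here is a minimal sketch that passes the location column value as the third argument. It assumes the Spark NLP Python package, a running Spark session, and an environment licensed for these models. The helper name load_clinical_ner is hypothetical, and the language code "en" is an assumption.

```python
# Value of the "location" column in the table above.
CLINICAL_LOC = "clinical/models"


def load_clinical_ner(name: str = "ner_clinical", lang: str = "en"):
    """Load a licensed NerDLModel, passing the location as the third argument."""
    # Imported lazily so this helper can be defined without Spark NLP installed.
    from sparknlp.annotator import NerDLModel

    return NerDLModel.pretrained(name, lang, CLINICAL_LOC)


# Example call (needs a licensed setup and a running Spark session):
#   ner = load_clinical_ner("ner_bionlp")
```

The other rows follow the same pattern with their own annotator class, for example PerceptronModel for pos_clinical.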

Contact

[email protected]

John Snow Labs

http://johnsnowlabs.com
