Commit

Tokenizer pipe() method for duck-type compatibility with Spacy Tokenizer (#8)

Tokenizer pipe() method for duck-type compatibility with Spacy Tokenizer
ines authored Apr 26, 2019
2 parents fc1c1df + bc549b3 commit b972009
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions spacy_stanfordnlp/language.py
@@ -178,6 +178,15 @@ def __call__(self, text):
        doc.is_parsed = True
        return doc

    def pipe(self, texts):
        """Tokenize a stream of texts.
        texts: A sequence of unicode texts.
        YIELDS (Doc): A sequence of Doc objects, in order.
        """
        for text in texts:
            yield self(text)

    def get_tokens_with_heads(self, snlp_doc):
        """Flatten the tokens in the StanfordNLP Doc and extract the token indices
        of the sentence start tokens to set is_sent_start.
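
The added pipe() method delegates to __call__ once per text and yields the results in order, so it is lazy: nothing is tokenized until the caller iterates. A minimal, self-contained sketch of that pattern follows. WhitespaceTokenizer is a hypothetical stand-in (not part of spacy_stanfordnlp) that returns plain lists of strings instead of spaCy Doc objects, so the example runs without spaCy or StanfordNLP installed.

```python
class WhitespaceTokenizer:
    """Hypothetical stand-in for the tokenizer changed in this commit."""

    def __init__(self):
        self.calls = 0  # counts how many texts have actually been tokenized

    def __call__(self, text):
        self.calls += 1
        # Assumption: a list of strings stands in for a spaCy Doc.
        return text.split()

    def pipe(self, texts):
        """Same shape as the method in the diff: yield self(text) in order."""
        for text in texts:
            yield self(text)


tokenizer = WhitespaceTokenizer()
stream = tokenizer.pipe(["Hello world", "Duck typing works"])
calls_before_iteration = tokenizer.calls  # generator body has not run yet
docs = list(stream)  # consuming the generator tokenizes each text in order
```

Code written against an object that exposes pipe(texts), as spaCy's own Tokenizer does, can accept this tokenizer without a special case, which appears to be the sense of "duck-type compatibility" in the commit title.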
