Fix empty vocabulary issue.
Summary:
Training with small files and large -minCount values can lead to empty vocabulary.
In that case, fastText hangs indefinitely.
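The failure mode can be reproduced in isolation with a minimal sketch. This is not fastText's actual `Dictionary` code: `buildVocabulary` and `vocabularyIsUsable` are hypothetical names, and the sketch assumes plain whitespace tokenization. It only shows how a frequency threshold larger than every word count leaves nothing to train on, which is the condition the new check reports.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of fastText): count whitespace-separated
// tokens and keep only those that occur at least minCount times.
std::map<std::string, int64_t> buildVocabulary(const std::string& text,
                                               int64_t minCount) {
  assert(minCount >= 0);  // a negative threshold would be meaningless
  std::map<std::string, int64_t> counts;
  std::istringstream in(text);
  std::string word;
  while (in >> word) {
    ++counts[word];
  }
  // Drop every word whose frequency is below the threshold.
  for (auto it = counts.begin(); it != counts.end();) {
    if (it->second < minCount) {
      it = counts.erase(it);
    } else {
      ++it;
    }
  }
  return counts;
}

// Hypothetical counterpart of the guard in this commit: it returns a flag
// instead of printing a message and calling exit(EXIT_FAILURE).
bool vocabularyIsUsable(const std::map<std::string, int64_t>& vocab) {
  return !vocab.empty();
}
```

With the input "a b a c a b", a threshold of 2 keeps `a` and `b`, while a threshold of 5 removes every word, so a caller should stop before training starts.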

Reviewed By: piotr-bojanowski

Differential Revision: D3810851

fbshipit-source-id: 3be1a4da943c07845d00377f9d7a3d6b93f3ddd4
Edouard Grave authored and Facebook Github Bot committed Sep 2, 2016
1 parent 763901c commit 602355a
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions src/dictionary.cc
@@ -185,6 +185,10 @@ void Dictionary::readFromFile(std::istream& in) {
   initNgrams();
   std::cout << "Number of words: " << nwords_ << std::endl;
   std::cout << "Number of labels: " << nlabels_ << std::endl;
+  if (size_ == 0) {
+    std::cerr << "Empty vocabulary. Try a smaller -minCount value." << std::endl;
+    exit(EXIT_FAILURE);
+  }
 }
 
 void Dictionary::threshold(int64_t t) {