fastText – a library for efficient text classification and representation learning
Download pre-trained fastText word vectors
Pre-trained word vectors learned on different sources can be downloaded below:
- wiki-news-300d-1M.vec.zip: 1 million word vectors trained on Wikipedia 2017, UMBC webbase corpus and statmt.org news dataset (16B tokens).
- wiki-news-300d-1M-subword.vec.zip: 1 million word vectors trained with subword information on Wikipedia 2017, UMBC webbase corpus and statmt.org news dataset (16B tokens).
- crawl-300d-2M.vec.zip: 2 million word vectors trained on Common Crawl (600B tokens).
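Each `.vec` file above is a plain-text file: the first line gives the vocabulary size and dimensionality, and every following line holds a word and its vector components separated by spaces. A minimal sketch of a loader for this format (using only the standard library; the tiny in-memory sample below is an illustration, not data from the real files):

```python
import io

def load_vectors(fin, limit=None):
    """Parse a fastText .vec stream.

    First line: "<vocab_size> <dim>"; each following line:
    "<word> <v1> ... <vdim>". `limit` caps how many rows to read,
    useful for the 1M-/2M-word files above.
    """
    n, d = map(int, fin.readline().split())
    data = {}
    for i, line in enumerate(fin):
        if limit is not None and i >= limit:
            break
        tokens = line.rstrip().split(" ")
        data[tokens[0]] = [float(t) for t in tokens[1:]]
    return data

# Toy two-word, three-dimensional sample standing in for
# a file such as wiki-news-300d-1M.vec (hypothetical values).
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n")
vecs = load_vectors(sample)
print(len(vecs))      # 2
print(vecs["hello"])  # [0.1, 0.2, 0.3]
```

For the full-size files, open them with `io.open(path, "r", encoding="utf-8", newline="\n", errors="ignore")` and pass a `limit` to avoid loading all vectors into memory at once.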
These models achieve state-of-the-art performance on several benchmarks (up to 88% accuracy on the popular word analogy dataset). https://www.facebook.com/groups/1174547215919768/permalink/1631075090266976/
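The word analogy benchmark mentioned above scores a model on questions of the form "a is to b as c is to ?", answered by vector arithmetic: take v_b - v_a + v_c and return the nearest word by cosine similarity. A self-contained sketch of that evaluation step, using hypothetical toy vectors rather than the real pre-trained ones:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def analogy(vecs, a, b, c):
    """Solve a : b :: c : ? by maximizing cosine(v_b - v_a + v_c, v_w),
    excluding the three query words themselves."""
    target = [vb - va + vc for va, vb, vc in zip(vecs[a], vecs[b], vecs[c])]
    best, best_sim = None, -2.0
    for w, v in vecs.items():
        if w in (a, b, c):
            continue
        sim = cosine(target, v)
        if sim > best_sim:
            best, best_sim = w, sim
    return best

# Toy 3-d vectors chosen so the analogy works out (illustrative only).
vecs = {
    "king":  [0.9, 0.8, 0.1],
    "man":   [0.8, 0.1, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "queen": [0.2, 0.8, 0.9],
}
print(analogy(vecs, "man", "king", "woman"))  # queen
```

With the real 300-dimensional vectors, the same nearest-neighbor search just runs over the full vocabulary loaded from one of the files above.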