Difference between FastText .vec and .bin files
Question
I recently downloaded the FastText pretrained model for English. I got two files:
- wiki.en.vec
- wiki.en.bin
I am not sure what the difference between the two files is.
Recommended answer
The .vec files contain only the aggregated word vectors, in plain text. The .bin files additionally contain the model parameters and, crucially, the vectors for all the n-grams.
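To make the plain-text layout concrete, here is a minimal sketch of a .vec reader. It assumes the usual layout of a header line holding the word count and the dimension, followed by one line per word with space-separated floats. The function name read_vec and the tiny in-memory sample are made up for illustration; a real file such as wiki.en.vec would be opened with open(path, encoding="utf-8") instead.

```python
import io


def read_vec(handle):
    """Parse .vec-style text: a 'count dim' header, then 'word v1 v2 ...' lines."""
    header = handle.readline().split()
    count, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in handle:
        # Lines may end with a trailing space before the newline.
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            # Skip lines that do not match the declared dimension.
            continue
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return count, dim, vectors


# Hypothetical two-word sample standing in for a real .vec file.
sample = io.StringIO("2 3\nthe 0.1 0.2 0.3\ncat -0.4 0.5 0.6\n")
count, dim, vectors = read_vec(sample)
print(count, dim, vectors["cat"])
```

A word that is not a key in the resulting dictionary simply has no vector, which is the limitation of using the .vec file on its own.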
So if you want to use those n-grams (FastText's famous "subword information") to encode words that were absent from the training data, you need an API that can handle FastText .bin files (most only support the .vec files, however...).
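The subword idea can be sketched in a few lines. This is a simplified illustration, not FastText's actual implementation: it assumes the word is wrapped in "<" and ">" boundary markers and split into character n-grams of length 3 to 6, and it looks the n-grams up in a plain dictionary, whereas the real .bin model hashes n-grams into a fixed number of buckets. The helper names char_ngrams and oov_vector and the toy vectors are hypothetical.

```python
def char_ngrams(word, minn=3, maxn=6):
    """Return character n-grams of '<word>' for lengths minn..maxn."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(minn, maxn + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def oov_vector(word, ngram_vectors, dim):
    """Average the vectors of the word's n-grams that appear in ngram_vectors."""
    total = [0.0] * dim
    hits = 0
    for gram in char_ngrams(word):
        vec = ngram_vectors.get(gram)
        if vec is None:
            continue
        hits += 1
        total = [t + v for t, v in zip(total, vec)]
    if hits == 0:
        return None  # no known n-gram, so no vector can be built
    return [t / hits for t in total]


# Toy n-gram table (assumption): a real .bin file stores these inside the model.
toy_ngrams = {"<wh": [1.0, 0.0], "re>": [0.0, 1.0]}
print(char_ngrams("where", 3, 3))
print(oov_vector("where", toy_ngrams, 2))
```

In practice the official fasttext Python bindings handle this for you: fasttext.load_model on the .bin file followed by get_word_vector returns a vector even for a word the model never saw.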