Installing nltk data dependencies in setup.py script
Question
I use NLTK with wordnet in my project. I did the installation manually on my PC, with pip:
pip3 install nltk --user
in a terminal, then ran nltk.download() in a Python shell to download wordnet.
I want to automate these steps with a setup.py file, but I don't know a good way to install wordnet.
For the moment, I have this piece of code after the call to setup ("nltk" is in the install_requires list of the call to setup):
import sys
if 'install' in sys.argv:
    import nltk
    nltk.download("wordnet")
Is there a better way?
Recommended answer
I managed to install the NLTK data in setup.py by overriding cmdclass with my own Install class:
from setuptools import setup, find_packages
from setuptools.command.install import install as _install


class Install(_install):
    def run(self):
        # Install the package and its dependencies (including nltk) first.
        _install.do_egg_install(self)
        # Import nltk only now, after the install step above has run.
        import nltk
        nltk.download("popular")


setup(...
    cmdclass={'install': Install},
    ...
    install_requires=[
        'nltk',
    ],
    setup_requires=['nltk']
    ...
)
It is important to call do_egg_install() in your run() method so that nltk is installed before import nltk is executed (see also "python setuptools install_requires is ignored when overriding cmdclass"). Also, don't forget to add nltk to setup_requires.
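If the install command may run more than once, it can be worth checking whether the data is already present before downloading it. Below is a minimal sketch of that idea, not part of the original answer. The helper name ensure_nltk_data and its find/download parameters are hypothetical; by default it relies on nltk.data.find (which raises LookupError for a missing resource) and nltk.download. The lookup path "corpora/wordnet" is an assumption: depending on the nltk version the corpus may be kept zipped, in which case "corpora/wordnet.zip" may be the path to check.

```python
def ensure_nltk_data(resources, find=None, download=None):
    """Download each NLTK resource only if it is not already present.

    `resources` maps a download id (e.g. "wordnet") to the path used to
    look it up (e.g. "corpora/wordnet"; this path is an assumption).
    `find` and `download` are hypothetical hooks that default to
    nltk.data.find and nltk.download, so the logic can be exercised
    without network access. Returns the list of ids that were downloaded.
    """
    if find is None or download is None:
        # Imported lazily so this function can be defined before nltk is installed.
        import nltk
        find = find or nltk.data.find
        download = download or nltk.download

    fetched = []
    for package_id, path in resources.items():
        try:
            find(path)  # raises LookupError when the resource is missing
        except LookupError:
            download(package_id)
            fetched.append(package_id)
    return fetched
```

Inside the custom run() method, the call would be something like ensure_nltk_data({"wordnet": "corpora/wordnet"}), placed after the install step so that nltk is importable by then.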