Resource 'corpora/wordnet' not found on Heroku

Question

I'm trying to get NLTK and WordNet working on Heroku. I've already done the following:

heroku run python
nltk.download()
  wordnet
pip install -r requirements.txt

But I'm getting this error:

Resource 'corpora/wordnet' not found.  Please use the NLTK
  Downloader to obtain the resource:  >>> nltk.download()
  Searched in:
    - '/app/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'

Yet I've looked in /app/nltk_data and it's there, so I'm not sure what's going on.

Recommended answer

I just had this same problem. What ended up working for me was creating an 'nltk_data' directory inside the application's own folder, downloading the corpus to that directory, and adding a line to my code that tells NLTK to look in that directory. You can do all of this locally and then push the changes to Heroku.

So, supposing my Python application is in a directory called "myapp/":

Step 1: Create the directory

cd myapp/
mkdir nltk_data

Step 2: Download the corpus to the new directory

python -m nltk.downloader

This will open the NLTK downloader. Set your download directory to whatever_the_absolute_path_to_myapp_is/nltk_data/. If you're using the GUI downloader, the download directory is set through a text field at the bottom of the window. If you're using the command-line downloader, you set it in the config menu.

Once the downloader is pointing at your newly created nltk_data directory, download your corpus.

Or do it in one step from Python code:

import nltk
nltk.download("wordnet", download_dir="whatever_the_absolute_path_to_myapp_is/nltk_data/")
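
If you'd rather not type the absolute path by hand, the one-step download can be wrapped in a small script. This is only a sketch: `ensure_data_dir` and `download_corpora` are names I made up for illustration, and it assumes NLTK is installed locally and that you have network access when you run it.

```python
import os


def ensure_data_dir(app_root):
    """Create <app_root>/nltk_data if it is missing and return its absolute path."""
    data_dir = os.path.join(os.path.abspath(app_root), "nltk_data")
    os.makedirs(data_dir, exist_ok=True)
    return data_dir


def download_corpora(app_root, packages=("wordnet",)):
    """Download each named NLTK package into <app_root>/nltk_data.

    Hypothetical helper; nltk is imported here so that ensure_data_dir
    can still be used on a machine without NLTK installed.
    """
    import nltk

    data_dir = ensure_data_dir(app_root)
    for name in packages:
        nltk.download(name, download_dir=data_dir)
    return data_dir
```

Running `download_corpora("myapp")` from the directory that contains myapp/ should leave the corpus under myapp/nltk_data/, ready to be committed.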

Step 3: Let NLTK know where to look

NLTK looks for data, resources, etc. in the locations listed in the nltk.data.path variable. All you need to do is add nltk.data.path.append('./nltk_data/') to the Python file that actually uses NLTK, and it will look for corpora, tokenizers, and such in there in addition to the default paths.
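
A relative path like './nltk_data/' is resolved against the process's working directory, which may not be the folder your Python file lives in. A slightly more defensive variant builds the path from the file's own location. This is a sketch, not the only way to do it; the three function names are hypothetical.

```python
import os


def local_nltk_data_dir(module_file):
    """Return the absolute path of an nltk_data/ folder next to module_file."""
    here = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(here, "nltk_data")


def register_data_dir(search_path, data_dir):
    """Append data_dir to a search-path list unless it is already listed."""
    if data_dir not in search_path:
        search_path.append(data_dir)
    return search_path


def configure_nltk(module_file):
    """Point NLTK at the nltk_data/ folder beside module_file.

    Hypothetical helper; assumes NLTK is installed where the app runs.
    """
    import nltk

    return register_data_dir(nltk.data.path, local_nltk_data_dir(module_file))
```

In the file that uses NLTK you would call `configure_nltk(__file__)` once, before loading any corpus. If nltk_data/ sits somewhere other than beside that file, adjust the path accordingly.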

Step 4: Send it to Heroku

git add nltk_data/
git commit -m 'super useful commit message'
git push heroku master

That should work! It did for me, anyway. One thing worth noting is that the path from the Python file that uses NLTK to the nltk_data directory may differ depending on how you've structured your application, so account for that when you call nltk.data.path.append('path_to_nltk_data').
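
If the error persists after deploying, it can help to check which of the search directories actually contain the corpus. The sketch below is a simplified check of my own, not NLTK's lookup logic: `locate_resource` is a hypothetical name, and it assumes a resource is present if it exists either unpacked or as a .zip archive under the search directory.

```python
import os


def locate_resource(search_path, resource="corpora/wordnet"):
    """Return the directories in search_path that appear to hold the resource.

    Simplified assumption: the resource is either an unpacked directory or
    file at <base>/<resource>, or an archive at <base>/<resource>.zip.
    """
    hits = []
    for base in search_path:
        candidate = os.path.join(base, *resource.split("/"))
        if os.path.exists(candidate) or os.path.exists(candidate + ".zip"):
            hits.append(base)
    return hits
```

With NLTK installed, `locate_resource(nltk.data.path)` run inside the app should list your nltk_data directory. An empty list suggests the path you appended doesn't point where you think it does.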
