TreeTagger installation successful but cannot open .par file
Problem description
Does anyone know how to resolve this file reading error in TreeTagger, a common Natural Language Processing tool used to POS-tag, lemmatize and chunk sentences?
alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english
reading parameters ...
ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
aborted.
I didn't encounter any of the possible installation problems hinted at in http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/installation-hints.txt. I followed the instructions on the webpage (http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/#Linux) and it installed properly:
alvas@ikoma:~$ mkdir treetagger
alvas@ikoma:~$ cd treetagger
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tree-tagger-linux-3.2.tar.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/tagger-scripts.tar.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/install-tagger.sh
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/dutch-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/german-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/italian-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/spanish-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ wget ftp://ftp.ims.uni-stuttgart.de/pub/corpora/french-par-linux-3.2-utf8.bin.gz
alvas@ikoma:~/treetagger$ sh install-tagger.sh
Linux version of TreeTagger installed.
Tagging scripts installed.
German parameter file (Linux, UTF8) installed.
German chunker parameter file (Linux) installed.
French parameter file (Linux, UTF8) installed.
French chunker parameter file (Linux, UTF8) installed.
Italian parameter file (Linux, UTF8) installed.
Spanish parameter file (Linux, UTF8) installed.
Dutch parameter file (Linux, UTF8) installed.
Path variables modified in tagging scripts.
You might want to add /home/alvas/treetagger/cmd and /home/alvas/treetagger/bin to the PATH variable so that you do not need to specify the full path to run the tagging scripts.
But when I tried to test the software, I got the following errors:
alvas@ikoma:~/treetagger$ echo 'Hello world!' | cmd/tree-tagger-english
reading parameters ...
ERROR: Can't open for reading: /home/alvas/treetagger/lib/english.par
aborted.
alvas@ikoma:~/treetagger$ echo 'Das ist ein Test.' | cmd/tagger-chunker-german
ERROR: Can't open for reading: /home/alvas/treetagger/lib/german-chunker.par
aborted.
ERROR: Can't open for reading: /home/alvas/treetagger/lib/german.par
aborted.
reading parameters ...
ERROR: Can't open for reading: /home/alvas/treetagger/lib/german.par
aborted.
Recommended answer
I think there are two problems. First, the scripts should have "-utf8" in their name, e.g. cmd/tagger-chunker-german-utf8, because you downloaded the UTF-8 data. Second, tagging and chunking each require their own data file. The homepage has the sections "Parameter files for PC" and "Chunker parameter files for PC": download the files you need from both sections, then re-execute install-tagger.sh.
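To see which parameter files a script is actually missing before downloading anything else, a small check along these lines may help. This is only a sketch: check_par_files is a hypothetical helper, not part of TreeTagger, and the file names in the example call are simply the ones that appear in the error messages above. It reports every named file that is absent or unreadable in the given lib directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of TreeTagger: print one line for each named
# parameter file that is missing or unreadable in the given lib directory.
check_par_files() {
    lib_dir=$1
    shift
    for name in "$@"; do
        if [ ! -r "$lib_dir/$name" ]; then
            printf 'missing: %s\n' "$lib_dir/$name"
        fi
    done
}

# Example call (path and file names taken from the error messages above):
# check_par_files "$HOME/treetagger/lib" english.par german.par german-chunker.par
```

If the function prints nothing, all the named files are readable. Running ls on the lib directory shows which parameter files install-tagger.sh actually created, and that can be compared with the names the non-UTF-8 scripts ask for. The download list in the question also contains no English parameter file, so english.par would presumably still be reported as missing until one is downloaded and install-tagger.sh is run again.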