UTF-8 issue with CoreNLP server

Views: 102 Published: 2020/7/13 6:36:30 Tags: utf-8 server stanford-nlp

Problem description

I run a Stanford CoreNLP Server with the following command:

java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer

I try to parse the sentence "Who was Darth Vader’s son?". Note that the apostrophe after "Vader" is not an ASCII character.

The online demo parses the sentence successfully, but the server I run on localhost fails.

I also tried to perform the query using Python.

import requests

url = 'http://localhost:9000/'
sentence = 'Who was Darth Vader’s son?'
# Pipeline options go in the query string; the sentence is sent as UTF-8 bytes in the body.
properties = '{"annotators": "tokenize,ssplit,pos,ner", "outputFormat": "json"}'
r = requests.post(url, params={'properties': properties}, data=sentence.encode('utf8'))
tree = r.json()

The last command raises an exception:

ValueError: Invalid control character at: line 1 column 1172 (char 1171)

However, I noticed that the response text (i.e. r.text) contains occurrences of the character \x00. If I remove them, the JSON parsing succeeds:

import json
tree = json.loads(r.text.replace('\x00', ''))
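The same workaround can be wrapped in a small helper so it is easy to reuse and test without a running server. This is only a sketch of the stripping approach described above: the function name parse_json_ignoring_nul and the sample string are made up for illustration and are not actual CoreNLP output.

```python
import json


def parse_json_ignoring_nul(text):
    """Drop NUL (\\x00) characters from the raw text, then parse it as JSON."""
    cleaned = text.replace('\x00', '')
    return json.loads(cleaned)


# Hypothetical response body with a raw NUL inside a JSON string.
raw = '{"word": "Vader\x00s"}'

try:
    json.loads(raw)
except ValueError as exc:
    # The default (strict) parser rejects raw control characters in strings.
    print('strict parse failed:', exc)

print(parse_json_ignoring_nul(raw))
```

Alternatively, json.loads(text, strict=False) accepts control characters inside strings, but it keeps the \x00 characters in the parsed values rather than removing them. Either way this only hides the symptom; it does not explain why the server emits those characters.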

Finally, r.encoding is ISO-8859-1, even though I did not run the server with the -strict option. Note that manually setting r.encoding to UTF-8 changes nothing.
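For reference, here is a standard-library-only sketch of what the choice of encoding does to this particular sentence. It assumes nothing about the server: it just encodes the sentence as UTF-8 and decodes the bytes both ways. As far as I understand, requests falls back to ISO-8859-1 when a text response declares no charset, which may be why r.encoding shows that value, but that is an assumption about this setup and it would not by itself account for the \x00 characters.

```python
sentence = 'Who was Darth Vader’s son?'

# The apostrophe is U+2019, which takes three bytes in UTF-8.
utf8_bytes = sentence.encode('utf-8')

# Decoding with the right codec round-trips the sentence.
as_utf8 = utf8_bytes.decode('utf-8')

# Decoding as ISO-8859-1 maps each byte to one character, so the
# three-byte apostrophe turns into three unrelated characters.
as_latin1 = utf8_bytes.decode('iso-8859-1')

print(as_utf8 == sentence)
print(repr(as_latin1))
print(len(sentence), len(utf8_bytes), len(as_latin1))
```

Comparing repr(r.text) against output like this can help tell an ordinary mis-decoding on the client side apart from a response body that is already corrupted on the server side.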

If I run the same code with url = 'http://localhost:9000/' replaced by url = 'http://corenlp.run/', everything succeeds: the call r.json() returns a dict, r.encoding is UTF-8, and the text contains no \x00 characters.

What is wrong with the CoreNLP server I run?
