Simple Flask app using spaCy NLP hangs intermittently
Question
I'm working on a simple Flask app that will eventually turn into a simple REST API for doing named entity recognition using spaCy on a given text string. I have a simple prototype as follows:
from flask import Flask, render_template, request, json
import spacy
from spacy import displacy


def to_json(doc):
    return [
        {
            'start': ent.start_char,
            'end': ent.end_char,
            'type': ent.label_,
            'text': str(ent),
        } for ent in doc.ents
    ]


nlp = spacy.load('en')
app = Flask(__name__)


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/demo', methods=['GET', 'POST'])
def demo():
    q = request.values.get('text')
    doc = nlp(q)
    if request.values.get('type') == 'html':
        return displacy.render(doc, style='ent', page=True)
    else:
        return app.response_class(
            response=json.dumps(to_json(doc), indent=4),
            status=200,
            mimetype='text/string'
        )


if __name__ == '__main__':
    app.run(host='0.0.0.0')
The Flask app is served using an Apache webserver on Ubuntu. I submit text to the app using a simple web form and it returns results as either HTML or JSON text.
The problem I am having is that the app hangs intermittently...I can't figure out a pattern of what causes it to hang. Nothing shows up in the Apache error log, and the request that hangs does not appear in the Apache access log. If I kill the server while the browser is spinning, the browser reports that the server provided an empty response. If I restart the server, the error log reports that 1 or 2 child processes don't exit after a SIGTERM, and a SIGKILL has to be sent.
One possible clue is that the error log reports the following when the server starts up:
[Wed Dec 06 20:19:33.753041 2017] [wsgi:warn] [pid 1822:tid 140029812619136] mod_wsgi: Compiled for Python/2.7.11.
[Wed Dec 06 20:19:33.753055 2017] [wsgi:warn] [pid 1822:tid 140029812619136] mod_wsgi: Runtime using Python/2.7.12.
Another possible clue is that the "index" route (/) never seems to hang. The "/demo" route, however, can hang in either branch of the if statement that tests request.values.get('type') == 'html'.
I've taken Apache and mod_wsgi out of the loop, and am now running the app using the standalone Flask server. The app still hangs occasionally. When it does, I can press Ctrl-C, and it consistently prints the following traceback, ending at the same call each time:
Exception happened during processing of request from ('xxx.xxx.xxx.xxx', 55608)
Traceback (most recent call last):
  File "/usr/lib/python2.7/SocketServer.py", line 290, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 318, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 331, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.7/SocketServer.py", line 652, in __init__
    self.handle()
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/serving.py", line 232, in handle
    rv = BaseHTTPRequestHandler.handle(self)
  File "/usr/lib/python2.7/BaseHTTPServer.py", line 340, in handle
    self.handle_one_request()
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/serving.py", line 263, in handle_one_request
    self.raw_requestline = self.rfile.readline()
  File "/usr/lib/python2.7/socket.py", line 451, in readline
    data = self._sock.recv(self._rbufsize)
KeyboardInterrupt
----------------------------------------
After pressing control-c, Flask gets "released" and then returns the result I expect. The server continues on as normal and will accept more requests until it hangs again. Sometimes a hung request will come back on its own if I wait long enough.
This seems more and more like it's a problem with Flask (or how I'm using it). If anyone can provide advice on how to track down the problem, I would appreciate it!
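One way to see where each thread is sitting while a request hangs is to dump every thread's stack on demand, rather than relying on the single traceback that Ctrl-C produces. The following is only a minimal sketch, not part of the original app: the helper names format_thread_stacks and install_stack_dump_handler are hypothetical, and the choice of SIGUSR1 is an assumption that holds on Unix-like systems such as the Ubuntu host described above. It uses only the standard library (sys._current_frames, traceback, signal).

```python
import signal
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text report with the current stack of every live thread."""
    # Map thread ids to readable names where the threading module knows them.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (thread_id, names.get(thread_id, "unknown")))
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frame))
    return "\n".join(lines)


def install_stack_dump_handler(signum):
    """Write all thread stacks to stderr whenever `signum` is delivered.

    Hypothetical helper: call it once at startup from the main thread,
    for example with signal.SIGUSR1 on Unix.
    """
    def handler(received_signum, frame):
        sys.stderr.write(format_thread_stacks() + "\n")

    signal.signal(signum, handler)
```

With the handler installed at startup, sending the signal to the server process (for example, kill -USR1 followed by the process id) should print the stacks without stopping the server. Python runs signal handlers in the main thread, so if the main thread is blocked inside a C extension the report may be delayed until that call returns.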
Recommended answer
This appears to be a known issue in spaCy v2.0. The issue went away after I downgraded to spaCy v1.9.
For more details, see:
https://github.com/explosion/spaCy/issues/1571
and
https://github.com/explosion/spaCy/issues/1572
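Before downgrading, it may help to confirm which spaCy release is actually installed in the environment that serves the app. The sketch below is an illustration only: the function name is_spacy_2_0 is hypothetical, and treating every 2.0.x release as affected is an assumption based on the answer's wording, not on the linked issues.

```python
def is_spacy_2_0(version):
    """Return True if a version string such as '2.0.3' belongs to the 2.0 series."""
    parts = version.split(".")
    return len(parts) >= 2 and parts[0] == "2" and parts[1] == "0"


if __name__ == "__main__":
    try:
        import spacy
    except ImportError:
        print("spaCy is not installed in this environment")
    else:
        # spacy.__version__ holds the installed release as a string.
        if is_spacy_2_0(spacy.__version__):
            print("spaCy %s is in the 2.0 series the answer refers to" % spacy.__version__)
        else:
            print("spaCy %s is outside the 2.0 series" % spacy.__version__)
```

If the check reports a 2.0 release, the downgrade described in the answer would typically be done with pip and a version constraint such as "spacy<2".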