Slow performance of POS tagging. Can I do some kind of pre-warming?

Views: 112 Published: 2020/5/18 1:12:07 Tags: python nltk


Question

I am using NLTK to POS-tag hundreds of tweets in a web request. As you know, Django instantiates a request handler for each request.

I noticed this: within one request (~200 tweets), the first tweet takes ~18 seconds to tag, while each subsequent tweet takes ~120 milliseconds. What can I do to speed up the process?

Can I do a "pre-warming request" so that the module data is already loaded for each request?

import nltk

class MyRequestHandler(BaseHandler):
    def read(self, request):  # this runs for a GET request
        # ... in a loop over the tweets:
            tokens = nltk.word_tokenize(tweet)
            tagged = nltk.pos_tag(tokens)
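
One way to confirm the pattern described in the question (a slow first call followed by fast ones) is to time each call separately. The sketch below is only an illustration: `time_each_call` and `LazyFakeTagger` are hypothetical names, and the fake tagger simulates a one-time load with a short sleep instead of calling NLTK, so the numbers it prints say nothing about the real tagger.

```python
import time


def time_each_call(func, inputs):
    """Call func on every input and return a list of (input, seconds) pairs."""
    timings = []
    for item in inputs:
        start = time.perf_counter()
        func(item)
        timings.append((item, time.perf_counter() - start))
    return timings


class LazyFakeTagger:
    """Hypothetical stand-in for a tagger that loads its model on first use."""

    def __init__(self):
        self._label = None

    def tag(self, tokens):
        if self._label is None:
            time.sleep(0.2)      # simulated one-time load cost (assumption)
            self._label = "NN"   # dummy tag, not a real prediction
        return [(token, self._label) for token in tokens]


fake = LazyFakeTagger()
tweets = ["first tweet here", "second tweet", "third tweet"]
timings = time_each_call(lambda tweet: fake.tag(tweet.split()), tweets)

for tweet, seconds in timings:
    print(f"{seconds * 1000:8.1f} ms  {tweet}")
```

In the handler from the question, the same helper could wrap the `nltk.pos_tag` call to show whether the cost is concentrated in the first call.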

Recommended answer

Those first 18 seconds are the POS tagger being unpickled from disk into RAM. To get around this, load the tagger yourself, once, outside of the request function:

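import nltk.data, nltk.tag
# Runs once at import time; _POS_TAGGER is a private NLTK constant and may be absent in other releases.
tagger = nltk.data.load(nltk.tag._POS_TAGGER)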

Then replace nltk.pos_tag with tagger.tag. The tradeoff is that app startup will now take about 18 seconds longer.
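
The load-once idea can be sketched without NLTK installed. The example below is a minimal illustration, not the answer's actual code: `WarmResource`, `FakeTagger`, `counting_loader`, and `handle_request` are hypothetical names, and the loader is a stand-in. In the answer's setting, the loader would presumably be the `nltk.data.load(nltk.tag._POS_TAGGER)` call shown above.

```python
import threading


class WarmResource:
    """Load an expensive object once and share it across requests."""

    def __init__(self, loader):
        self._loader = loader          # callable that performs the slow load
        self._lock = threading.Lock()  # prevents two threads from loading at once
        self._value = None
        self._loaded = False

    def get(self):
        if not self._loaded:
            with self._lock:
                if not self._loaded:   # re-check after acquiring the lock
                    self._value = self._loader()
                    self._loaded = True
        return self._value

    def warm(self):
        """Force the load now, for example at application startup."""
        self.get()


class FakeTagger:
    """Hypothetical stand-in exposing the same .tag(tokens) shape."""

    def tag(self, tokens):
        return [(token, "NN") for token in tokens]  # dummy tags


load_calls = []  # records how many times the loader actually ran


def counting_loader():
    load_calls.append(1)
    return FakeTagger()


TAGGER = WarmResource(counting_loader)
TAGGER.warm()  # pay the load cost once, when the module is imported


def handle_request(tweets):
    """Tag every tweet using the already-loaded tagger."""
    tagger = TAGGER.get()
    return [tagger.tag(tweet.split()) for tweet in tweets]
```

Calling `warm()` at import time matches the answer's tradeoff: startup gets slower, but no request pays the load cost. Dropping the `warm()` call would instead defer the cost to the first request while still loading only once per process.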
