Multithreading in NLTK WordNetLemmatizer?
Question
I am trying to use multithreading to speed up processing. I use WordNetLemmatizer to lemmatize words, which are then passed to SentiWordNet to calculate the sentiment of the text. My sentiment analysis function, which uses WordNetLemmatizer, is as follows:
import nltk
from nltk.corpus import sentiwordnet as swn

def SentimentA(doc, file_path):
    sentences = nltk.sent_tokenize(doc)
    # print(sentences)
    stokens = [nltk.word_tokenize(sent) for sent in sentences]
    taggedlist = []
    for stoken in stokens:
        taggedlist.append(nltk.pos_tag(stoken))
    wnl = nltk.WordNetLemmatizer()
    score_list = []
    for idx, taggedsent in enumerate(taggedlist):
        score_list.append([])
        for idx2, t in enumerate(taggedsent):
            newtag = ''
            lemmatized = wnl.lemmatize(t[0])
            if t[1].startswith('NN'):
                newtag = 'n'
            elif t[1].startswith('JJ'):
                newtag = 'a'
            elif t[1].startswith('V'):
                newtag = 'v'
            elif t[1].startswith('R'):
                newtag = 'r'
            else:
                newtag = ''
            if (newtag != ''):
                synsets = list(swn.senti_synsets(lemmatized, newtag))
                score = 0
                if (len(synsets) > 0):
                    for syn in synsets:
                        score += syn.pos_score() - syn.neg_score()
                    score_list[idx].append(score / len(synsets))
    return SentiCal(score_list)
When I run 4 threads, the first 3 threads fail with the following error, while the last thread works perfectly:
AttributeError: 'WordNetCorpusReader' object has no attribute '_LazyCorpusLoader__args'
I have already tried importing the NLTK package locally, as suggested in this NLTK issue, and I have also tried the solution given on this page.
Recommended answer
Quick hack:
import nltk
from nltk.corpus import sentiwordnet as swn
# Do this first, before starting any threads: fetching one item
# forces the LazyCorpusLoader to "materialize" (actually load the corpus).
next(swn.all_senti_synsets())
# Your other code here.
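To show where the hack fits in threaded code, here is a minimal sketch of the "warm up once in the main thread, then start the workers" pattern. It does not use NLTK: `ToyLazyLoader` and `run_with_warmup` are hypothetical names invented for illustration, and the toy loader only imitates a resource that is loaded lazily on first use.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class ToyLazyLoader:
    """Toy stand-in for a lazily loaded resource (NOT NLTK's actual class)."""

    def __init__(self):
        self._data = None
        self.load_count = 0  # how many times the slow load has run

    def _load(self):
        time.sleep(0.05)  # simulate slow reading from disk
        self.load_count += 1
        self._data = {"good": 0.5, "bad": -0.5}

    def score(self, word):
        # Unsynchronized check: unsafe if several threads hit it first.
        if self._data is None:
            self._load()
        return self._data.get(word, 0.0)


def run_with_warmup(warm_up, worker, items, max_workers=4):
    """Call warm_up() once in the main thread, then map worker over items."""
    warm_up()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, items))


resource = ToyLazyLoader()
results = run_with_warmup(
    warm_up=lambda: resource.score("good"),  # plays the role of the next(...) call
    worker=resource.score,
    items=["good", "bad", "unknown", "good"],
)
print(results)
print(resource.load_count)
```

With NLTK, the warm-up step would be the `next(swn.all_senti_synsets())` call above, and the worker would be a function such as `SentimentA`. The point of the design is that the lazy load has already finished before any worker thread touches the corpus.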
More details later... Still typing
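One plausible reading of the error message, offered as an assumption rather than a confirmed diagnosis: the lazy loader replaces its own `__dict__` and `__class__` with those of the loaded corpus reader, so a thread that is still inside the loading code can end up looking for a private, name-mangled loader attribute on an object that has already become a reader. The toy below imitates this sequentially, without real threads. `ToyReader` and `ToyLazyLoader` are hypothetical classes, not NLTK code.

```python
class ToyReader:
    """Toy stand-in for a fully loaded corpus reader."""

    def __init__(self):
        self.words = ["good", "bad"]


class ToyLazyLoader:
    """Toy stand-in for a loader that turns itself into the reader."""

    def __init__(self, name):
        self.__name = name  # stored as _ToyLazyLoader__name (name mangling)

    def load(self):
        name = self.__name  # fails if the swap below has already happened
        reader = ToyReader()
        # Become the reader: replace our attributes and our class.
        self.__dict__ = reader.__dict__
        self.__class__ = reader.__class__
        return name


corpus = ToyLazyLoader("wordnet")
corpus.load()  # "thread A" finishes loading
loaded_type = type(corpus).__name__

message = ""
try:
    # "Thread B" is imagined to be still inside load() for the same object.
    ToyLazyLoader.load(corpus)
except AttributeError as exc:
    message = str(exc)

print(loaded_type)
print(message)
```

The resulting message has the same shape as the `AttributeError` in the question, which is why loading the corpus once before starting the threads avoids it.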