Adding words to nltk stoplist
Question
I have some code that removes stop words from my data set. The default stop list doesn't remove most of the words I would like it to, so I'm looking to add words to the stop list so that they are removed as well. The code I'm using to remove stop words is:
word_list2 = [w.strip() for w in word_list if w.strip() not in nltk.corpus.stopwords.words('english')]
I'm unsure of the correct syntax for adding words and can't seem to find it anywhere. Any help is appreciated. Thanks.
Recommended answer
nltk.corpus.stopwords.words('english') returns an ordinary Python list, so you can simply use the append method to add a word to it:
stopwords = nltk.corpus.stopwords.words('english')
stopwords.append('newWord')
Or use extend to add a whole list of words, as suggested by Charlie in the comments:
stopwords = nltk.corpus.stopwords.words('english')
newStopWords = ['stopWord1','stopWord2']
stopwords.extend(newStopWords)
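Putting the extended list to work in the filter from the question might look like the sketch below. To keep it runnable without downloading the NLTK corpus, base_stopwords is a small hypothetical stand-in for the list returned by nltk.corpus.stopwords.words('english'), and word_list is made-up sample data. Converting the list to a set is an optional change, not part of the original answer; it avoids rebuilding and scanning the stop list for every word.

```python
# Hypothetical stand-in for nltk.corpus.stopwords.words('english'),
# which returns a plain list of strings. Not the real NLTK list.
base_stopwords = ['the', 'a', 'is', 'of']

# Extra words to filter out for this particular data set (placeholders).
new_stop_words = ['stopWord1', 'stopWord2']

# Build the stop set once, outside the comprehension, so the list is not
# reloaded for every word; set membership tests are fast on average.
stop_set = set(base_stopwords)
stop_set.update(new_stop_words)

# Made-up sample input with stray whitespace, as in the question.
word_list = [' the ', 'cat', 'stopWord1', ' is ', 'happy']

# Same filter as in the question, using the extended stop set.
word_list2 = [w.strip() for w in word_list if w.strip() not in stop_set]
print(word_list2)  # ['cat', 'happy']
```

With the real corpus, the first assignment would presumably become base_stopwords = nltk.corpus.stopwords.words('english'); the rest stays the same. Note that the comparison is case-sensitive, so 'The' would not be removed unless the words are lowercased first.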