Adding words to nltk stoplist


Question

I have some code that removes stop words from my data set. The stop list doesn't remove most of the words I would like it to, so I'm looking to add words to it so that they are removed as well. The code I'm using to remove stop words is:

word_list2 = [w.strip() for w in word_list if w.strip() not in nltk.corpus.stopwords.words('english')]

I'm unsure of the correct syntax for adding words and can't seem to find it anywhere. Any help is appreciated. Thanks.

Recommended answer

You can simply use the append method to add a word to it:

import nltk

# words('english') returns a plain Python list, so list methods apply
stopwords = nltk.corpus.stopwords.words('english')
stopwords.append('newWord')

Or use extend to add a list of words, as suggested by Charlie in the comments:

stopwords = nltk.corpus.stopwords.words('english')
newStopWords = ['stopWord1','stopWord2']
stopwords.extend(newStopWords)
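
To tie this back to the filtering line in the question: the extended list only has an effect if the filter uses it. As far as I know, each call to nltk.corpus.stopwords.words('english') builds a fresh list, so calling it again inside the comprehension would ignore the added words. The sketch below is one way to put the pieces together. The helper names build_stop_set and remove_stop_words are hypothetical, and the short base list is only a stand-in for nltk.corpus.stopwords.words('english') so that the example runs without the NLTK corpus installed.

```python
def build_stop_set(base_words, extra_words):
    """Combine the base stop list with the extra words into one set.

    A set gives fast membership tests compared with scanning a list.
    (Hypothetical helper, not part of NLTK.)
    """
    return set(base_words) | set(extra_words)


def remove_stop_words(word_list, stop_set):
    """Strip whitespace from each word and drop those in stop_set.

    Mirrors the list comprehension from the question, but looks words
    up in the combined set instead of the stock NLTK list.
    """
    stripped = (w.strip() for w in word_list)
    return [w for w in stripped if w not in stop_set]


# Stand-in for nltk.corpus.stopwords.words('english') (assumption:
# in real use, pass that list here instead).
base = ['the', 'is', 'a']
newStopWords = ['stopWord1', 'stopWord2']

stop_set = build_stop_set(base, newStopWords)

word_list = [' the ', 'cat', 'stopWord1', 'is', 'here ']
word_list2 = remove_stop_words(word_list, stop_set)
print(word_list2)
```

The comparison is case-sensitive, as in the original one-liner, so 'The' would not be removed unless the words are lowercased first.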
