Extract non-content English language words string - python
Question
I am working on a Python script in which I want to remove common English words like "the", "an", "and", "for" and many more from a string. Currently what I have done is build a local list of all such words, and I just call remove()
to remove them from the string. But I would like a more Pythonic way to achieve this. I have read about nltk and wordnet but am totally clueless about whether that is what I should use and how to use it.
EDIT
Well, I don't understand why this was marked as duplicate, as my question does not in any way mean that I knew about stop words and just wanted to know how to use them. The question was about what I could use in my scenario, and the answer to that was stop words, but when I posted this question I didn't know anything about stop words.
Answer
I have found that what I was looking for is this:
from nltk.corpus import stopwords
my_stop_words = stopwords.words('english')
Now I can remove or replace the words in my list/string wherever I find a match in my_stop_words, which is a list.
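The filtering step can be sketched as below. To keep the sketch runnable without NLTK installed, the stop-word set here is a hardcoded subset for illustration; in practice it would come from stopwords.words('english') as shown above, and the sample sentence is made up:

```python
# In practice:
#   from nltk.corpus import stopwords
#   stop = set(stopwords.words('english'))
# Hardcoded subset here so the sketch runs without the NLTK corpus.
stop = {"the", "an", "and", "for", "over"}

sentence = "the quick brown fox jumps over the lazy dog"

# Keep only the words that are not in the stop-word set
content_words = [w for w in sentence.lower().split() if w not in stop]
print(" ".join(content_words))  # quick brown fox jumps lazy dog
```

Converting the stop-word list to a set makes each membership test O(1), which matters when filtering large texts.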
For this to work I had to install NLTK for Python and then, using its downloader, download the stopwords package.
It also contains many other packages which can be used in different NLP situations, like words, brown, wordnet, etc.