Apply NLTK Rake to each row in a DataFrame

Question

I'd like to apply the Rake function (https://pypi.org/project/rake-nltk/) to each row in my dataframe.

I can apply the function individually to a specific row, but not append it to the dataframe.

This is what I have so far:

from rake_nltk import Metric, Rake

r = Rake(ranking_metric=Metric.DEGREE_TO_FREQUENCY_RATIO, language='english', min_length=1, max_length=4)
r.extract_keywords_from_text(test.document[177])
r.get_ranked_phrases()  # prints a list of keywords
test['keywords'] = test.applymap(lambda x: r.extract_keywords_from_text(x))  # trying to apply it to each row

It just runs indefinitely. I just want to append a new column to my dataframe 'test' called "keywords" that has the list of keywords from r.get_ranked_phrases().

Recommended answer

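r.extract_keywords_from_text(x) returns None: it only stores the extracted keywords on the Rake object, so you have to call r.get_ranked_phrases() afterwards to get the list. Also, applymap runs the function on every cell of the DataFrame; to process one text column, use apply on that column instead.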
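To see why the original assignment cannot produce keywords, here is a minimal sketch of the two-step call pattern. `ToyExtractor` and `keywords` are hypothetical stand-ins written for this illustration; they only mimic the method names used with Rake and do no real ranking.

```python
class ToyExtractor:
    """Hypothetical stand-in that mimics Rake's two-step call pattern."""

    def __init__(self):
        self._phrases = []

    def extract_keywords_from_text(self, text):
        # Stores the result on the instance and returns nothing (None).
        # No real ranking here: it just keeps words longer than 3 characters.
        self._phrases = [word for word in text.split() if len(word) > 3]

    def get_ranked_phrases(self):
        # Returns whatever the last extract call stored.
        return list(self._phrases)


def keywords(extractor, text):
    """Run both steps and return the stored phrases."""
    extractor.extract_keywords_from_text(text)
    return extractor.get_ranked_phrases()


toy = ToyExtractor()
texts = ["clustering and cloud", "fraud detection"]

# Collecting the return value of the first step yields only None values.
wrong = [toy.extract_keywords_from_text(t) for t in texts]
# Wrapping both steps in one function yields the phrases.
right = [keywords(toy, t) for t in texts]

print(wrong)  # [None, None]
print(right)  # [['clustering', 'cloud'], ['fraud', 'detection']]
```

The same reasoning applies to a lambda passed to apply: it has to return the result of the second call, not the first.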

import pandas as pd
from rake_nltk import Rake

r = Rake()

df = pd.DataFrame(
    data=['machine learning and fraud detection are a must learn',
          'monte carlo method is great and so is hmm,pca, svm and neural net',
          'clustering and cloud',
          'logistical regression and data management and fraud detection'],
    columns=['Comments'])


def rake_implement(x, r):
    # extract_keywords_from_text() stores its result on r and returns None,
    # so fetch the phrases with a second call.
    r.extract_keywords_from_text(x)
    return r.get_ranked_phrases()


# Apply to the text column only (a Series), not to every cell of the DataFrame.
df['new_col'] = df['Comments'].apply(lambda x: rake_implement(x, r))
print(df['new_col'])
# Output:
# 0      [must learn, machine learning, fraud detection]
# 1    [monte carlo method, neural net, svm, pca, hmm...
# 2                                  [clustering, cloud]
# 3    [logistical regression, fraud detection, data ...
# Name: new_col, dtype: object
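
For the DataFrame in the question, the same idea could be wrapped in a small helper. This is only a sketch: `add_keywords` and its `new_column` parameter are hypothetical names introduced here, and the helper assumes the extractor exposes the two Rake methods used above and that the text column contains strings.

```python
import pandas as pd


def add_keywords(df, text_column, extractor, new_column="keywords"):
    """Return a copy of df with one list of ranked phrases per row.

    `extractor` is assumed to provide extract_keywords_from_text() and
    get_ranked_phrases(), as a Rake instance does.
    """
    def keywords_for(text):
        extractor.extract_keywords_from_text(text)  # stores result, returns None
        return extractor.get_ranked_phrases()       # fetch the stored phrases

    out = df.copy()
    # Series.apply visits only the chosen column, one value per row.
    out[new_column] = out[text_column].apply(keywords_for)
    return out
```

With the objects from the question, the call would look like `test = add_keywords(test, 'document', r)`, where `r` is the Rake instance configured with `Metric.DEGREE_TO_FREQUENCY_RATIO`. Note that rake-nltk relies on NLTK data such as the stopwords corpus, which may need to be downloaded once before the first run.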
