How to split text data and count the number of occurrences in a pandas DataFrame?
Problem description
I have data in a DataFrame in the following format:
import pandas as pd

df = pd.DataFrame([
[42,{"tags":["illustration","logo","design","ui"]}],
[81,{"tags":["typography","icon","vector","ux"]}],
[98,{"tags":["branding","app"]}],
[52,{"tags":["animation","web","flat"]}],
[17,{"tags":["type","lettering"]}],
[37,{"tags":["illustration","typography","branding","typography","branding"]}],
[63,{"tags":["logo","icon","app","web","lettering"]}],
[47,{"tags":["ui","ux"]}],
[6,{"tags":["design","vector","icon","flat","lettering","branding","app"]}],
[53,{"tags":["ui","ux","lettering","branding","app","animation","web","flat"]}],
[64,{"tags":["branding","app","typography","branding"]}],
[89,{"tags":["typography","branding","ux","lettering","branding"]}]
],columns=["_id","tags"])
I want to count how many ids have each specific number of tags (the distribution of tag counts per post), so for the data above the result would be:
Number of posts  Number of tags
3                2
1                3
3                4
3                5
1                7
1                8
How should I handle the text tags in the given format for this task?
Thanks
Recommended answer
Use the DataFrame constructor with Counter and a list comprehension that takes the length of each tags list:
from collections import Counter

# Count how many posts have each tag-list length
c = Counter([len(x['tags']) for x in df['tags']])
df = pd.DataFrame({'Number of posts': list(c.values()), 'Number of tags': list(c.keys())})
print(df)
   Number of posts  Number of tags
0                3               4
1                3               2
2                1               3
3                3               5
4                1               7
5                1               8
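The Counter output above follows the order in which each length was first seen, not the ascending order shown in the question. A minimal sketch of one way to sort it, assuming a small subset of the question's rows as sample data; the names `sample` and `dist` are illustrative and not part of the original answer:

```python
from collections import Counter

import pandas as pd

# Sample data in the same shape as the question (a subset of its rows).
sample = pd.DataFrame([
    [42, {"tags": ["illustration", "logo", "design", "ui"]}],
    [98, {"tags": ["branding", "app"]}],
    [52, {"tags": ["animation", "web", "flat"]}],
    [17, {"tags": ["type", "lettering"]}],
    [47, {"tags": ["ui", "ux"]}],
], columns=["_id", "tags"])

# Count how many posts have each tag-list length.
c = Counter(len(x["tags"]) for x in sample["tags"])

# sorted(c.items()) yields (length, count) pairs ordered by length.
dist = pd.DataFrame(sorted(c.items()),
                    columns=["Number of tags", "Number of posts"])

# Reorder the columns to match the layout requested in the question.
dist = dist[["Number of posts", "Number of tags"]]
print(dist)
```

Sorting the `(length, count)` pairs before building the frame keeps the Counter approach and changes only the row order.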
Or use apply with value_counts:
# df here is the original DataFrame from the question,
# not the result of the previous snippet
df = (df['tags'].apply(lambda x: len(x['tags']))
                .value_counts()
                .rename_axis('Number of tags')
                .reset_index(name='Number of posts')
                [['Number of posts', 'Number of tags']])
print(df)
   Number of posts  Number of tags
0                3               5
1                3               4
2                3               2
3                1               8
4                1               7
5                1               3
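value_counts orders rows by frequency, so the result above is not sorted by number of tags. A minimal sketch that adds sort_index to get ascending tag counts, again assuming a small subset of the question's rows; the names `posts`, `lengths` and `by_length` are illustrative only:

```python
import pandas as pd

# Sample data in the same shape as the question (a subset of its rows).
posts = pd.DataFrame([
    [81, {"tags": ["typography", "icon", "vector", "ux"]}],
    [98, {"tags": ["branding", "app"]}],
    [52, {"tags": ["animation", "web", "flat"]}],
    [47, {"tags": ["ui", "ux"]}],
    [64, {"tags": ["branding", "app", "typography", "branding"]}],
], columns=["_id", "tags"])

# Length of the tag list stored under the "tags" key of each dict.
lengths = posts["tags"].apply(lambda d: len(d["tags"]))

by_length = (lengths.value_counts()   # posts per tag-list length
                    .sort_index()     # order by length, not by frequency
                    .rename_axis("Number of tags")
                    .reset_index(name="Number of posts")
                    [["Number of posts", "Number of tags"]])
print(by_length)
```

Note that repeated tags within one post (for example "branding" appearing twice) are counted as separate entries here, as in the original answer.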