Count the frequency of characters at a position in a string in a Pandas DataFrame column

Views: 375  Published: 2020/5/24 4:14:25  Tags: python, pandas


Problem description

I have a question related to the df['columnA'].value_counts() method and a previous post here: Count frequency of values in pandas DataFrame column

Take this example DataFrame:

import pandas as pd

fake_data = {'columnA': ['XAVY', 'XAVY', 'XAVY', 'XAVY', 'XAVY', 'AXYV', 'AXYV', 'AXYV', 'AXYV', 'AXYV', 'AXYV']}
df = pd.DataFrame(fake_data, columns=['columnA'])
df

I am trying to determine the frequency of each letter (X, A, V, Y) at each position in the strings of this column.

In this example, position 0 would be about 54.5% A and 45.5% X (6 and 5 of the 11 rows), position 3 would be about 45.5% Y and 54.5% V, and so on.
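As a quick check of those numbers for a single position, value_counts() can be applied to one character slice of the column. This is only a minimal sketch, assuming the same example data as above; the name pos0 is made up for illustration.

```python
import pandas as pd

# Same example data as in the question: 5 x 'XAVY' and 6 x 'AXYV'
df = pd.DataFrame({'columnA': ['XAVY'] * 5 + ['AXYV'] * 6})

# .str[0] takes the first character of every string;
# normalize=True makes value_counts return fractions instead of counts
pos0 = df['columnA'].str[0].value_counts(normalize=True) * 100
print(pos0.round(2))
```

Changing the index in .str[0] to 1, 2 or 3 gives the other positions one at a time.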

Recommended answer

Maybe this helps:

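# Split every string into single characters, one column per position (n=4 matches the string length here);
# column 0 holds the empty piece before the first character, so it is dropped
new_data = df.columnA.str.split('', n=4, expand=True).drop(0, axis=1)
# Count how often each letter occurs at each position
stats = new_data.apply(pd.Series.value_counts)
# Convert counts to percentages; letters that never occur at a position become 0
stats = stats.apply(lambda x: (x/x.sum())*100).round(2).fillna(0)
print(stats)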

Output

       1      2      3      4
A  54.55  45.45   0.00   0.00
V   0.00   0.00  45.45  54.55
X  45.45  54.55   0.00   0.00
Y   0.00   0.00  54.55  45.45
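A variation that does not hardcode the string length might look like the sketch below. It is not the answer's method: it swaps the empty-pattern split for a plain list() of each string, and the names chars and freq are made up for illustration. Positions are 0-based here, unlike the 1 to 4 column labels above.

```python
import pandas as pd

# Same example data as in the question: 5 x 'XAVY' and 6 x 'AXYV'
df = pd.DataFrame({'columnA': ['XAVY'] * 5 + ['AXYV'] * 6})

# One row per string, one column per character position (0-based)
chars = pd.DataFrame(df['columnA'].apply(list).tolist(), index=df.index)

# Share of each letter at each position, as a percentage;
# letters that never occur at a position become 0
freq = chars.apply(lambda col: col.value_counts(normalize=True)).fillna(0) * 100
print(freq.round(2))
```

If the strings had different lengths, the shorter ones would be padded with missing values, which value_counts skips by default, so each position would be normalized over the strings that actually reach it.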
