Getting stats about each row and putting them into a new column (pandas)
Problem description
I have a DataFrame with some values:
|in|x|y|z|
+--+-+-+-+
| 1|a|a|b|
| 2|a|b|b|
| 3|a|b|c|
| 4|b|b|c|
For each row, I would like to get the number of unique values and the number of values that are not equal to the value in column x. The result should look like this:
|in|x|y|z  |count of not x|unique|
+--+-+-+---+--------------+------+
| 1|a|a|b  |             1|     2|
| 2|a|b|b  |             2|     2|
| 3|a|b|c  |             2|     3|
| 4|b|b|nan|             0|     1|
I could come up with some dirty workarounds here, but there must be an elegant way of doing this. The options I keep circling around are: drop_duplicates (which does not work on a Series); converting to an array and calling .unique(); df.iterrows(), which I want to avoid; and .apply on each row.
Recommended answer
Here are solutions using apply:
df['count of not x'] = df.apply(lambda x: (x[['y','z']] != x['x']).sum(), axis=1)
df['unique'] = df.apply(lambda x: x[['x','y','z']].nunique(), axis=1)
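For anyone who wants to try the two lines above, here is a minimal runnable sketch. It assumes the frame is built from the question's input table as given (row 4 has z = c, not nan), so the numbers for row 4 differ from the expected-output table. The lambda argument is renamed to `row` only to avoid confusion with column x.

```python
import pandas as pd

# Sample frame copied from the question's input table (assumption: z is "c" in row 4).
df = pd.DataFrame({
    "in": [1, 2, 3, 4],
    "x": ["a", "a", "a", "b"],
    "y": ["a", "b", "b", "b"],
    "z": ["b", "b", "c", "c"],
})

# Row-wise: compare y and z with that row's x, then count the mismatches.
df["count of not x"] = df.apply(
    lambda row: (row[["y", "z"]] != row["x"]).sum(), axis=1
)

# Row-wise: number of distinct values among x, y and z.
df["unique"] = df.apply(lambda row: row[["x", "y", "z"]].nunique(), axis=1)

print(df)
```

Both calls run a Python function once per row, so they are easy to read but can be slow on large frames.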
A non-apply solution for getting the count of not x:
df['count of not x'] = (~df[['y','z']].isin(df['x'])).sum(1)
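The line above works because `DataFrame.isin` matches on the index when it is given a Series, so each cell is compared with x from the same row. The sketch below checks that on the question's input data and shows an equivalent spelling with `DataFrame.ne`; the variable names `via_isin` and `via_ne` are illustrative only.

```python
import pandas as pd

# Sample frame copied from the question's input table.
df = pd.DataFrame({
    "in": [1, 2, 3, 4],
    "x": ["a", "a", "a", "b"],
    "y": ["a", "b", "b", "b"],
    "z": ["b", "b", "c", "c"],
})

# isin with a Series aligns on the index: True where the cell equals x in the same row.
via_isin = (~df[["y", "z"]].isin(df["x"])).sum(axis=1)

# ne(..., axis=0) aligns x with the row index and marks cells that differ from it.
via_ne = df[["y", "z"]].ne(df["x"], axis=0).sum(axis=1)

print(via_isin.tolist())
print(via_ne.tolist())
```

The `ne` form may read more directly as "not equal to x"; on this data the two give the same counts.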
I can't think of anything great for unique. This still uses apply, but may be faster, depending on the shape of the data:
df['unique'] = df[['x','y','z']].T.apply(lambda x: x.nunique())
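As a possible alternative, not part of the original answer, `DataFrame.nunique` accepts `axis=1` and gives the per-row distinct count without an explicit transpose or lambda. The sketch below compares it with the transposed apply on the question's input data. It is tidier to write, but I would not assume it is faster.

```python
import pandas as pd

# Sample frame copied from the question's input table.
df = pd.DataFrame({
    "in": [1, 2, 3, 4],
    "x": ["a", "a", "a", "b"],
    "y": ["a", "b", "b", "b"],
    "z": ["b", "b", "c", "c"],
})

cols = ["x", "y", "z"]

# Transposed apply, as in the answer: each original row becomes a column.
unique_transposed = df[cols].T.apply(lambda col: col.nunique())

# Built-in row-wise distinct count.
unique_rowwise = df[cols].nunique(axis=1)

print(unique_rowwise.tolist())
```

`nunique` ignores NaN by default (`dropna=True`), so a row such as b, b, nan counts as one unique value, which matches the unique column in the expected output.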