Getting stats about each row and putting them into a new column (pandas)

Question

I have a dataframe with some values:

|in|x|y|z|
+--+-+-+-+
| 1|a|a|b|
| 2|a|b|b|
| 3|a|b|c|
| 4|b|b|c|

I would like to get the number of unique values in each row, and the number of values in each row that differ from the value in column x. The result should look like this:

|in | x | y | z | count of not x | unique |
+---+---+---+---+----------------+--------+
| 1 | a | a | b |              1 |      2 |
| 2 | a | b | b |              2 |      2 |
| 3 | a | b | c |              2 |      3 |
| 4 | b | b |nan|              0 |      1 |

I could come up with some hacky solutions, but there must be a more elegant way. The ideas I keep circling around are drop_duplicates (which does not seem to work on a Series); converting to an array and calling .unique(); df.iterrows(), which I want to avoid; and .apply on each row.

Recommended answer

Here are solutions using apply:

df['count of not x'] = df.apply(lambda row: (row[['y','z']] != row['x']).sum(), axis=1)
df['unique'] = df.apply(lambda row: row[['x','y','z']].nunique(), axis=1)
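
To try the apply version end to end, here is a minimal sketch that rebuilds the sample frame. It assumes row 4 has a missing z, as in the expected output table (the input table shows c there); the argument name `row` is only a readability choice.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the NaN in row 4 is an assumption
# taken from the expected output table.
df = pd.DataFrame({
    "in": [1, 2, 3, 4],
    "x": ["a", "a", "a", "b"],
    "y": ["a", "b", "b", "b"],
    "z": ["b", "b", "c", np.nan],
})

# For each row, count how many of y and z differ from x.
df["count of not x"] = df.apply(
    lambda row: (row[["y", "z"]] != row["x"]).sum(), axis=1
)

# For each row, count the distinct non-missing values among x, y, z.
df["unique"] = df.apply(lambda row: row[["x", "y", "z"]].nunique(), axis=1)

print(df)
```

On this data the apply version gives 1 rather than 0 for row 4, because a missing value compares as "not equal" to 'b'. nunique ignores NaN by default, so unique comes out as 1 for that row.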

A non-apply solution for getting the count of not x:

df['count of not x'] = (~df[['y','z']].isin(df['x'])).sum(axis=1)
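
When isin is given a Series, pandas matches it against the frame by index label, so the line above compares each row's y and z with that row's x. A possibly clearer spelling of the same comparison is `ne(..., axis=0)`. The sketch below also masks out missing values so that row 4 yields 0, as in the expected output; that masking step is an assumption about the intended result, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the NaN in row 4 is an assumption
# taken from the expected output table.
df = pd.DataFrame({
    "in": [1, 2, 3, 4],
    "x": ["a", "a", "a", "b"],
    "y": ["a", "b", "b", "b"],
    "z": ["b", "b", "c", np.nan],
})

others = df[["y", "z"]]

# axis=0 aligns df["x"] with the row index, so every cell in a row
# is compared with that row's x value.
differs = others.ne(df["x"], axis=0)

# Ignore missing cells so NaN is not counted as "not x".
df["count of not x"] = (differs & others.notna()).sum(axis=1)

print(df)
```

Dropping the `& others.notna()` term reproduces the behaviour of the isin version, which counts the missing z in row 4 as different from x.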

I can't think of anything great for unique. This still uses apply, but may be faster, depending on the shape of the data:

df['unique'] = df[['x','y','z']].T.apply(lambda x: x.nunique())
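
As an alternative to transposing, DataFrame.nunique accepts axis=1 and counts distinct values per row directly. This is a sketch on the sample data (with row 4's z assumed missing, as in the expected output). It is mainly a readability improvement; it may not be faster than the apply variants.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the NaN in row 4 is an assumption
# taken from the expected output table.
df = pd.DataFrame({
    "in": [1, 2, 3, 4],
    "x": ["a", "a", "a", "b"],
    "y": ["a", "b", "b", "b"],
    "z": ["b", "b", "c", np.nan],
})

# Count distinct values across x, y, z for each row.
# dropna=True is the default, so missing values are not counted.
df["unique"] = df[["x", "y", "z"]].nunique(axis=1)

print(df)
```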
