Expressing pandas subset using pipe

Views: 77 | Published: 2020/5/24 1:58:05 | Tags: python, pandas, pipe


Question

I have a dataframe that I subset like this:

   a  b   x  y
0  1  2   3 -1
1  2  4   6 -2
2  3  6   6 -3
3  4  8   3 -4

df = df[(df.a >= 2) & (df.b <= 8)]
df = df.groupby(df.x).mean()

How do I express this using the pandas pipe operator?

df = (df
      .pipe((x.a > 2) & (x.b < 6)
      .groupby(df.x)
      .apply(lambda x: x.mean())

Recommended answer

As long as a step can be written as something that takes a DataFrame (possibly with more arguments) and returns a DataFrame, you can use pipe. Whether there is any advantage to doing so is another question.

Here, e.g., you can use

df\
    .pipe(lambda df_, x, y: df_[(df_.a >= x) & (df_.b <= y)], 2, 8)\
    .pipe(lambda df_: df_.groupby(df_.x))\
    .mean()

Notice how the first stage is a lambda that takes 3 arguments, with 2 and 8 passed as parameters. That's not the only way to do so; it is equivalent to

    .pipe(lambda df_: df_[(df_.a >= 2) & (df_.b <= 8)])\
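For reference, here is a minimal runnable sketch of the two forms side by side, built on the sample data from the question. The names lo, hi, with_args and inline are illustrative only and not part of the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "x": [3, 6, 6, 3],
    "y": [-1, -2, -3, -4],
})

# Form 1: the bounds are forwarded by pipe as extra positional arguments.
with_args = (
    df
    .pipe(lambda df_, lo, hi: df_[(df_.a >= lo) & (df_.b <= hi)], 2, 8)
    .pipe(lambda df_: df_.groupby(df_.x))
    .mean()
)

# Form 2: the bounds are written directly into the lambda.
inline = (
    df
    .pipe(lambda df_: df_[(df_.a >= 2) & (df_.b <= 8)])
    .pipe(lambda df_: df_.groupby(df_.x))
    .mean()
)

print(with_args)
print(with_args.equals(inline))
```

Passing the bounds as arguments mainly pays off when the same function is reused with different thresholds; for a one-off filter the inline form is shorter.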

Note also that you can use

df\
    .pipe(lambda df_, x, y: df[(df.a >= x) & (df.b <= y)], 2, 8)\
    .groupby('x')\
    .mean()

Here the lambda takes df_ but operates on df, and the second pipe has been replaced with a groupby.

The first change works here, but is fragile. It happens to work only because this is the first pipe stage. In a later stage, the lambda might receive a DataFrame with one set of dimensions and attempt to filter it on a mask with another, for example.
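To make the fragility concrete, here is a small sketch of one way it can go wrong, assuming a hypothetical two-stage filter (this pipeline is not from the original answer). When a later stage ignores its argument and reads the outer df instead, the work of the earlier stage is silently discarded.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "x": [3, 6, 6, 3],
    "y": [-1, -2, -3, -4],
})

# Robust: every stage works only on the frame it receives.
robust = (
    df
    .pipe(lambda df_: df_[df_.a >= 2])
    .pipe(lambda df_: df_[df_.b <= 6])
)

# Fragile: the second stage accepts df_ but filters the outer df,
# so the a >= 2 filter from the first stage is lost without any error.
fragile = (
    df
    .pipe(lambda df_: df_[df_.a >= 2])
    .pipe(lambda df_: df[df.b <= 6])
)

print(list(robust.index))   # rows that pass both filters
print(list(fragile.index))  # includes a row the first stage had removed
```

No exception is raised in the fragile version, which is what makes this kind of mistake easy to miss.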

The second change is fine. In fact, I think it is more readable. Basically, anything that takes a DataFrame and returns one can either be called directly or through pipe.
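As a rough illustration of that equivalence, the sketch below uses two hypothetical helper functions, subset and mean_by_x (names invented here, not taken from the answer), and calls them directly, through pipe, and mixed with an ordinary method chain.

```python
import pandas as pd

def subset(frame, min_a, max_b):
    """Keep rows with a >= min_a and b <= max_b (hypothetical helper)."""
    return frame[(frame["a"] >= min_a) & (frame["b"] <= max_b)]

def mean_by_x(frame):
    """Average the remaining columns within each value of x (hypothetical helper)."""
    return frame.groupby("x").mean()

df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "x": [3, 6, 6, 3],
    "y": [-1, -2, -3, -4],
})

# Called directly: reads inside-out.
direct = mean_by_x(subset(df, 2, 8))

# Called through pipe: reads top-to-bottom; keyword arguments are forwarded too.
piped = df.pipe(subset, min_a=2, max_b=8).pipe(mean_by_x)

# Mixed: pipe for the custom step, plain methods for the built-in ones.
chained = df.pipe(subset, 2, 8).groupby("x").mean()

print(piped)
```

All three produce the same frame; the choice is mostly about which order reads best in context.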
