Pandas - possible to aggregate two columns using two different aggregations?

Views: 74 Published: 2020/5/23 22:54:41 Tags: pandas aggregation

Problem description

I'm loading a csv file, which has the following columns: date, textA, textB, numberA, numberB

I want to group by the columns date, textA and textB, applying "sum" to numberA and "min" to numberB.

data = pd.read_table("file.csv", sep=",", thousands=',')
grouped = data.groupby(["date", "textA", "textB"], as_index=False)

...but I cannot see how to then apply two different aggregate functions to two different columns, i.e. sum(numberA) and min(numberB).

Recommended answer

The agg method can accept a dict, in which case each key names a column and each value names the function to apply to that column:

grouped.agg({'numberA':'sum', 'numberB':'min'})

For example,

import numpy as np
import pandas as pd
df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                         'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'number A': np.arange(8),
                   'number B': np.arange(8) * 2})
grouped = df.groupby('A')

print(grouped.agg({
    'number A': 'sum',
    'number B': 'min'}))

yields

     number A  number B
A                      
bar         9         2
foo        19         0
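
Applied to the layout in the question, the same dict form works with a multi-column group key. The sketch below uses a small made-up frame in place of file.csv (the rows are illustrative assumptions, not the asker's data). It also shows named aggregation, available in pandas 0.25 and later, which lets you choose the output column names; total_a and min_b are hypothetical names picked for this example.

```python
import pandas as pd

# Made-up sample rows standing in for file.csv (assumption: not the asker's data).
data = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-01", "2020-01-01", "2020-01-02"],
    "textA": ["x", "x", "y", "x"],
    "textB": ["p", "p", "q", "p"],
    "numberA": [1, 2, 3, 4],
    "numberB": [10, 5, 7, 8],
})

grouped = data.groupby(["date", "textA", "textB"], as_index=False)

# Dict form: keys are columns, values are the aggregation for each column.
by_dict = grouped.agg({"numberA": "sum", "numberB": "min"})

# Named aggregation (pandas >= 0.25): output_name=(column, function).
by_name = grouped.agg(total_a=("numberA", "sum"), min_b=("numberB", "min"))

print(by_dict)
print(by_name)
```

With as_index=False the group keys stay as ordinary columns, so both results keep date, textA and textB alongside the aggregated values.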

This also shows that Pandas can handle spaces in column names. I'm not sure what the origin of the problem was, but literal spaces should not have posed a problem. If you wish to investigate this further,

print(df.columns)

without reassigning the column names, will show us the repr of the names. Maybe there was a hard-to-see character in the column name that looked like a space (or some other character) but was actually, for example, a u'\xa0' (NO-BREAK SPACE).
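
One way to check for such a character is sketched below. The frame is hypothetical: its second column name is built with a no-break space on purpose, and replacing it with a regular space is only one possible fix.

```python
import pandas as pd

# Hypothetical frame whose second column name contains a NO-BREAK SPACE (\xa0).
df = pd.DataFrame({"number A": [1, 2], "number\xa0B": [3, 4]})

# repr() exposes characters that print() renders like an ordinary space.
reprs = [repr(name) for name in df.columns]
print(reprs)

# Collect the names that contain a no-break space.
suspicious = [name for name in df.columns if "\xa0" in name]
print(suspicious)

# One possible fix: replace the no-break space with a regular space.
cleaned = df.rename(columns=lambda name: name.replace("\xa0", " "))
print(list(cleaned.columns))
```

After the rename, the dict keys passed to agg can use plain spaces and will match the column names.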
