Python Pandas groupby forloop & Idxmax

Views: 444 | Published: 2018/1/28 13:55:07 | Tags: python, for-loop, pandas


Problem description


I have a DataFrame that must be grouped on three levels, and would then have the highest value returned. Each day there is a return for each unique value, and I would like to find the highest return and the details.

data.groupby(['Company','Product','Industry'])['ROI'].idxmax()

The return would show that:

Target   - Dish Soap - House       had a 5% ROI on 9/17
Best Buy - CDs       - Electronics had a 3% ROI on 9/3

was the highest.

Here's some example data:

+----------+-----------+-------------+---------+-----+
| Company  | Product   | Industry    | Date    | ROI |
+----------+-----------+-------------+---------+-----+
| Target   | Dish Soap | House       | 9/17/13 | 5%  |
| Target   | Dish Soap | House       | 9/16/13 | 2%  |
| BestBuy  | CDs       | Electronics | 9/1/13  | 1%  |
| BestBuy  | CDs       | Electronics | 9/3/13  | 3%  |
| ...

Not sure if this would be a for loop, or using .ix.

Solution

I think, if I understand you correctly, you could collect the index values in a Series using groupby and idxmax(), and then select those rows from the DataFrame using loc:

idx = data.groupby(['Company','Product','Industry'])['ROI'].idxmax()
data.loc[idx]

Another option is to use reindex:

data.reindex(idx)

On a (different) dataframe I happened to have handy, it appears reindex might be the faster option:

In [39]: %timeit df.reindex(idx)
10000 loops, best of 3: 121 us per loop

In [40]: %timeit df.loc[idx]
10000 loops, best of 3: 147 us per loop
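As a rough illustration of the groupby / idxmax / loc approach on the sample rows from the question, here is a minimal sketch. It rests on a few assumptions that are not stated in the original: the first table column is taken to be "Company", "Electroincs" is treated as a typo for "Electronics", and ROI is assumed to arrive as text such as "5%", which has to be converted to a number before idxmax can compare values.

```python
import pandas as pd

# Sample rows from the question. Assumptions: the first column is "Company"
# and "Electroincs" is a typo for "Electronics".
data = pd.DataFrame(
    {
        "Company": ["Target", "Target", "BestBuy", "BestBuy"],
        "Product": ["Dish Soap", "Dish Soap", "CDs", "CDs"],
        "Industry": ["House", "House", "Electronics", "Electronics"],
        "Date": ["9/17/13", "9/16/13", "9/1/13", "9/3/13"],
        "ROI": ["5%", "2%", "1%", "3%"],
    }
)

# Assumption: ROI is stored as text like "5%"; convert it to a float
# so that idxmax compares numbers rather than strings.
data["ROI"] = data["ROI"].str.rstrip("%").astype(float)

# For each (Company, Product, Industry) group, get the index label
# of the row with the highest ROI.
idx = data.groupby(["Company", "Product", "Industry"])["ROI"].idxmax()

# Select those rows, keeping every column (including Date).
best = data.loc[idx]
print(best)
```

With this data, best should hold the BestBuy row dated 9/3/13 and the Target row dated 9/17/13, which matches the result the question asks for. No explicit for loop is needed, since groupby handles the per-group iteration.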
