Reshape and filter pandas dataframe

Views: 65  Published: 2020/10/17 2:24:18  Tags: python pandas dataframe

Problem description

I would like to select every cell equal to 1 in the dataframe below (df1) and create a new dataframe in which each row holds the row label and column label of one such cell (as in df2 below):

import pandas as pd

# One record per ticker; keys are dates, values are 0/1 flags.
dict1 = [{'12/21/18': 0, '12/22/18': 0, '12/23/18': 1, '12/24/18': 1},
         {'12/21/18': 1, '12/22/18': 1, '12/23/18': 0, '12/24/18': 1},
         {'12/21/18': 0, '12/22/18': 1, '12/23/18': 0, '12/24/18': 0},
         {'12/21/18': 1, '12/22/18': 0, '12/23/18': 1, '12/24/18': 1}]

df1 = pd.DataFrame(dict1, index=['AAPL', 'CSCO', 'GE', 'MSFT'])

dict2 = [{'Ticker': 'AAPL','Date': '12/23/18'},
     {'Ticker': 'AAPL','Date': '12/24/18'},
     {'Ticker': 'CSCO','Date': '12/22/18'},
     {'Ticker': 'CSCO','Date': '12/24/18'},
     {'Ticker': 'GE',  'Date': '12/22/18'},
     {'Ticker': 'MSFT','Date': '12/24/18'}]


df2 = pd.DataFrame(dict2)

Can anyone suggest an approach?

Recommended answer

Here is a performance comparison of the methods given by @slayer and @Lucas H. I've also added a third approach.

@slayer method 
%%timeit 
1.12 ms ± 61.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

@Lucas H method
%%timeit
5.16 ms ± 735 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Third method
%%timeit
4.4 ms ± 232 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


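# Third method
# Transpose so that dates form the index and tickers form the columns.
dft = df1.T
# Keep 0 where the cell is 0; otherwise substitute that row's date.
# Passing the dates as a Series with axis=0 aligns them along the index.
dates = pd.Series(dft.index, index=dft.index)
df2 = pd.melt(dft.where(dft == 0, dates, axis=0))
# Drop the cells that were 0, leaving one (Ticker, Date) row per cell equal to 1.
df2 = df2[df2.value != 0]
df2.columns = ['Ticker', 'Date']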
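
For comparison, here is a minimal sketch of a melt-based variant that never mixes strings and integers in the same column. It is not one of the timed methods above; the names `records`, `long_df`, `pairs` and the helper column `flag` are my own choices for illustration.

```python
import pandas as pd

# Sample data copied from the question.
records = [
    {'12/21/18': 0, '12/22/18': 0, '12/23/18': 1, '12/24/18': 1},
    {'12/21/18': 1, '12/22/18': 1, '12/23/18': 0, '12/24/18': 1},
    {'12/21/18': 0, '12/22/18': 1, '12/23/18': 0, '12/24/18': 0},
    {'12/21/18': 1, '12/22/18': 0, '12/23/18': 1, '12/24/18': 1},
]
df1 = pd.DataFrame(records, index=['AAPL', 'CSCO', 'GE', 'MSFT'])

# Move the ticker labels into a column, then melt the date columns into rows.
long_df = (
    df1.rename_axis('Ticker')
       .reset_index()
       .melt(id_vars='Ticker', var_name='Date', value_name='flag')
)

# Keep only the cells equal to 1, drop the helper column, and order by ticker.
pairs = (
    long_df.loc[long_df['flag'] == 1, ['Ticker', 'Date']]
           .sort_values(['Ticker', 'Date'])
           .reset_index(drop=True)
)
print(pairs)
```

On the sample data this yields nine (Ticker, Date) pairs, one per cell equal to 1; the df2 shown in the question lists six of them.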

Clearly, @slayer's method is the fastest of the three.
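
To reproduce this kind of comparison outside a notebook, the standard-library `timeit` module can stand in for `%%timeit`. The sketch below is only an illustration: `via_melt` and `via_nonzero` are hypothetical helpers of my own, not the code of @slayer or @Lucas H, and the timings will depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Sample data copied from the question.
records = [
    {'12/21/18': 0, '12/22/18': 0, '12/23/18': 1, '12/24/18': 1},
    {'12/21/18': 1, '12/22/18': 1, '12/23/18': 0, '12/24/18': 1},
    {'12/21/18': 0, '12/22/18': 1, '12/23/18': 0, '12/24/18': 0},
    {'12/21/18': 1, '12/22/18': 0, '12/23/18': 1, '12/24/18': 1},
]
df1 = pd.DataFrame(records, index=['AAPL', 'CSCO', 'GE', 'MSFT'])


def via_melt(frame):
    """Hypothetical helper: melt to long form, then keep cells equal to 1."""
    long_df = (
        frame.rename_axis('Ticker')
             .reset_index()
             .melt(id_vars='Ticker', var_name='Date', value_name='flag')
    )
    return (
        long_df.loc[long_df['flag'] == 1, ['Ticker', 'Date']]
               .sort_values(['Ticker', 'Date'])
               .reset_index(drop=True)
    )


def via_nonzero(frame):
    """Hypothetical helper: locate cells equal to 1 with NumPy, then look up labels."""
    rows, cols = np.nonzero(frame.to_numpy() == 1)
    return pd.DataFrame({'Ticker': frame.index[rows], 'Date': frame.columns[cols]})


LOOPS = 100   # calls per timing run
REPEATS = 7   # number of timing runs

timings_ms = {}
for func in (via_melt, via_nonzero):
    runs = timeit.repeat(lambda: func(df1), number=LOOPS, repeat=REPEATS)
    # Convert each run's total seconds into milliseconds per call.
    timings_ms[func.__name__] = [1000 * total / LOOPS for total in runs]
    print(f"{func.__name__}: best of {REPEATS} = {min(timings_ms[func.__name__]):.3f} ms per loop")
```

Both helpers return the same (Ticker, Date) pairs on the sample frame, so the timings compare like with like. A four-by-four frame is tiny, so rankings measured on it may not carry over to larger data.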
