Pandas column multi-index to rows

Question

I'm using yfinance to download price history for multiple symbols, which returns a DataFrame with MultiIndex (multi-level) columns. For example:

import yfinance as yf
df = yf.download(tickers = ['AAPL', 'MSFT'], period = '2d')

A similar DataFrame can be constructed without yfinance:

import pandas as pd
pd.options.display.float_format = '{:.2f}'.format
import numpy as np

attributes = ['Adj Close', 'Close', 'High', 'Low', 'Open', 'Volume']
symbols = ['AAPL', 'MSFT']
dates = ['2020-07-23', '2020-07-24']
data = [[[371.38, 202.54], [371.38, 202.54], [388.31, 210.92], [368.04, 202.15], [387.99, 207.19], [49251100, 67457000]],
    [[370.46, 201.30], [370.46, 201.30], [371.88, 202.86], [356.58, 197.51], [363.95, 200.42], [46323800, 39799500]]]
data = np.array(data).reshape(len(dates),len(symbols) * len(attributes))

cols = pd.MultiIndex.from_product([attributes, symbols])
df = pd.DataFrame(data, index=dates, columns=cols)
df
           Adj Close           Close            High             Low            Open              Volume            
                AAPL    MSFT    AAPL    MSFT    AAPL    MSFT    AAPL    MSFT    AAPL    MSFT        AAPL        MSFT
2020-07-23    371.38  202.54  371.38  202.54  388.31  210.92  368.04  202.15  387.99  207.19  49251100.0  67457000.0
2020-07-24    370.46  201.30  370.46  201.30  371.88  202.86  356.58  197.51  363.95  200.42  46323800.0  39799500.0

Once I have this df, I want to restructure it so that there is one row per symbol and date. I'm currently doing this by looping through the list of symbols, calling the API once per symbol, and appending the results. I'm sure there must be a more efficient way:

df = pd.DataFrame()
symbols = ['AAPL', 'MSFT']

for x in range(0, len(symbols)):
    symbol = symbols[x]
    result = yf.download(tickers = symbol, start = '2020-07-23', end = '2020-07-25')
    result.insert(0, 'symbol', symbol)
    df = pd.concat([df, result])

Example of desired output:

df
           symbol        Open        High         Low       Close   Adj Close    Volume
Date                                                                                   
2020-07-23   AAPL  387.989990  388.309998  368.040009  371.380005  371.380005  49251100
2020-07-24   AAPL  363.950012  371.880005  356.579987  370.459991  370.459991  46323800
2020-07-23   MSFT  207.190002  210.919998  202.149994  202.539993  202.539993  67457000
2020-07-24   MSFT  200.419998  202.860001  197.509995  201.300003  201.300003  39799500

Recommended answer

This looks like a simple stacking operation. Let's go with:

df = yf.download(tickers = ['AAPL', 'MSFT'], period = '2d') # get yer data
df.stack(level=1).rename_axis(['Date', 'symbol']).reset_index(level=1)

           symbol   Adj Close  ...        Open    Volume
Date                           ...                      
2020-07-23   AAPL  371.380005  ...  387.989990  49251100
2020-07-23   MSFT  202.539993  ...  207.190002  67457000
2020-07-24   AAPL  370.459991  ...  363.950012  46323800
2020-07-24   MSFT  201.300003  ...  200.419998  39799500

[4 rows x 7 columns]
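
Note that the stacked result above is ordered by date first, while the desired output in the question groups rows by symbol and uses a specific column order. The following is a minimal sketch of the same stacking approach applied to the hand-built frame from the question, so it runs without yfinance or network access. The names `wide`, `tidy`, and `column_order` are illustrative, and the final sort, integer cast, and column reorder are assumptions about what "desired output" should look like rather than part of the original answer. Depending on the installed pandas version, `stack` may emit a FutureWarning about its newer implementation.

```python
import numpy as np
import pandas as pd

attributes = ['Adj Close', 'Close', 'High', 'Low', 'Open', 'Volume']
symbols = ['AAPL', 'MSFT']
dates = pd.to_datetime(['2020-07-23', '2020-07-24'])

# Shape (dates, attributes, symbols), flattened to one row per date.
data = np.array([
    [[371.38, 202.54], [371.38, 202.54], [388.31, 210.92],
     [368.04, 202.15], [387.99, 207.19], [49251100, 67457000]],
    [[370.46, 201.30], [370.46, 201.30], [371.88, 202.86],
     [356.58, 197.51], [363.95, 200.42], [46323800, 39799500]],
]).reshape(len(dates), len(attributes) * len(symbols))

# Two column levels: attribute (level 0) and symbol (level 1).
cols = pd.MultiIndex.from_product([attributes, symbols])
wide = pd.DataFrame(data, index=dates, columns=cols)

# Column order taken from the desired output in the question.
column_order = ['symbol', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume']

tidy = (
    wide.stack(level=1)                        # move the symbol level into the row index
        .rename_axis(['Date', 'symbol'])       # name both row-index levels
        .reset_index(level=1)                  # turn the symbol level into a regular column
        .sort_values('symbol', kind='stable')  # group by symbol, keeping date order within each
        .astype({'Volume': 'int64'})           # the mixed array stored volumes as floats
        .reindex(columns=column_order)
)
print(tidy)
```

With real yfinance data the same chain should apply to the frame returned by `yf.download`, provided the symbol is the second column level as in the example above.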
