Fill a Pandas dataframe using information from another Pandas dataframe

Views: 129 | Published: 2017/3/26 1:43:13 | Tags: python, python-2.7, dataframe, pandas


Problem description


I have one Pandas dataframe that contains information thus:

index       year  month day symbol transaction  nr_shares
2011-01-10  2011  1     10  AAPL       Buy       1500
2011-01-13  2011  1     13  GOOG       Sell      1000

and I would like to fill a second, zero-filled Pandas dataframe

index        AAPL  GOOG
2011-01-10     0     0
2011-01-11     0     0
2011-01-12     0     0
2011-01-13     0     0

using the information from the first dataframe so I get

index        AAPL  GOOG
2011-01-10   1500    0
2011-01-11     0     0
2011-01-12     0     0
2011-01-13     0  -1000

where it can be seen that on the relevant dates the buy and sell transactions for a specified number of shares have been entered in the appropriate column, with a positive number for a buy and a negative number for a sell order.

How can I accomplish this? Will I have to loop over the first dataframe index and check the symbol and transaction columns using nested "if" statements and then write to the second dataframe, or is there a more elegant dataframe method that I could use?

Solution

You could use pivot_table. Starting from (edited to be slightly more complicated):

>>> df1
        index  year  month  day symbol transaction  nr_shares
0  2011-01-10  2011      1   10   AAPL         Buy       1500
1  2011-01-10  2011      1   10   AAPL        Sell        200
2  2011-01-10  2011      1   10   GOOG        Sell        500
3  2011-01-10  2011      1   10   GOOG         Buy        600
4  2011-01-13  2011      1   13   GOOG        Sell       1000
>>> df2
        index  AAPL  GOOG
0  2011-01-10     0     0
1  2011-01-11     0     0
2  2011-01-12     0     0
3  2011-01-13     0     0
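
To reproduce the two frames printed above, something like the following should work. It is only a sketch: the dates are kept as plain strings in a column named "index", which is an assumption based on how the frames are displayed rather than something the question states.

```python
import pandas as pd

# Transactions, mirroring the df1 shown above (dates kept as plain strings).
df1 = pd.DataFrame({
    "index": ["2011-01-10", "2011-01-10", "2011-01-10", "2011-01-10", "2011-01-13"],
    "year": [2011] * 5,
    "month": [1] * 5,
    "day": [10, 10, 10, 10, 13],
    "symbol": ["AAPL", "AAPL", "GOOG", "GOOG", "GOOG"],
    "transaction": ["Buy", "Sell", "Sell", "Buy", "Sell"],
    "nr_shares": [1500, 200, 500, 600, 1000],
})

# Zero-filled target frame with one row per calendar day in the range.
dates = list(pd.date_range("2011-01-10", "2011-01-13").strftime("%Y-%m-%d"))
df2 = pd.DataFrame({"index": dates, "AAPL": 0, "GOOG": 0})

print(df1)
print(df2)
```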

We can sign the shares:

>>> df1["nr_shares"] = df1.apply(lambda row: row["nr_shares"] * (-1 if row["transaction"] == "Sell" else 1), axis=1)
>>> df1
        index  year  month  day symbol transaction  nr_shares
0  2011-01-10  2011      1   10   AAPL         Buy       1500
1  2011-01-10  2011      1   10   AAPL        Sell       -200
2  2011-01-10  2011      1   10   GOOG        Sell       -500
3  2011-01-10  2011      1   10   GOOG         Buy        600
4  2011-01-13  2011      1   13   GOOG        Sell      -1000
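
The row-wise apply can also be written as a column operation, which is usually faster on large frames. A minimal sketch, assuming the transaction column only ever holds "Buy" or "Sell" (the small frame here is a cut-down copy of df1 for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({
    "symbol": ["AAPL", "AAPL", "GOOG", "GOOG", "GOOG"],
    "transaction": ["Buy", "Sell", "Sell", "Buy", "Sell"],
    "nr_shares": [1500, 200, 500, 600, 1000],
})

# Map each transaction type to a sign. A label missing from the dict maps
# to NaN, so unexpected values show up instead of being counted as buys.
sign = df1["transaction"].map({"Buy": 1, "Sell": -1})
signed = df1["nr_shares"] * sign

print(signed.tolist())
```

Unlike the lambda above, which treats every label other than "Sell" as a buy, this version leaves a NaN wherever the label is unrecognised.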

And then you can pivot df1. By default it uses the mean of the aggregated values, but we want the sum:

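>>> # Current pandas names these keywords index= and columns=;
>>> # older releases spelled them rows= and cols=.
>>> a = df1.pivot_table(values="nr_shares", index="index", columns="symbol",
...                     aggfunc="sum")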
>>> a
symbol      AAPL  GOOG
index                 
2011-01-10  1300   100
2011-01-13   NaN -1000
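
To see why the aggregation function matters, the sketch below pivots the same signed data twice, once with the default (mean) and once with a sum. It uses the `index=` and `columns=` keywords of current pandas releases; the frame is a cut-down copy of the signed df1.

```python
import pandas as pd

df1 = pd.DataFrame({
    "index": ["2011-01-10", "2011-01-10", "2011-01-10", "2011-01-10", "2011-01-13"],
    "symbol": ["AAPL", "AAPL", "GOOG", "GOOG", "GOOG"],
    "nr_shares": [1500, -200, -500, 600, -1000],  # already signed
})

# Default aggregation: the mean of all trades sharing a (date, symbol) cell.
mean_table = df1.pivot_table(values="nr_shares", index="index", columns="symbol")

# What we actually want: the net number of shares per (date, symbol) cell.
sum_table = df1.pivot_table(values="nr_shares", index="index", columns="symbol",
                            aggfunc="sum")

print(mean_table)
print(sum_table)
```

For AAPL on 2011-01-10 the mean is (1500 - 200) / 2 = 650, whereas the sum is the net position of 1300.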

Give df2 the same kind of index as a, storing the result as b:

>>> b = df2.set_index("index")
>>> b
            AAPL  GOOG
index                 
2011-01-10     0     0
2011-01-11     0     0
2011-01-12     0     0
2011-01-13     0     0

And then add them:

>>> (a+b).fillna(0)
symbol      AAPL  GOOG
index                 
2011-01-10  1300   100
2011-01-11     0     0
2011-01-12     0     0
2011-01-13     0 -1000
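
An alternative to adding the two frames is to reindex the pivoted table onto the dates and symbols of the zero-filled frame. This is a sketch of a different route to the same result, not the method used in the answer; the frames a and b are rebuilt by hand here so the snippet runs on its own.

```python
import pandas as pd

# Pivoted net positions, as produced by the pivot_table step.
a = pd.DataFrame(
    {"AAPL": [1300.0, float("nan")], "GOOG": [100.0, -1000.0]},
    index=["2011-01-10", "2011-01-13"],
)

# Zero-filled frame that defines the wanted dates and symbols.
b = pd.DataFrame(
    0,
    index=["2011-01-10", "2011-01-11", "2011-01-12", "2011-01-13"],
    columns=["AAPL", "GOOG"],
)

# Align a to b's rows and columns, then turn every missing cell into 0.
filled = a.reindex(index=b.index, columns=b.columns).fillna(0).astype(int)

print(filled)
```

Reindexing makes it explicit that b only supplies the shape of the result; its zeros never contribute to the values.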
