Shifting elements of column based on index given condition on another column

Views: 166 Published: 2017/3/26 4:11:51 Tags: python pandas dataframe shift

Problem description

I have a dataframe (df) with 2 columns and 1 index.

The index is a datetime index in the format 2001-01-30, etc. It is ordered by date, and there are thousands of identical dates (the dates are monthly). Column A is the company name (corresponding to the date), and Column B holds the share prices of the companies in Column A on the date in the index.

There are multiple companies in Column A for each date, and the companies vary over time (so the data is not fully predictable).

I want to create a Column C which lags all the prices in B forward to the next date (as per the index).

A basic .shift() would not work, as I would require all the prices to be shifted based on the condition that the company is still there at the next point in the index.

I want a column C which shifts B forward by 1, and a column D which shifts it back by 1.
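To make the problem concrete, here is a minimal sketch of what goes wrong with a plain shift. The tickers X, Y, Z and the prices are hypothetical and are not taken from the real data; the column names A and B follow the description above.

```python
import pandas as pd

# Hypothetical toy data: two monthly dates, and the set of companies changes.
idx = pd.to_datetime(['2001-01-30', '2001-01-30', '2001-02-28', '2001-02-28'])
df = pd.DataFrame({'A': ['X', 'Y', 'X', 'Z'],
                   'B': [10.0, 20.0, 11.0, 30.0]}, index=idx)

# A plain shift moves values by row position and ignores the company,
# so company X on 2001-02-28 receives company Y's price from 2001-01-30.
df['naive'] = df['B'].shift(1)
print(df)
```

In this toy frame the row for X on 2001-02-28 ends up with 20.0, which is Y's price, rather than X's own earlier price of 10.0.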

I have been stuck on this for a while, somebody very smart please help.

Thanks

Solution

Consider the example dataframe df below:

    import numpy as np
    import pandas as pd

    np.random.seed([3, 1415])
    df = pd.concat(dict(
        A=pd.Series(np.random.rand(10), pd.date_range('2016-09-30', periods=10)),
        B=pd.Series(np.random.rand(7), pd.date_range('2016-09-25', periods=7)),
        C=pd.Series(np.random.rand(10), pd.date_range('2016-09-20', periods=10)),
        D=pd.Series(np.random.rand(8), pd.date_range('2016-10-30', periods=8)),
        E=pd.Series(np.random.rand(12), pd.date_range('2016-10-25', periods=12)),
        F=pd.Series(np.random.rand(14), pd.date_range('2016-08-30', periods=14)),
    )).rename_axis(['ColumnA', None]).reset_index('ColumnA', name='ColumnB')

    print(df.head(10))

                ColumnA   ColumnB
    2016-09-30        A  0.444939
    2016-10-01        A  0.407554
    2016-10-02        A  0.460148
    2016-10-03        A  0.465239
    2016-10-04        A  0.462691
    2016-10-05        A  0.016545
    2016-10-06        A  0.850445
    2016-10-07        A  0.817744
    2016-10-08        A  0.777962
    2016-10-09        A  0.757983

Use groupby + shift:

    # Move the company name into the index so each row is keyed by (date, company)
    d1 = df.set_index('ColumnA', append=True)
    # Group the prices by company so shifts never cross company boundaries
    g = d1.groupby(level='ColumnA').ColumnB
    keys = ['Forward', 'Back']
    new_df = d1.join(pd.concat([g.shift(i) for i in [-1, 1]], axis=1, keys=keys))
    print(new_df.query('ColumnA == "A"').head(10))

                         ColumnB   Forward      Back
               ColumnA
    2016-09-30 A        0.444939  0.407554       NaN
    2016-10-01 A        0.407554  0.460148  0.444939
    2016-10-02 A        0.460148  0.465239  0.407554
    2016-10-03 A        0.465239  0.462691  0.460148
    2016-10-04 A        0.462691  0.016545  0.465239
    2016-10-05 A        0.016545  0.850445  0.462691
    2016-10-06 A        0.850445  0.817744  0.016545
    2016-10-07 A        0.817744  0.777962  0.850445
    2016-10-08 A        0.777962  0.757983  0.817744
    2016-10-09 A        0.757983       NaN  0.777962
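
One caveat worth checking against your own data: a grouped shift moves values by row position within each company, so if a company is missing on some date, the shift reaches across the gap to that company's nearest available row. The question asks for the shift to apply only when the company is still present at the adjacent date, so the following is a hedged sketch of a date-aware variant. It is one reading of the requirement and not part of the original answer. The toy frame and the `GroupForward` column are hypothetical, and the approach assumes each (date, company) pair appears at most once.

```python
import pandas as pd

# Hypothetical toy data: company X is absent on the middle date.
dates = pd.to_datetime(['2001-01-31', '2001-01-31',
                        '2001-02-28',
                        '2001-03-31', '2001-03-31'])
df = pd.DataFrame({'ColumnA': ['X', 'Y', 'Y', 'X', 'Y'],
                   'ColumnB': [10.0, 20.0, 21.0, 12.0, 22.0]}, index=dates)

d1 = df.set_index('ColumnA', append=True)

# Wide layout: one row per date, one column per company, NaN where absent.
wide = d1['ColumnB'].unstack('ColumnA')

# Positions of every original row inside the wide layout.
rows = wide.index.get_indexer(d1.index.get_level_values(0))
cols = wide.columns.get_indexer(d1.index.get_level_values('ColumnA'))

# Shifting the wide frame moves by one date in the index, so a gap stays NaN.
d1['Forward'] = wide.shift(-1).to_numpy()[rows, cols]
d1['Back'] = wide.shift(1).to_numpy()[rows, cols]

# For contrast: the grouped shift reaches across X's missing date.
d1['GroupForward'] = d1.groupby(level='ColumnA')['ColumnB'].shift(-1)
print(d1)
```

For X on 2001-01-31, `Forward` is NaN because X has no row on 2001-02-28, whereas `GroupForward` is 12.0, taken from 2001-03-31. Which behaviour is right depends on whether a missing month should break the lag.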
