Slicing and assigning values in a multi-indexed pandas dataframe with unique sequential indices


Question

I want to select and change the value of a dataframe cell. The dataframe has a two-level index: 'datetime' and 'idx'. Both levels contain labels that are unique and sequential. The 'datetime' level holds labels of datetime type, and the 'idx' level holds integer labels.

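import numpy as np
import pandas as pd

# Five hourly timestamps and two integer columns derived from their positions.
dt = pd.date_range("2010-10-01 00:00:00", periods=5, freq='H')
d = {'datetime': dt, 'a': np.arange(len(dt))-1,'b':np.arange(len(dt))+1}
df = pd.DataFrame(data=d)
# Use 'datetime' as the index and sort it in descending order.
df.set_index(keys='datetime',inplace=True,drop=True)
df.sort_index(axis=0,level='datetime',ascending=False,inplace=True)

# Add a sequential integer column 'idx' (5..9) and append it as a second index level.
df.loc[:,'idx'] = np.arange(0, len(df),1)+5
df.set_index('idx',drop=True,inplace=True,append=True)
print(df)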

Here is the dataframe:

                         a  b
datetime            idx      
2010-10-01 04:00:00 5    3  5
2010-10-01 03:00:00 6    2  4
2010-10-01 02:00:00 7    1  3
2010-10-01 01:00:00 8    0  2
2010-10-01 00:00:00 9   -1  1

Say I want to get the row where idx=5. How do I do that? I could use this:

print(df.iloc[0])

Then I get the result below:

a    3
b    5
Name: (2010-10-01 04:00:00, 5), dtype: int32

But I want to access and set the value of the cell where idx=5 and column='a' by specifying the idx value and the column name 'a', rather than a row position. How do I do that?

Please advise.

Recommended answer

You can use the DataFrame.eval() method to build a boolean row mask if you need to set/update some cells:

In [61]: df.loc[df.eval('idx==5'), 'a'] = 100

In [62]: df
Out[62]:
                           a  b
datetime            idx
2010-10-01 04:00:00 5    100  5
2010-10-01 03:00:00 6      2  4
2010-10-01 02:00:00 7      1  3
2010-10-01 01:00:00 8      0  2
2010-10-01 00:00:00 9     -1  1

Explanation:

In [59]: df.eval('idx==5')
Out[59]:
datetime             idx
2010-10-01 04:00:00  5       True
2010-10-01 03:00:00  6      False
2010-10-01 02:00:00  7      False
2010-10-01 01:00:00  8      False
2010-10-01 00:00:00  9      False
dtype: bool

In [60]: df.loc[df.eval('idx==5')]
Out[60]:
                         a  b
datetime            idx
2010-10-01 04:00:00 5    3  5

PS: if your original MultiIndex doesn't have names, you can easily set them using the rename_axis() method:

df.rename_axis(('datetime','idx')).query(...)
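
As a rough illustration of the rename_axis() step, the sketch below uses a small hypothetical frame (unnamed) whose index levels have no names, names them, and then filters on 'idx' with query(). The frame contents and variable names are assumptions made for the example.

```python
import pandas as pd

# Hypothetical frame whose two index levels have no names.
unnamed = pd.DataFrame(
    {"a": [3, 2, 1], "b": [5, 4, 3]},
    index=pd.MultiIndex.from_arrays(
        [
            pd.to_datetime(["2010-10-01 04:00", "2010-10-01 03:00", "2010-10-01 02:00"]),
            [5, 6, 7],
        ]
    ),
)

# rename_axis() returns a new frame with named levels; 'unnamed' is left as it was.
named = unnamed.rename_axis(["datetime", "idx"])

# Named index levels can be referenced by name inside query()/eval() expressions.
row = named.query("idx == 5")
print(row)
```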

Alternative (a bit more expensive) solution: using sort_index() + pd.IndexSlice[]:

In [106]: df.loc[pd.IndexSlice[:,5], ['a']]
...
skipped
...
KeyError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (2), lexsort depth (0)'

So we would need to sort the index first:

In [107]: df.sort_index().loc[pd.IndexSlice[:,5], ['a']]
Out[107]:
                         a
datetime            idx
2010-10-01 04:00:00 5    3
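
Note that sort_index() returns a new frame, so to assign a value with this approach the sorted result has to be kept and the assignment made on it. A minimal sketch, assuming the same frame as in the question (rebuilt here so the snippet runs on its own); sorted_df is an illustrative name:

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question. The hourly step is given as a Timedelta
# so the snippet does not depend on a particular frequency alias.
dt = pd.date_range("2010-10-01 00:00:00", periods=5, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"a": np.arange(len(dt)) - 1, "b": np.arange(len(dt)) + 1}, index=dt)
df.index.name = "datetime"
df = df.sort_index(ascending=False)
df["idx"] = np.arange(len(df)) + 5
df = df.set_index("idx", append=True)

# Keep the lexsorted copy; assigning on a temporary would be lost.
sorted_df = df.sort_index()

# Select every 'datetime' label with idx == 5, column 'a', and assign to it.
sorted_df.loc[pd.IndexSlice[:, 5], "a"] = 100
print(sorted_df)
```

The rows of sorted_df are in ascending datetime order, and the original df keeps its descending order and its original values.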
