Add/fill pandas column based on range in rows from another dataframe
Question
Working with pandas, I have df1 indexed by time samples:
import io
import pandas as pd

data = '''\
time flags input
8228835.0 53153.0 32768.0
8228837.0 53153.0 32768.0
8228839.0 53153.0 32768.0
8228841.0 53153.0 32768.0
8228843.0 61345.0 32768.0'''
# io.StringIO replaces pd.compat.StringIO, which later pandas releases removed
fileobj = io.StringIO(data)
df1 = pd.read_csv(fileobj, sep=r'\s+', index_col='time')
df2 indicates time ranges with start and end to define ranges where the state of 'check' is True:
data = '''\
check start end
20536 True 8228837 8228993
20576 True 8232747 8232869
20554 True 8230621 8230761
20520 True 8227351 8227507
20480 True 8223549 8223669
20471 True 8221391 8221553'''
# io.StringIO (standard library, needs `import io`) replaces the removed pd.compat.StringIO
fileobj = io.StringIO(data)
df2 = pd.read_csv(fileobj, sep=r'\s+')
What I need to do is add a column for 'check' to df1 and fill out the actual time ranges defined in df2 with the value of True. All others should be False. An example result would be:
flags input check
time
8228835.0 53153.0 32768.0 False
8228837.0 53153.0 32768.0 True
8228839.0 53153.0 32768.0 True
8228841.0 53153.0 32768.0 True
8228843.0 61345.0 32768.0 True
....
8228994.0 12424.0 32768.0 False
Recommended answer
You can build the ranges from df2 and then use pd.Index.isin with itertools.chain:
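from itertools import chain

# keep only the rows of df2 where 'check' is True
df2 = df2[df2['check']]
# one range per row; note that range(start, end) excludes end itself
ranges = map(range, df2['start'], df2['end'])
# True where the time index falls inside any of the ranges
df1['check'] = df1.index.isin(chain.from_iterable(ranges))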
print(df1)
flags input check
time
8228835.0 53153.0 32768.0 False
8228837.0 53153.0 32768.0 True
8228839.0 53153.0 32768.0 True
8228841.0 53153.0 32768.0 True
8228843.0 61345.0 32768.0 True
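The answer above expands every range into individual integers, so it only matches integer-valued times and leaves out each end value. As a rough alternative sketch (not part of the original answer), the same column can be computed by comparing the index against the start and end arrays with NumPy broadcasting. This sketch assumes the end value should count as inside the range, which the question does not state. The names `active` and `inside` are illustrative only.

```python
import io

import pandas as pd

df1 = pd.read_csv(io.StringIO('''\
time flags input
8228835.0 53153.0 32768.0
8228837.0 53153.0 32768.0
8228839.0 53153.0 32768.0
8228841.0 53153.0 32768.0
8228843.0 61345.0 32768.0'''), sep=r'\s+', index_col='time')

df2 = pd.read_csv(io.StringIO('''\
check start end
20536 True 8228837 8228993
20576 True 8232747 8232869
20554 True 8230621 8230761
20520 True 8227351 8227507
20480 True 8223549 8223669
20471 True 8221391 8221553'''), sep=r'\s+')

# Keep only the ranges whose 'check' flag is True.
active = df2[df2['check']]

# Column vector of times, shape (len(df1), 1), so that comparing it with the
# 1-D start/end arrays broadcasts to a (len(df1), len(active)) boolean matrix.
times = df1.index.to_numpy()[:, None]
starts = active['start'].to_numpy()
ends = active['end'].to_numpy()

# Assumption: 'end' is treated as inclusive; use `<` instead of `<=` to
# reproduce the half-open behaviour of range(start, end).
inside = (times >= starts) & (times <= ends)

# A time is flagged when it lies inside at least one range.
df1['check'] = inside.any(axis=1)
print(df1)
```

The intermediate matrix has one cell per (time, range) pair, so this form suits a modest number of ranges. With very large inputs the memory cost may make the isin approach, or a sorted-interval lookup, the better choice.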