Pandas iterate over DataFrame row pairs


Question

How can I iterate over pairs of consecutive rows of a Pandas DataFrame?

For example:

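import pandas as pd

# Each row holds two integers and a [start, end] interval stored as a list.
content = [(1, 2, [1, 3]), (3, 4, [2, 4]), (5, 6, [6, 9]), (7, 8, [9, 10])]
df = pd.DataFrame(content, columns=["a", "b", "interval"])
print(df)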

Output:

   a  b interval
0  1  2   [1, 3]
1  3  4   [2, 4]
2  5  6   [6, 9]
3  7  8  [9, 10]

Now I would like to do something like this:

for (indx1, row1), (indx2, row2) in df.?:
    print("row1:\n", row1)
    print("row2:\n", row2)
    print("\n")

which should output:

row1:
a    1
b    2
interval    [1,3]
Name: 0, dtype: int64
row2:
a    3
b    4
interval    [2,4]
Name: 1, dtype: int64

row1:
a    3
b    4
interval    [2,4]
Name: 1, dtype: int64
row2:
a    5
b    6
interval    [6,9]
Name: 2, dtype: int64

row1:
a    5
b    6
interval    [6,9]
Name: 2, dtype: int64
row2:
a    7
b    8
interval    [9,10]
Name: 3, dtype: int64

Is there a built-in way to achieve this? I looked at df.groupby(df.index // 2) and df.itertuples, but neither of these seems to do what I want.

Edit: The overall goal is to get a list of bools indicating whether the intervals in the "interval" column of consecutive rows overlap. In the above example the list would be

overlaps = [True, False, False]

with one bool for each pair.

Recommended answer

If you want to keep the for loop, combining zip and iterrows is one way to do it:

# Pair each row with the row that follows it.
for (indx1, row1), (indx2, row2) in zip(df[:-1].iterrows(), df[1:].iterrows()):
    print("row1:\n", row1)
    print("row2:\n", row2)
    print("\n")

To access the next row at the same time, start the second iterrows one row later with df[1:].iterrows(); df[:-1] drops the last row, which has no successor. This gives the output in the form you want:

row1:
a    1
b    2
Name: 0, dtype: int64
row2:
a    3
b    4
Name: 1, dtype: int64


row1:
a    3
b    4
Name: 1, dtype: int64
row2:
a    5
b    6
Name: 2, dtype: int64


row1:
a    5
b    6
Name: 2, dtype: int64
row2:
a    7
b    8
Name: 3, dtype: int64
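
The same pairing can be used to build the list of overlap flags from the question. The following is a minimal sketch, not part of the original answer. The helper name intervals_overlap is hypothetical, and it assumes that intervals which only touch at an endpoint (such as [6, 9] and [9, 10]) do not count as overlapping, which is what the expected list in the question implies.

```python
import pandas as pd

content = [(1, 2, [1, 3]), (3, 4, [2, 4]), (5, 6, [6, 9]), (7, 8, [9, 10])]
df = pd.DataFrame(content, columns=["a", "b", "interval"])


def intervals_overlap(first, second):
    # Hypothetical helper: strict comparisons, so intervals that merely
    # touch at an endpoint are treated as non-overlapping (an assumption).
    return first[0] < second[1] and second[0] < first[1]


# Walk over consecutive row pairs and compare their intervals.
overlaps = [
    intervals_overlap(row1["interval"], row2["interval"])
    for (_, row1), (_, row2) in zip(df.iloc[:-1].iterrows(), df.iloc[1:].iterrows())
]
print(overlaps)
```

For the example DataFrame this yields one bool per consecutive pair, matching the list given in the question.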

But as @RafaelC said, a for loop might not be the best method for your general problem.
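
One loop-free alternative, sketched here under assumptions rather than taken from the original answer, is to split the interval column into numeric start and end columns and compare each row with the next one using shift. The column names start and end and the variable names are illustrative, and touching endpoints are again assumed not to overlap.

```python
import pandas as pd

content = [(1, 2, [1, 3]), (3, 4, [2, 4]), (5, 6, [6, 9]), (7, 8, [9, 10])]
df = pd.DataFrame(content, columns=["a", "b", "interval"])

# Split the list column into two numeric columns (names are illustrative).
bounds = pd.DataFrame(df["interval"].tolist(), columns=["start", "end"], index=df.index)

# shift(-1) aligns every row with the row that follows it;
# the last row gets NaN because it has no successor.
next_bounds = bounds.shift(-1)

# Two intervals overlap when each one starts before the other ends.
overlap_mask = (bounds["start"] < next_bounds["end"]) & (next_bounds["start"] < bounds["end"])

# Drop the last entry, which compares against NaN and carries no information.
overlaps = overlap_mask.iloc[:-1].tolist()
print(overlaps)
```

This avoids iterrows entirely, which tends to matter on large DataFrames, at the cost of building an intermediate frame for the interval bounds.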
