Using Merge on a column and Index in Pandas
Problem description
I have two separate dataframes that share a project number. In type_df, the project number is the index. In time_df, the project number is a column. I would like to count the number of rows in type_df that have a Project Type of 2. I am trying to do this with pandas.merge(). It works great when both keys are columns, but not when one of them is an index. I'm not sure how to reference the index, or whether merge is even the right way to do this.
import pandas as pd

type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12],
                             ['Project2', 41]],
                       columns=['Project', 'Time'])
# Fails here: `index` is an undefined name, so Python raises NameError.
merged = pd.merge(time_df, type_df, on=[index, 'Project'])
print(merged[merged['Project Type'] == 'Type 2']['Project Type'].count())
Error:
Name 'Index' is not defined.
Desired output:
2
Recommended answer
If you want to use an index in your merge, specify left_index=True or right_index=True for the side whose key is the index, and use left_on or right_on to name the key column on the other side. For you it should look something like this:
merged = pd.merge(type_df, time_df, left_index=True, right_on='Project')
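Putting the pieces together, here is a minimal end-to-end sketch that rebuilds the two frames from the question, applies the merge above, and counts the 'Type 2' rows. The variable names type2_count, joined and joined_count are illustrative and not from the original post. The DataFrame.join variant at the end is an alternative not mentioned in the answer; it should give the same count here because join with on= matches a column of the caller against the index of the other frame.

```python
import pandas as pd

# Frames from the question: type_df is keyed by its index,
# time_df carries the project number in the 'Project' column.
type_df = pd.DataFrame(
    data=[["Type 1"], ["Type 2"]],
    columns=["Project Type"],
    index=["Project2", "Project1"],
)
time_df = pd.DataFrame(
    data=[["Project1", 13], ["Project1", 12], ["Project2", 41]],
    columns=["Project", "Time"],
)

# Merge the left frame's index against the right frame's 'Project' column.
merged = pd.merge(type_df, time_df, left_index=True, right_on="Project")

# Count the merged rows whose project type is 'Type 2'.
type2_count = int((merged["Project Type"] == "Type 2").sum())
print(type2_count)  # 2

# Alternative (an assumption, not part of the original answer):
# join time_df's 'Project' column against type_df's index.
joined = time_df.join(type_df, on="Project")
joined_count = int((joined["Project Type"] == "Type 2").sum())
print(joined_count)  # 2
```

Both counts are 2 because Project1 is the only 'Type 2' project and it appears twice in time_df.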