Python pandas merge KeyError
Question
I consistently get a KeyError when I try to merge two DataFrames. The code:
c = pd.merge(a, b, on='video_id', how='left')
Based on internet research, I double-checked the dtype and coerced both columns to int:
a = pd.read_csv(filename, index_col=False, dtype={'video_id': np.int64}, low_memory=False)
b = pd.read_csv(videoinfo, index_col=False, dtype={'video_id': np.int64})
I renamed the columns (to make sure they match):
a.columns.values[2] = "video_id"
b.columns.values[0] = "video_id"
I coerced both to DataFrames:
c = pd.merge(pd.DataFrame(a), pd.DataFrame(b), on='video_id', how='left')
I'm out of ideas as to why I'm still getting the KeyError. It's always "KeyError: 'video_id'".
Recommended answer
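Be careful not to rename columns by assigning into df.columns.values. That writes to the array underlying the column Index in place, which can leave the Index's internal lookup out of sync with the names you see, so 'video_id' appears in the columns but looking it up still raises a KeyError.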
If you know which column names you're replacing, you can try something like this:
a.rename(columns={'old_col_name':'video_id'}, inplace = True)
b.rename(columns={'old_col_name':'video_id'}, inplace = True)
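As a minimal sketch of the rename-then-merge approach, here is a version with two small made-up frames. The original column names `vid` and `id` and all of the data are assumptions for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical sample data; the starting column names are made up.
a = pd.DataFrame({"user": ["u1", "u2", "u3"], "vid": [10, 20, 30]})
b = pd.DataFrame({"id": [10, 30], "title": ["intro", "outro"]})

# rename() builds a fresh column Index, so lookups by the new name work.
a = a.rename(columns={"vid": "video_id"})
b = b.rename(columns={"id": "video_id"})

# Left merge: every row of a is kept; unmatched rows get NaN for b's columns.
c = pd.merge(a, b, on="video_id", how="left")
print(c)
```

Reassigning the result of rename() instead of passing inplace=True behaves the same here; which one to use is a matter of preference.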
If you don't know the column names ahead of time, you can try:
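col_names_a = a.columns.tolist()  # an Index is immutable, so work on a list copy
col_names_a[index] = 'video_id'   # index is the position of the column to rename
a.columns = col_names_a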
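A position-based rename can be sketched as follows. A pandas Index does not support item assignment, so the sketch copies the names to a list first. The frame contents and the `position` value are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical frame whose third column holds the video id under another name.
a = pd.DataFrame({"user": ["u1"], "ts": [0], "vid": [10]})

position = 2  # assumed position of the column to rename

# Copy the names to a plain list, change one entry, then assign the list back.
new_cols = a.columns.tolist()
new_cols[position] = "video_id"
a.columns = new_cols

print(a.columns.tolist())
```

An equivalent one-liner is a.rename(columns={a.columns[position]: "video_id"}), which avoids the intermediate list as long as the old name is unique.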
Keep in mind that you don't actually need the same column name in both DataFrames. pandas lets you specify the key column of each DataFrame separately:
pd.merge(a, b, left_on = 'a_col', right_on = 'b_col', how = 'left')
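To illustrate merging on differently named keys, here is a small sketch. The names `a_col` and `b_col` come from the line above; the other columns and the data are made up.

```python
import pandas as pd

# Hypothetical sample data with differently named key columns.
a = pd.DataFrame({"user": ["u1", "u2"], "a_col": [10, 20]})
b = pd.DataFrame({"b_col": [10], "title": ["intro"]})

# left_on / right_on name the key column in each frame separately.
c = pd.merge(a, b, left_on="a_col", right_on="b_col", how="left")
print(c)
```

Note that both key columns are kept in the result, so you may want to drop one of them afterwards, for example with c.drop(columns="b_col").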