Merge two data frames based on common column values in Pandas

Question

How can I merge two data frames on a shared column so that the merged data frame contains only the rows whose value in that column appears in both frames?

df1 has 5000 rows in the following format:

    director_name   actor_1_name     actor_2_name      actor_3_name      movie_title
0   James Cameron   CCH Pounder      Joel David Moore  Wes Studi         Avatar
1   Gore Verbinski  Johnny Depp      Orlando Bloom     Jack Davenport    Pirates of the Caribbean: At World's End
2   Sam Mendes      Christoph Waltz  Rory Kinnear      Stephanie Sigman  Spectre

df2 has 10000 rows:

movieId                   genres                        movie_title
    1       Adventure|Animation|Children|Comedy|Fantasy   Toy Story
    2       Adventure|Children|Fantasy                    Jumanji
    3       Comedy|Romance                             Grumpier Old Men
    4       Comedy|Drama|Romance                      Waiting to Exhale

Both frames share a 'movie_title' column with some values in common. I want to keep all rows whose 'movie_title' appears in both frames and drop the other rows.

Any help or suggestions would be appreciated.

Note: I have already tried

pd.merge(dfinal, df1, on='movie_title')

and the output comes out as a single line:

director_name   actor_1_name    actor_2_name    actor_3_name    movie_title movieId title   genres

I also tried how="outer", "left", and "right". In every case no rows were left after dropping NaN, even though many common values do exist in the column.

Recommended answer

Two data frames can be merged in several ways. The most common way in Python is the merge operation in pandas.

import pandas as pd
# Inner join: keep only rows whose movie_title appears in both frames
dfinal = df1.merge(df2, on="movie_title", how="inner")
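
To make the inner merge concrete, here is a minimal sketch on two tiny frames. The rows are hypothetical stand-ins for df1 and df2, trimmed to a few columns; only the column names come from the question.

```python
import pandas as pd

# Hypothetical miniature versions of df1 and df2 (illustrative rows only)
df1 = pd.DataFrame({
    "director_name": ["James Cameron", "Sam Mendes", "John Lasseter"],
    "movie_title": ["Avatar", "Spectre", "Toy Story"],
})
df2 = pd.DataFrame({
    "movieId": [1, 2],
    "genres": [
        "Adventure|Animation|Children|Comedy|Fantasy",
        "Adventure|Children|Fantasy",
    ],
    "movie_title": ["Toy Story", "Jumanji"],
})

# Inner join: only titles present in both frames survive
dfinal = df1.merge(df2, on="movie_title", how="inner")
print(dfinal)
```

Only the "Toy Story" row remains, because it is the one title found in both frames. The key column appears once in the result, followed by the remaining columns of the right-hand frame.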

If the key column is named differently in the two data frames, say 'movie_title' in one and 'movie_name' in the other, specify the left and right column names explicitly.

dfinal = df1.merge(df2, how='inner', left_on='movie_title', right_on='movie_name')
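
A short sketch of this variant, again on hypothetical sample rows. Note that when left_on and right_on name different columns, pandas keeps both key columns in the result, so you may want to drop the redundant one afterwards.

```python
import pandas as pd

# Hypothetical sample frames whose key columns have different names
df1 = pd.DataFrame({
    "movie_title": ["Avatar", "Toy Story"],
    "director_name": ["James Cameron", "John Lasseter"],
})
df2 = pd.DataFrame({
    "movie_name": ["Toy Story", "Jumanji"],
    "movieId": [1, 2],
})

merged = df1.merge(df2, how="inner", left_on="movie_title", right_on="movie_name")

# Both key columns are kept after the merge; drop the duplicate one
merged = merged.drop(columns="movie_name")
print(merged)
```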

For more detail, see the documentation of the pandas merge operation.
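
The question reports that the merge returns no rows even though shared titles seem to exist. One possible cause, which is an assumption and not something the source confirms, is that the key strings differ invisibly, for example through trailing whitespace. The sketch below uses hypothetical data to show how the indicator parameter of merge can reveal which rows failed to match, and how normalizing the key with str.strip() might fix it.

```python
import pandas as pd

# Hypothetical data: the left titles carry trailing spaces (assumed cause)
left = pd.DataFrame({
    "movie_title": ["Avatar ", "Toy Story "],
    "director_name": ["James Cameron", "John Lasseter"],
})
right = pd.DataFrame({
    "movie_title": ["Toy Story", "Jumanji"],
    "movieId": [1, 2],
})

# indicator=True adds a _merge column: 'left_only', 'right_only' or 'both'
check = left.merge(right, on="movie_title", how="outer", indicator=True)
print(check["_merge"].value_counts())

# Normalize the key on both sides, then retry the inner join
for frame in (left, right):
    frame["movie_title"] = frame["movie_title"].str.strip()

fixed = left.merge(right, on="movie_title", how="inner")
print(fixed)
```

If the first merge shows no rows marked 'both', the keys are not equal as strings, and inspecting a few values with repr() can show what differs.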
