Use df.merge to populate a new column in df gives strange matches

Views: 186 | Published: 2020/5/9 0:40:35 | Tags: python pandas dataframe merge


Problem description

I want to create a new column in my dataframe (df) based on another dataframe. Basically, the second dataframe (df1 in the code below) contains updated information that I want to plug into df. To replicate my real case (more than 1 million rows), I will populate two small dataframes with simple columns, the first one randomly.

I use pandas.merge() to do this, but it gives me strange results.

Here is a typical example. Let's create df randomly and create df1 with a simple relationship: "New Type" = "Type" + 1. I chose this simple relationship so that the output is easy to check. In my real application the relationship is of course not this simple.

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,100,size=(100, 1)),columns = ["Type"])
df.head()

   Type
0    45
1     3
2    89
3     6
4    39

df1 = pd.DataFrame({"Type":range(1,100)})
df1["New Type"] = df1["Type"] + 1
print(df1.head())

   Type  New Type
0     1         2
1     2         3
2     3         4
3     4         5
4     5         6

Now let's say I want to add a column to df that holds the "New Type" which df1 gives for each "Type":

df["Type2"] = df.merge(df1,on="Type")["New Type"]
print(df.head())

I get this strange output, where the values clearly do not match:

   Type  Type2
0    45   46.0
1     3    4.0
2    89    4.0
3     6    4.0
4    39   90.0

I think the output should look like this:

   Type  Type2
0    45   46.0
1     3    4.0
2    89   90.0
3     6    7.0
4    39   40.0

Only the first two rows are matched properly. Do you know what I've missed?

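Recommended answer

1. I need to merge with how="left". Otherwise the default, how="inner", produces a table whose number of rows can differ from that of df (here, any row of df with "Type" 0 has no match in df1 and is dropped). Assigning a column of that table back to df aligns on the index, so the values end up on the wrong rows.

2. I also pass sort=False to merge, so that the merged result is not sorted by key before it is assigned back to df. (sort=False is the documented default, so this only makes the intent explicit.)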
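The two points above can be sketched as follows. This is a minimal sketch, not the original poster's exact code: the seeded generator, the name merged, and the Series.map cross-check at the end are assumptions added for illustration. It also assumes the values in df1["Type"] are unique, so that the left merge returns exactly one row per row of df.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (the seed is an arbitrary choice).
rng = np.random.default_rng(0)
df = pd.DataFrame({"Type": rng.integers(0, 100, size=100)})

df1 = pd.DataFrame({"Type": range(1, 100)})
df1["New Type"] = df1["Type"] + 1

# how="left" keeps one output row per row of df, in df's original order.
# This relies on df1["Type"] being unique; duplicate keys would add rows.
merged = df.merge(df1, on="Type", how="left", sort=False)

# Assign by position rather than by index label, so the result does not
# depend on what df's index happens to be.
df["Type2"] = merged["New Type"].to_numpy()

# Cross-check with a lookup-based alternative (not taken from the original answer):
# map each "Type" through a Series indexed by df1["Type"].
lookup = df1.set_index("Type")["New Type"]
mapped = df["Type"].map(lookup)

print(df.head())
```

Rows of df whose "Type" is missing from df1 (here, 0) get NaN in "Type2", which is why the column is shown as float.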
