Python Pandas: how to update a csv file from another csv file

Views: 63 | Published: 2021/4/27 19:57:07 | Tags: python, csv, pandas

Question

We have two CSV files: a.csv and b.csv.

a.csv has three columns: label, item1, item2. b.csv has two columns: item1, item2. If the item1 and item2 values of a row in a.csv also occur in b.csv (that is, a.csv and b.csv share the same item1 and item2 pair), the label of that row in a.csv should be set to 1. How can this be done with pandas?

For example:

a.csv:

label    item1     item2
 0         123       35
 0         342       721
 0         876       243

b.csv:

item1     item2
 12        35
 32        721
 876       243

result.csv:

label    item1     item2
 0         123       35
 0         342       721
 1         876       243

I tried this, but it doesn't work:

import pandas as pd

df1 = pd.read_csv("~/train_dataset.csv", names=['label', 'user_id', 'item_id', 'behavior_type', 'user_geohash', 'item_category', 'time','sales'], parse_dates=True)
df2 = pd.read_csv("~/train_user.csv", names=['user_id', 'item_id', 'behavior_type', 'user_geohash', 'item_category', 'time', 'sales'], parse_dates=True)
df1.loc[(df1['user_id'] == df2['user_id']) & (df1['item_id'] == df2['item_id']), 'label'] = 1

Recommended answer

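You could use loc and a boolean condition to mask your df (here df represents a.csv and df1 represents b.csv) and set the label to 1 where that condition is met. Note that == compares the two frames row by row on their index, so this only finds pairs that sit at the same row position, and it requires both frames to have the same index: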

In [18]:

df.loc[(df['item1'] == df1['item1']) & (df['item2'] == df1['item2']), 'label'] = 1
df
Out[18]:
   label  item1  item2
0      0    123     35
1      0    342    721
2      1    876    243
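
Because the comparison above is aligned by row, it misses a pair that appears in b.csv at a different row position, and pandas raises a ValueError when the two frames do not share the same index. The following is a minimal sketch of an order-independent alternative, assuming the goal is "label is 1 whenever the (item1, item2) pair appears anywhere in b.csv". It uses a left merge with indicator=True. The frames a and b are rebuilt from the sample data in the question, and the names a, b, keys and merged are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data from the question (a stands for a.csv, b for b.csv).
a = pd.DataFrame({"label": [0, 0, 0],
                  "item1": [123, 342, 876],
                  "item2": [35, 721, 243]})
b = pd.DataFrame({"item1": [12, 32, 876],
                  "item2": [35, 721, 243]})

keys = ["item1", "item2"]

# De-duplicate the key pairs in b so the left merge cannot multiply rows of a.
merged = a.merge(b[keys].drop_duplicates(), on=keys, how="left", indicator=True)

# "_merge" is "both" for rows of a whose pair was found in b.
# to_numpy() turns the mask into a plain positional array, so it does not
# depend on a's index labels.
a.loc[(merged["_merge"] == "both").to_numpy(), "label"] = 1
print(a)
```

A left merge keeps the rows of a in their original order, which is why the positional mask lines up with a.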

If you want to set the label for every row (1 where the condition holds, 0 otherwise), you could use np.where:

In [19]:

np.where((df['item1'] == df1['item1']) & (df['item2'] == df1['item2']), 1, 0)
Out[19]:
array([0, 0, 1])
In [20]:

df['label'] = np.where((df['item1'] == df1['item1']) & (df['item2'] == df1['item2']), 1, 0)
df
Out[20]:
   label  item1  item2
0      0    123     35
1      0    342    721
2      1    876    243
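
To go from the two files to result.csv, the same np.where idea can be combined with read_csv and to_csv. The sketch below is only an illustration under stated assumptions: the files are comma-separated with a header row (the question shows them whitespace-aligned), they are written to a temporary directory so the example is self-contained, and membership is tested on the (item1, item2) pair with MultiIndex.isin rather than the row-by-row comparison. The variable names are hypothetical.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Assumption: comma-separated files with a header row, created here in a
# temporary directory so the example can run on its own.
workdir = tempfile.mkdtemp()
a_path = os.path.join(workdir, "a.csv")
b_path = os.path.join(workdir, "b.csv")
result_path = os.path.join(workdir, "result.csv")

with open(a_path, "w") as f:
    f.write("label,item1,item2\n0,123,35\n0,342,721\n0,876,243\n")
with open(b_path, "w") as f:
    f.write("item1,item2\n12,35\n32,721\n876,243\n")

a = pd.read_csv(a_path)
b = pd.read_csv(b_path)

keys = ["item1", "item2"]

# Boolean array: True for each row of a whose (item1, item2) pair
# occurs anywhere in b, regardless of row position.
in_b = pd.MultiIndex.from_frame(a[keys]).isin(pd.MultiIndex.from_frame(b[keys]))

# Like the np.where call above, this overwrites every label: 1 on a match, else 0.
a["label"] = np.where(in_b, 1, 0)

a.to_csv(result_path, index=False)
print(pd.read_csv(result_path))
```

If existing labels of 1 in a.csv should be preserved, the loc form (assigning only where the mask is True) is the safer choice than np.where.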
