Merge two CSV files according to matching rows and add new columns in Linux


Problem description

I am developing an application in Java, and for it I need a CSV file in a particular layout. I don't know Linux well, but I am wondering whether there is some way to merge CSV files into the required format.

I have two CSV files containing hundreds of thousands of records. A sample of each is below:

name,Direction,Date
abc,sent,Jan 21 2014 02:06 
xyz,sent,Nov 21 2014 01:09
pqr,sent,Oct 21 2014 03:06  

name,Direction,Date
abc,received,Jan 22 2014 02:06
xyz,received,Nov 22 2014 02:06

So the second CSV file contains only some of the records from file 1. What I need is a new CSV like this:

name,Direction,Date,currentDirection,receivedDate
abc,sent,Jan 21 2014 02:06,received,Jan 22 2014 02:06
xyz,sent,Nov 21 2014 01:09,received,Nov 22 2014 02:06
pqr,sent,Oct 21 2014 03:06

The 4th and 5th columns need to be added according to the matching data in column 1. If the second file has no matching data, those columns should be empty, as above.

So is there a bash command in Linux to achieve this?

Recommended answer

awk may work for you:

kent$  awk -F, -v OFS="," \
       'BEGIN{print "name,Direction,Date,currentDirection,receivedDate"}
        NR==FNR&&NR>1{a[$1]=$0;next}
        FNR>1{printf "%s%s\n",$0,($1 in a?FS a[$1]:"")}' 2.csv 1.csv
name,Direction,Date,currentDirection,receivedDate
abc,sent,Jan 21 2014 02:06,abc,received,Jan 22 2014 02:06
xyz,sent,Nov 21 2014 01:09,xyz,received,Nov 22 2014 02:06
pqr,sent,Oct 21 2014 03:06
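
The NR==FNR test is what tells the two inputs apart. NR counts records across all files, while FNR restarts at 1 for each file, so the two are equal only while awk is reading the first file named on the command line (2.csv here). A minimal sketch that makes this visible; the function name show_counters is made up for illustration and is not part of the answer:

```shell
# show_counters is a hypothetical helper, used only to illustrate NR and FNR.
# For every input line it prints the line, NR, FNR, and whether the line
# came from the first file on the command line or from a later one.
show_counters() {
    awk '{ print $0, NR, FNR, (NR == FNR ? "first" : "later") }' "$@"
}
```

Called with two files of two lines each, the last column reads first, first, later, later, which is why the answer can fill the array from 2.csv and then look names up while reading 1.csv.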

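Update: store only the second and third fields of 2.csv, so that the name is not repeated in the output: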

kent$  awk -F, -v OFS="," 'BEGIN{print "name,Direction,Date,currentDirection,receivedDate"}
        NR==FNR&&NR>1{a[$1]=$2 FS $3;next}
        FNR>1{printf "%s%s\n",$0,($1 in a?FS a[$1]:"")}' 2.csv 1.csv 
name,Direction,Date,currentDirection,receivedDate
abc,sent,Jan 21 2014 02:06,received,Jan 22 2014 02:06
xyz,sent,Nov 21 2014 01:09,received,Nov 22 2014 02:06
pqr,sent,Oct 21 2014 03:06
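
For repeated use, the updated command could be wrapped in a small shell function. This is only a sketch under a few assumptions: the name merge_csv is hypothetical, no field contains an embedded comma or quote, and the second file has at least its header line (with a completely empty second file, NR==FNR would also be true for the first file). Some sample rows in the question end in spaces, so the sketch also strips trailing blanks and carriage returns before joining, which the original command does not do:

```shell
# merge_csv is a hypothetical helper name, not part of the original answer.
# Usage: merge_csv FIRST.csv SECOND.csv
merge_csv() {
    awk -F, '
        BEGIN { print "name,Direction,Date,currentDirection,receivedDate" }
        # Added step: drop trailing blanks and carriage returns from each record.
        { sub(/[ \t\r]+$/, "") }
        # While reading SECOND.csv (listed first): store fields 2 and 3 by name,
        # skipping its header line.
        NR == FNR { if (FNR > 1) extra[$1] = $2 FS $3; next }
        # While reading FIRST.csv: append the stored fields when the name matches.
        FNR > 1 { printf "%s%s\n", $0, (($1 in extra) ? FS extra[$1] : "") }
    ' "$2" "$1"
}
```

It would then be called as merge_csv 1.csv 2.csv > merged.csv. Only the second file is held in memory, so the first file can be streamed regardless of its size.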
