BASH: Joining 2 CSV files based on common field name


Problem description

I have 2 CSV files and I need to JOIN them using BASH:

file_1.csv columns: 

track_id    
title
song_id 
release 
artist_id   
artist_mbid 
artist_name 
duration    
artist_familiarity  
artist_hotttnesss
year

Sample data in file_1.csv

TRZZZZZ12903D05E3A,Infra Stellar,SOZPUEF12AF72A9F2A,Archives Vol. 2,ARBG8621187FB54842,4279aba0-1bde-40a9-8fb2-c63d165dc554,Delerium,495.22893,0.69652442519,0.498471038842,2001

file_2.csv columns: 

track_id    
sales_date  
sales_count

Sample data in file_2.csv

TRZZZZZ12903D05E3A,2014-06-19,79

The relation between the files is that file_1.track_id = file_2.track_id.

I want to create a 3rd file file_3.csv that will have the following columns:

file_2.track_id,file_2.sales_date,file_2.sales_count,file_1.title,file_1.song_id,file_1.release,file_1.artist_id,file_1.artist_mbid,file_1.artist_name,file_1.duration,file_1.artist_familiarity,file_1.artist_hotttnesss,file_1.year

I have tried the following methods:

join -t',' -1 N -1 N file_2.csv file_1.csv >> file_3.csv

and

awk -F, 'NR==FNR{a[$0]=$0;next} ($1 in a){print a[$1]"," > "file_3.csv"}' file_1.csv file_2.csv

Although the file_3.csv gets created, it is an empty file. Any ideas on how to do this?

Thanks!

Solution

The following join command should do the trick:

join --header -t',' -j 1 file_2.csv file_1.csv

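Just make sure that both CSV files are sorted on the join field, because join expects sorted input; having track_id as the first field in each file makes this easy. Note that --header is a GNU coreutils option that treats the first line of each file as a header row, so leave it out if your files contain only data rows.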
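
The sorting requirement can be sketched as follows. This is a minimal example under a few assumptions: the files have no header row (so --header is omitted), no field contains a quoted comma, and the second row in each file is made-up illustration data rather than anything from the question. The temporary directory and the *.sorted.csv names are also just placeholders.

```shell
#!/usr/bin/env bash
# Sketch: sort both inputs on track_id (field 1), then join them.
set -euo pipefail
export LC_ALL=C   # make sort and join agree on the collation order

workdir=$(mktemp -d)   # scratch directory, only for this demonstration
cd "$workdir"

# First row is the sample from the question; second row is hypothetical.
cat > file_1.csv <<'EOF'
TRZZZZZ12903D05E3A,Infra Stellar,SOZPUEF12AF72A9F2A,Archives Vol. 2,ARBG8621187FB54842,4279aba0-1bde-40a9-8fb2-c63d165dc554,Delerium,495.22893,0.69652442519,0.498471038842,2001
TRAAAAA00000000000,Example Title,SOAAAAA00000000000,Example Release,ARAAAAA00000000000,00000000-0000-0000-0000-000000000000,Example Artist,123.45,0.5,0.4,1999
EOF

cat > file_2.csv <<'EOF'
TRZZZZZ12903D05E3A,2014-06-19,79
TRAAAAA00000000000,2014-06-20,5
EOF

# Sort each file on the first comma-separated field only.
sort -t, -k1,1 file_1.csv > file_1.sorted.csv
sort -t, -k1,1 file_2.csv > file_2.sorted.csv

# file_2 comes first so its columns precede file_1's columns in the output.
join -t, -j 1 file_2.sorted.csv file_1.sorted.csv > file_3.csv

cat file_3.csv
```

Each output line starts with track_id, followed by the remaining file_2 fields and then the remaining file_1 fields, which matches the column order requested for file_3.csv.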

Try the command on test data in both files first; once you're satisfied that it does what you want, run it against the actual data and redirect its output to file_3.csv.
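
If sorting is inconvenient, an awk lookup table is a possible alternative to join (this is a different technique, not the one the answer proposes). The attempt in the question stored each file_1.csv line under the whole line (a[$0]) and then looked it up by $1, so the lookup never matched and nothing was printed. The sketch below keys the table on track_id instead. It assumes no quoted commas inside fields and that file_1.csv fits in memory; the array name rest and the scratch directory are illustrative choices.

```shell
#!/usr/bin/env bash
# Sketch: hash join in awk, no sorting required.
set -euo pipefail

workdir=$(mktemp -d)   # scratch directory, only for this demonstration
cd "$workdir"

# Sample rows taken from the question.
cat > file_1.csv <<'EOF'
TRZZZZZ12903D05E3A,Infra Stellar,SOZPUEF12AF72A9F2A,Archives Vol. 2,ARBG8621187FB54842,4279aba0-1bde-40a9-8fb2-c63d165dc554,Delerium,495.22893,0.69652442519,0.498471038842,2001
EOF

cat > file_2.csv <<'EOF'
TRZZZZZ12903D05E3A,2014-06-19,79
EOF

awk -F, -v OFS=, '
  NR == FNR {                # still reading the first file (file_1.csv)
    key = $1                 # remember track_id before editing the line
    sub(/^[^,]*,/, "")       # strip the leading track_id field from $0
    rest[key] = $0           # keep the remaining file_1 fields under track_id
    next
  }
  $1 in rest {               # second file (file_2.csv): only matching ids
    print $0, rest[$1]       # file_2 fields first, then file_1 fields
  }
' file_1.csv file_2.csv > file_3.csv

cat file_3.csv
```

Rows come out in file_2.csv order, and sales rows whose track_id is missing from file_1.csv are skipped, which mirrors the default inner-join behaviour of join.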
