BASH: Joining 2 CSV files based on common field name
Problem description
I have 2 CSV files and I need to join them using Bash:
file_1.csv columns:
track_id
title
song_id
release
artist_id
artist_mbid
artist_name
duration
artist_familiarity
artist_hotttnesss
year
Sample data in file_1.csv
TRZZZZZ12903D05E3A,Infra Stellar,SOZPUEF12AF72A9F2A,Archives Vol. 2,ARBG8621187FB54842,4279aba0-1bde-40a9-8fb2-c63d165dc554,Delerium,495.22893,0.69652442519,0.498471038842,2001
file_2.csv columns:
track_id
sales_date
sales_count
Sample data in file_2.csv
TRZZZZZ12903D05E3A,2014-06-19,79
The relation between the files is that file_1.track_id = file_2.track_id.
I want to create a 3rd file, file_3.csv, that will have the following columns:
file_2.track_id,file_2.sales_date,file_2.sales_count,file_1.title,file_1.song_id,file_1.release,file_1.artist_id,file_1.artist_mbid,file_1.artist_name,file_1.duration,file_1.artist_familiarity,file_1.artist_hotttnesss,file_1.year
I have tried the following methods:
join -t',' -1 N -1 N file_2.csv file_1.csv >> file_3.csv
and
awk -F, 'NR==FNR{a[$0]=$0;next} ($1 in a){print a[$1]"," > "file_3.csv"}' file_1.csv file_2.csv
Although file_3.csv gets created, it is an empty file.
Any ideas on how to do this?
Thanks!
The following join command should do the trick:
join --header -t',' -j 1 file_2.csv file_1.csv
Just make sure that both CSV files are sorted on the join field; having track_id as the first field in each file makes this easy.
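If the files are not already sorted, something like the following sketch may help. It assumes GNU coreutils (the --header option is a GNU extension) and that each file begins with a header row naming its columns, which the question implies but does not show. The scratch directory and the sort_on_first_field helper are hypothetical names used only for this illustration; the data rows are the samples from the question.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch directory so the demo never touches real data (hypothetical path).
demo_dir=$(mktemp -d)

# Assumption: each file starts with a header row naming its columns.
cat > "$demo_dir/file_1.csv" <<'EOF'
track_id,title,song_id,release,artist_id,artist_mbid,artist_name,duration,artist_familiarity,artist_hotttnesss,year
TRZZZZZ12903D05E3A,Infra Stellar,SOZPUEF12AF72A9F2A,Archives Vol. 2,ARBG8621187FB54842,4279aba0-1bde-40a9-8fb2-c63d165dc554,Delerium,495.22893,0.69652442519,0.498471038842,2001
EOF

cat > "$demo_dir/file_2.csv" <<'EOF'
track_id,sales_date,sales_count
TRZZZZZ12903D05E3A,2014-06-19,79
EOF

# Hypothetical helper: print the header unchanged, then the data rows
# sorted on the first comma-separated field.
sort_on_first_field() {
    head -n 1 "$1"
    tail -n +2 "$1" | LC_ALL=C sort -t',' -k1,1
}

sort_on_first_field "$demo_dir/file_1.csv" > "$demo_dir/file_1.sorted.csv"
sort_on_first_field "$demo_dir/file_2.csv" > "$demo_dir/file_2.sorted.csv"

# Use the same locale for sort and join so both agree on the ordering.
# file_2 is listed first so that its columns come first in the output.
LC_ALL=C join --header -t',' -j 1 \
    "$demo_dir/file_2.sorted.csv" "$demo_dir/file_1.sorted.csv" \
    > "$demo_dir/file_3.csv"

cat "$demo_dir/file_3.csv"
```

Note that join splits on every comma, so this approach only suits files whose fields contain no quoted, embedded commas.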
Try the command on test data in both files first; once you're satisfied that it does what you want, run it against the actual data and redirect its output to file_3.csv.
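As an alternative that does not need sorted input, the awk approach from the question can also be made to work. The attempt there stores each line of file_1.csv under the whole line ($0) but looks it up by $1, so no lookup ever matches and nothing is printed. The sketch below keys the array on track_id instead. It is not part of the answer above; it assumes header rows in both files and no quoted fields with embedded commas, and the scratch directory and the array name rest are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch directory with the sample rows from the question (hypothetical path).
demo_dir=$(mktemp -d)

cat > "$demo_dir/file_1.csv" <<'EOF'
track_id,title,song_id,release,artist_id,artist_mbid,artist_name,duration,artist_familiarity,artist_hotttnesss,year
TRZZZZZ12903D05E3A,Infra Stellar,SOZPUEF12AF72A9F2A,Archives Vol. 2,ARBG8621187FB54842,4279aba0-1bde-40a9-8fb2-c63d165dc554,Delerium,495.22893,0.69652442519,0.498471038842,2001
EOF

cat > "$demo_dir/file_2.csv" <<'EOF'
track_id,sales_date,sales_count
TRZZZZZ12903D05E3A,2014-06-19,79
EOF

awk -F, -v OFS=, '
    # First file (file_1.csv): remember everything after track_id, keyed by track_id.
    NR == FNR {
        key = $1
        sub(/^[^,]*,/, "")    # drop the leading track_id field from the line
        rest[key] = $0
        next
    }
    # Second file (file_2.csv): print its row followed by the matching file_1 columns.
    $1 in rest { print $0, rest[$1] }
' "$demo_dir/file_1.csv" "$demo_dir/file_2.csv" > "$demo_dir/file_3.csv"

cat "$demo_dir/file_3.csv"
```

Because the header rows share the key track_id, they are joined like any other row, so file_3.csv starts with the combined header. The whole of file_1.csv is held in memory, which may matter for very large files.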