How to delete rows from a csv file based on a list of values from another file?
Question
I have two files:
candidates.csv:
id,value
1,123
4,1
2,5
50,5
blacklist.csv:
1
2
5
3
10
I'd like to remove all rows from candidates.csv in which the first column (id) has a value contained in blacklist.csv. id is always numeric. In this case I'd like my output to look like this:
id,value
4,1
50,5
So far, my script for identifying the duplicate lines looks like this:
cat candidates.csv | cut -d , -f 1 | grep -f blacklist.csv -w
This gives me the output:
1
2
Now I somehow need to pipe this information back into sed/awk/gawk/... to delete the duplicates, but I don't know how. Any ideas how I can continue from here? Or is there a better solution altogether? My only restriction is that it has to run in bash.
Recommended answer
How about the following:
awk -F, '(NR==FNR){a[$1];next}!($1 in a)' blacklist.csv candidates.csv
How does this work?
An awk program is a series of pattern-action pairs, written as:
condition { action }
condition { action }
...
where condition is typically an expression and action is a series of commands. Here, the two condition-action pairs read:
(NR==FNR){a[$1];next}: if the total record count NR equals the record count of the current file FNR (i.e. if we are reading the first file), store the first field as a key in array a and skip to the next record (do not do anything else).
!($1 in a): if the first field is not a key in array a, perform the default action, which is to print the line. This pair is only ever reached for the second file, because there the condition of the first pair no longer holds.
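To check the explanation against the sample data from the question, here is a minimal sketch that recreates both files in a scratch directory and runs the same two-pass awk program. The names workdir, result, and seen are illustrative choices rather than part of the original answer (the answer calls the array a).

```shell
#!/usr/bin/env bash
# Work in a scratch directory so the sample files do not overwrite anything.
# (workdir is a hypothetical name used only for this demonstration.)
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two sample files from the question.
printf '%s\n' 'id,value' '1,123' '4,1' '2,5' '50,5' > candidates.csv
printf '%s\n' 1 2 5 3 10 > blacklist.csv

# While reading blacklist.csv (NR==FNR), remember each id as an array key.
# While reading candidates.csv, print only rows whose first field is not a key.
result=$(awk -F, 'NR==FNR { seen[$1]; next } !($1 in seen)' blacklist.csv candidates.csv)

printf '%s\n' "$result"
```

With the sample data this should print the header line followed by 4,1 and 50,5. The header survives because the string id is not in the blacklist, and 50 survives because the lookup is an exact key match, not a substring match. One caveat: if blacklist.csv is empty, NR==FNR also holds while reading candidates.csv, so nothing is printed.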