了解AWK和CSV文件 [英] Understanding AWK and CSV files
问题描述
如何编写AWK程序来分析CSV文件中的字段列表,计算指定字段中每个不同字符串的数量,并打印出找到的每个字符串的数量?我只用C和Java编写过代码,所以我对AWK的语法完全感到困惑.我了解最简单的概念,但是AWK的结构却大不相同.随时感谢,谢谢!
How can I write an AWK program that analyses a list of fields in CSV files, count the number of each different string in the specified field, and print out the count of each string that is found? I have only coded in C and Java, so I am completely confused on the syntax of AWK. I understand the simplest of concepts, however, AWK is structured much differently. Any time is appreciated, thank you!
BEGIN {
FS = ""
}
{
for(i = 1; i <= NF; i++)
freq[$i]++
PROCINFO ["sorted_in"] = "@val_num_desc" #this got the desired result
}
END {
for {this in freq)
printf "%s\t%d\n", this, freq[this]
}
在包含以下内容的CSV文件中
On a CSV file containing:
Field1, Field2, Field3, Field4
A, B, C, D
A, E, F, G
Z, E, C, D
Z, W, C, Q
我能够获得结果:
A 2
B 1
C 3
Q 1
D 1
E 2
F 1
, 12
G 1
W 1
Field1,Field2,Field3,Field4 1
Z 2
这是理想的结果:
A 10
C 7
D 2
E 2
Z 2
B 1
Q 1
Field1 1
Field2 1
F 1
Field3 1
G 1
Field4 1
W 1
我的代码有一个修改,并带有注释.
There is an edit to my code which is commented.
推荐答案
修复了您的代码:
$ awk '
BEGIN { # you need BEGIN block for FS
FS = ", *" # your data had ", " and "," seps
} # ... based on your sample output
{
for(i = 1; i <= NF; i++)
freq[$i]++
}
END {
for(this in freq) # fixed a parenthesis
printf "%s\t%d\n", this, freq[this]
}' file
输出(使用GNU awk.其他awk以不同的顺序显示输出):
Output (using GNU awk. Other awks displayed output in different order):
A 2
B 1
C 3
Q 1
D 2
Field1 1
E 2
Field2 1
F 1
Field3 1
G 1
Field4 1
W 1
Z 2
这篇关于了解AWK和CSV文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!