How to Add a Column with Percentage

Question

I would like to calculate each line's value as a percentage of the total across all lines and add it as another column. Input (delimiter is \t):

1   10      
2   10
3   20
4   40

Desired output, with an added third column showing the percentage calculated from the values in the second column:

1   10   12.50   
2   10   12.50
3   20   25.00
4   40   50.00

I have tried to do it myself, but once I had calculated the total for all lines I didn't know how to keep the rest of each line unchanged. Thanks a lot for the help!

Recommended answer

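Here you go, an awk one-liner that reads the file twice (note that file is given as an argument two times), once to compute the total and once to print the percentages: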

awk 'NR==FNR{a = a + $2;next} {c = ($2/a)*100;print $1,$2,c }' file file

[jaypal:~/Temp] cat file
1   10      
2   10
3   20
4   40
[jaypal:~/Temp] awk 'NR==FNR{a = a + $2;next} {c = ($2/a)*100;print $1,$2,c }' file file
1 10 12.5
2 10 12.5
3 20 25
4 40 50

Update: If tabs are required in the output, just set the OFS variable to "\t".

[jaypal:~/Temp] awk -v OFS="\t" 'NR==FNR{a = a + $2;next} {c = ($2/a)*100;print $1,$2,c }' file file
1   10  12.5
2   10  12.5
3   20  25
4   40  50
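
The outputs above show 12.5 and 25 rather than the 12.50 and 25.00 asked for in the question, because print uses awk's default number formatting. A minimal sketch of one way to get two decimal places, assuming printf with "%.2f" is acceptable; the names input, total and result are illustrative and not part of the original answer:

```shell
# Hypothetical stand-in for "file": a temporary copy of the sample input.
input=$(mktemp)
printf '1\t10\n2\t10\n3\t20\n4\t40\n' > "$input"

# Pass 1 (NR==FNR) sums column 2; pass 2 prints both columns plus the
# percentage, tab-separated and rounded to two decimal places.
result=$(awk 'NR==FNR { total += $2; next }
              { printf "%s\t%s\t%.2f\n", $1, $2, ($2 / total) * 100 }' "$input" "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Since printf writes the tabs itself, OFS does not need to be set in this variant.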

Breakdown of the pattern {action} statements:

  • The first pattern is NR==FNR. FNR is awk's built-in variable that counts the records (by default separated by a newline) read so far from the current file, so in our case it reaches 4 at the end of each read of the file. NR is similar to FNR, but it is not reset when awk moves on to the next file; it keeps growing, so in our case it ends at 8.
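
To see how the two counters behave, a small sketch that prints NR and FNR for every record while the same file is read twice. The temporary file and the variable names input and counters are only stand-ins for illustration:

```shell
# Hypothetical stand-in for "file": a temporary copy of the sample input.
input=$(mktemp)
printf '1\t10\n2\t10\n3\t20\n4\t40\n' > "$input"

# Print the overall record number (NR) next to the per-file one (FNR).
counters=$(awk '{ print NR, FNR }' "$input" "$input")

printf '%s\n' "$counters"
rm -f "$input"
```

NR runs from 1 through 8, while FNR runs from 1 through 4 twice, so NR==FNR holds only for the first four records.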

This pattern will be true only for the first 4 records, i.e. the first read of the file, and that's exactly what we want. While going through those 4 records, we add up column 2 in the variable a. Notice that we did not initialize it. In awk we don't have to. However, this would break with a division by zero if the entire column 2 is 0. You can handle that by putting an if statement in the second action statement, i.e. do the division only if a > 0, else print a "division by 0" message or something similar.
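
A minimal sketch of such a guard, assuming that printing 0 as the percentage is an acceptable fallback when the total is 0 (the answer leaves the exact fallback open). The all-zero sample data and the names input, total, pct and guarded are illustrative:

```shell
# Hypothetical input in which every value in column 2 is 0.
input=$(mktemp)
printf '1\t0\n2\t0\n' > "$input"

# Divide only when the total is positive; otherwise report 0 (an assumption).
guarded=$(awk 'NR==FNR { total += $2; next }
               {
                 if (total > 0) { pct = ($2 / total) * 100 } else { pct = 0 }
                 print $1, $2, pct
               }' "$input" "$input")

printf '%s\n' "$guarded"
rm -f "$input"
```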

next is needed because we don't want the second pattern {action} statement to execute during the first read. next tells awk to skip any further actions for the current record and move on to the next record.

Once the first four records are parsed, the second pattern {action} takes over, which is pretty straightforward: it calculates the percentage and prints columns 1 and 2 with the percentage next to them.

Note: As @lhf mentioned in the comments, this one-liner only works if the data is in a file. It will not work if you pass the data through a pipe.

In the comments, there is a discussion about ways to make this awk one-liner take input from a pipe instead of a file. Well, the only way I could think of was to store the column values in an array and then use a for loop to print each value along with its percentage.

Now, arrays in awk are associative, and a for (i in array) loop visits the keys in no guaranteed order, i.e. the values may not come out in the same order they went in. If that is OK, then the following one-liner should work.

[jaypal:~/Temp] cat file
1   10      
2   10
3   20
4   40

[jaypal:~/Temp] cat file | awk '{b[$1]=$2;sum=sum+$2} END{for (i in b) print i,b[i],(b[i]/sum)*100}'
2 10 12.5
3 20 25
4 40 50
1 10 12.5

To get them in order, you can pipe the result to sort.

[jaypal:~/Temp] cat file | awk '{b[$1]=$2;sum=sum+$2} END{for (i in b) print i,b[i],(b[i]/sum)*100}' | sort -n
1 10 12.5
2 10 12.5
3 20 25
4 40 50
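
If the original input order matters and sorting would not restore it (for example when column 1 is not numeric, or contains duplicates that b[$1] would overwrite), one possible variant is to index the arrays by a running record counter and loop over that counter numerically in END. This is only a sketch; the names key, val, n, sum and ordered are illustrative and not part of the original answer, and it assumes column 2 does not sum to 0:

```shell
# Read from a pipe, remember each record under a running counter n,
# then replay the records in input order with their percentages.
ordered=$(printf '1\t10\n2\t10\n3\t20\n4\t40\n' |
  awk '{ key[++n] = $1; val[n] = $2; sum += $2 }
       END { for (i = 1; i <= n; i++) print key[i], val[i], (val[i] / sum) * 100 }')

printf '%s\n' "$ordered"
```

Because the loop counts from 1 to n instead of using for (i in ...), the output order does not depend on how awk stores the array.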
