Which is the fastest way to print in awk?


Question

I am trying to take some measurements, and I would like to know the fastest way to print something through nawk. At the moment I use printf ARR[2] " ";, but it seems to take longer than it should.

Info: I am printing around 500 numbers and appending a space in the printf so that the output does not run together. I am running the script under ksh on Oracle Solaris (Unix).

As written, it takes around 14 seconds to print everything. Is there a faster way to do this?

Thanks in advance!

UPDATE

The function I care about is awkfun, which I call under time to take my measurements. Think of NUMBERS as a variable that holds 1000 random numbers, and XNUMBERS as a variable that holds 1000 random numbers in the format 123|321: each random number, followed by a |, followed by the same number reversed. For each entry in NUMBERS I check whether it exists in XNUMBERS, and if it does, I print only the reversed number.

numfun() {
    NUMBERS=`nawk ' BEGIN{ 
        srand();
        for (i=0; i<=999; i++) {
            printf("%s\n", 100 + int(rand() * (899)));
        }   
    }'`
}
numfun
sleep 1
xnumfun() {
    XNUMBERS=`nawk ' BEGIN{ 
        srand();
        for (i=0; i<=999; i++) {
            XNUMBERS[i]= 100 + int(rand() * (899));
        }
        for (i=0; i<=999; i++) {
            ver=XNUMBERS[i] "";
            rev = "";
            for (q=length(ver); q!=0; q--) {
                rev = rev substr(ver, q, 1);
            }
            printf("%s\n", XNUMBERS[i] "|" rev );
        }
    }'`
}
xnumfun
awkfun() {
    for n in $NUMBERS
    do
        echo "${XNUMBERS}" | nawk -v VAR=$n '
        {
            split($1,ARR,"|")
            if (VAR == ARR[1]){
                printf ARR[2] " ";
                exit;
            }
        }' 
    done

}
shellfun() {
    for n in $NUMBERS
    do
        for x in $XNUMBERS
        do
            if test "$n" -eq "${x%%\|*}"
                then
                echo "${x##*\|}";
                break;
            fi
            continue;
        done
    done
}
sleep 1
time awkfun;
echo "\nAWK TIME\n\n-----------------------------";
time shellfun;
echo "\nSHELL TIME\n\n-----------------------------";
time numfun;
echo "\nNUMBERS TIME\n\n-----------------------------";
time xnumfun;
echo "\nXNUMBERS TIME\n\n-----------------------------\n\nTOTAL TIME\n";

Results

Just as a reference, the benchmark results after refining the script: AWK average real time = 0.84, SHELL average real time = 0.48.

Recommended answer

Printing is not the reason your program is slow. It is slow because you invoke a new copy of nawk for every element of $NUMBERS, and each of those copies rescans $XNUMBERS from the start. This is very wasteful, and you should rethink your program design from the beginning. It appears you are mostly trying to see which numbers from one list exist in a second list. If you want to do this in nawk, read the entire first list once and store its elements in an associative array, then look up each number from the second list in that array.
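The single-pass idea above could be sketched roughly as follows. This is a minimal sketch, not a drop-in replacement: the function name awkfun2, the "---" separator line, and the small sample values are assumptions made for illustration, and awk is used where the original script calls nawk. It loads the XNUMBERS pairs into an associative array first, then looks up each entry of NUMBERS, which keeps the output in NUMBERS order like the original awkfun.

```shell
# Hypothetical sample data; the original script generates 1000 values each.
NUMBERS="123
456
789"
XNUMBERS="456|654
111|111
123|321"

# awkfun2 is an illustrative name, not part of the original post.
awkfun2() {
    # Feed both lists to ONE awk process, separated by a marker line.
    { printf '%s\n' "$XNUMBERS"; echo "---"; printf '%s\n' "$NUMBERS"; } |
    awk -F'|' '
        $0 == "---" { lookup = 1; next }                  # marker: switch to lookup mode
        !lookup     { if (!($1 in rev)) rev[$1] = $2; next }  # keep first pair per number
        $1 in rev   { printf "%s ", rev[$1] }             # print reversed form of each match
        END         { print "" }                          # finish the output line
    '
}

awkfun2
```

With the sample data this prints the reversed forms of 123 and 456, and nothing for 789. The format string is passed to printf separately from the data ("%s ", rev[$1]), so a value containing % cannot be misread as a format.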

You could probably solve this problem more cleanly using join or grep.

Edit: here's a working solution using grep. It's at least 20x faster than your original shellfun().

shellfun2() {
    echo $XNUMBERS | tr ' ' '\n' | cut -d '|' -f1 \
        | grep -f <(echo $NUMBERS | tr ' ' '\n') | rev
}

It works by taking the part of each $XNUMBERS entry before the pipe character (so 12|21 34|43 becomes 12\n34), then piping those to grep, with the -f argument supplying all of $NUMBERS as patterns. In other words, we search the left-hand sides of $XNUMBERS for anything in $NUMBERS, and after printing the matches we simply use rev to reverse them. We don't need the right-hand sides of $XNUMBERS at all, so you may be able to stop generating them in the first place and save more time.
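To make the stages concrete, here is a small walk-through on sample data. It is a sketch under a few assumptions: the sample values and the temporary file name are made up, a temporary pattern file stands in for the <(...) process substitution for shells that lack it, and the -x and -F options (whole-line, fixed-string matching) are an addition not present in shellfun2. The final rev (or sed) stage is left out so the intermediate results stay visible.

```shell
# Hypothetical sample data in the same format as the script's variables.
XNUMBERS="12|21 34|43 56|65"
NUMBERS="34 12 99"

# Write one pattern per line to a temporary file for grep -f.
patterns="${TMPDIR:-/tmp}/numbers.$$"
printf '%s\n' $NUMBERS > "$patterns"

# Stage 1: keep only the left-hand side of each pair (12, 34, 56).
lhs=$(printf '%s\n' $XNUMBERS | cut -d '|' -f1)

# Stage 2: keep the left-hand sides that also appear in NUMBERS.
# -x matches whole lines only, -F treats patterns as plain strings.
matches=$(printf '%s\n' "$lhs" | grep -x -F -f "$patterns")

printf '%s\n' "$matches"
rm -f "$patterns"
```

Here the matches come out in $XNUMBERS order (12, then 34), and 99 produces nothing because it has no pair. Reversing each surviving line afterwards gives the values the original functions print.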

Edit: since you've now told us you are running on Solaris rather than Linux, you don't have rev. You can replace rev in the above with this:

sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'

You can also replace grep with /usr/xpg4/bin/grep to get an enhanced version that supports -f.
