Comparison of cat pipe awk operation to awk command on a file


Problem description

    While trying to optimize some of our server-related data processing, my team and I had a discussion about the usage of Linux commands. We would like members to help us understand the concept more precisely.

    On our servers, log files are created every minute, and we need to search the logs for specific tags, for example: error logs, timeout logs, request-failure logs. Among many requirements, one is to report the count of each of these tags.

    The simple approach would be to use awk to extract the specific field (split on a delimiter), then pipe the result to sort and uniq -c to count the occurrences of each value.

    I can see two ways to perform it:

    cat fname | awk -F':' '{print $1}' | sort | uniq -c

    and

    awk -F':' '{print $1}' fname | sort | uniq -c

    The files can be gigabytes in size, so which command would be more efficient?

Solution

    There are 3 ways to open a file and have awk operate on its contents:

    1. cat opens the file:

      cat file | awk '...'
      

    2. shell redirection opens the file:

      awk '...' < file
      

    3. awk opens the file:

      awk '...' file
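
As a minimal sketch of the three forms above: all of them feed the same bytes to the same awk program, so they print the same result. The sample file name and its contents are made up for illustration.

```shell
# Create a small throwaway sample file (hypothetical contents).
tmpdir=$(mktemp -d)
printf 'ERROR:disk full\nTIMEOUT:db\nERROR:net down\n' > "$tmpdir/sample.log"

# 1. cat opens the file and pipes it to awk (an extra process plus a pipe).
via_cat=$(cat "$tmpdir/sample.log" | awk -F':' '{print $1}')

# 2. The shell opens the file and hands it to awk on standard input.
via_redirect=$(awk -F':' '{print $1}' < "$tmpdir/sample.log")

# 3. awk opens the file itself.
via_awk=$(awk -F':' '{print $1}' "$tmpdir/sample.log")

printf '%s\n' "$via_awk"
rm -rf "$tmpdir"
```

The output is identical in each case; the forms differ only in which process opens the file and in how many processes are started.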
      

    Of those choices:

    1. is always to be avoided, as the cat and the pipe use resources while providing no value; search for UUOC (Useless Use Of Cat) for details.

    Which of the other 2 to use is debatable:

    2. has the advantage that the shell is opening the file rather than the tool, so you can rely on consistent error handling if you do this for all tools.
    3. has the advantage that the tool knows the name of the file it is operating on (e.g. FILENAME in awk), so you can use that internally.

    To see the difference, consider these 2 files:

    $ ls -l file1 file2
    -rw-r--r-- 1 Ed None 4 Mar 30 09:55 file1
    --w------- 1 Ed None 0 Mar 30 09:55 file2
    $ cat file1
    a
    b
    $ cat file2
    cat: file2: Permission denied
    

    and see what happens when you try to run awk on the contents of both using both methods of opening them:

    $ awk '{print FILENAME, $0}' < file1
    - a
    - b
    
    $ awk '{print FILENAME, $0}' file1
    file1 a
    file1 b
    
    $ awk '{print FILENAME, $0}' < file2
    -bash: file2: Permission denied
    
    $ awk '{print FILENAME, $0}' file2
    awk: fatal: cannot open file `file2' for reading (Permission denied)
    

    Note that when you use redirection, the error message for the unreadable file, file2, came from the shell, so it looked just like the error message from my first attempt to cat it. When awk opened the file itself, the error message came from awk; it differs from the shell message and would differ across various awks.

    Note that when using awk to open the file, FILENAME was populated with the name of the file being operated on but when using redirection to open the file it was set to -.
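
One way a populated FILENAME can be used internally is to keep counts separate per input file when several files are passed in a single invocation. This is only a sketch; the file names app1.log and app2.log and their contents are hypothetical.

```shell
# Two throwaway log files (hypothetical names and contents).
tmpdir=$(mktemp -d)
printf 'ERROR:a\nTIMEOUT:b\n' > "$tmpdir/app1.log"
printf 'ERROR:c\nERROR:d\n' > "$tmpdir/app2.log"

# Count first-field tags per file. FILENAME holds each file's name here
# only because awk opens the files itself.
# "for (k in cnt)" has no defined order, so sort the result for display.
per_file=$(cd "$tmpdir" && awk -F':' '{cnt[FILENAME " " $1]++} END{for (k in cnt) print cnt[k], k}' app1.log app2.log | LC_ALL=C sort)

printf '%s\n' "$per_file"
rm -rf "$tmpdir"
```

With redirection or a pipe, awk only sees standard input and could not tell the two files apart.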

    I personally think that the benefit of "3" (populated FILENAME) vastly outweighs the benefit of "2" (consistent error handling of file open errors) and so I would always use:

    awk '...' file
    

    and for your particular problem you'd use:

    awk -F':' '{cnt[$1]++} END{for (i in cnt) print cnt[i], i}' fname
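
As a rough check that the single awk command gives the same counts as the sort | uniq -c pipeline from the question, the sketch below runs both on a small sample. The file contents are made up, and the trailing sort and the final awk step exist only to make the two outputs comparable.

```shell
# Throwaway sample standing in for the real log file (hypothetical contents).
tmpdir=$(mktemp -d)
printf 'ERROR:disk full\nTIMEOUT:db\nERROR:net down\nFAIL:req 42\n' > "$tmpdir/fname"

# Single awk process: count tags in an associative array, print at the end.
# "for (i in cnt)" visits keys in an unspecified order, so sort for display.
awk_counts=$(awk -F':' '{cnt[$1]++} END{for (i in cnt) print cnt[i], i}' "$tmpdir/fname" | LC_ALL=C sort -k2)

# Pipeline from the question, with uniq's leading padding normalised.
pipe_counts=$(awk -F':' '{print $1}' "$tmpdir/fname" | LC_ALL=C sort | uniq -c | awk '{print $1, $2}')

printf '%s\n' "$awk_counts"
rm -rf "$tmpdir"
```

The array-based version holds one entry per distinct tag rather than sorting every input line, which is why it tends to suit large files with few distinct tags.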
    
