How to close a file in awk while generating a list of?


Question

I'm trying to find a way to avoid the awk error "too many open files". Here's my situation:

INPUT: an ASCII file with many lines, following this scheme:

NODE_212_lenght.._1
NODE_212_lenght.._2
NODE_213_lenght.._1
NODE_213_lenght.._2

To split this file so that all records with the same NODE number end up in the same output file, I've used this one-liner awk command:

awk -F "_" '{print >("orfs_for_node_" $2 "")}' <file

With a file made up of many lines, this command keeps reporting "too many open files". I also tried splitting the input into 2k-line chunks, with the same result. I can't really go below 2k lines, because the input is a huge file.

I know awk can close a file once it is done writing to it, but I don't know how to do that. I've tried adding a close() call:

awk -F "_" '{print >("orfs_for_node_" $2 ""); close(orfs_for_node_*)}' <file 

but this produces no output.

Recommended answer

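If you switch to GNU awk, it will handle this for you: when it runs out of file descriptors it tries to close and reopen output files behind the scenes. Otherwise, call close() with the name of the file you have finished writing. This is the right syntax if your input file has all the lines for each $2 value grouped together: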

awk -F '_' '{out="orfs_for_node_"$2} out!=prev{close(prev)} {print > out; prev=out}' file
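
As a quick check of the grouped case, the sketch below runs that command on the four sample lines from the question. The scratch directory, the `workdir` variable, and the `input.txt` file name are assumptions made up for this illustration, not part of the original question.

```shell
# Scratch directory so the demo does not touch real data (illustrative name).
workdir=$(mktemp -d)
(
  cd "$workdir" || exit 1

  # The four sample records from the question, already grouped by NODE number.
  printf '%s\n' \
    'NODE_212_lenght.._1' \
    'NODE_212_lenght.._2' \
    'NODE_213_lenght.._1' \
    'NODE_213_lenght.._2' > input.txt

  # Close the previous output file whenever the NODE number changes,
  # so only one output file is open at any time.
  awk -F '_' '{out="orfs_for_node_"$2} out!=prev{close(prev)} {print > out; prev=out}' input.txt

  # List the files that were generated.
  ls orfs_for_node_*
)
```

This should leave one file per node in the scratch directory, each holding the two records for that node. On the very first record `prev` is still empty, so `close(prev)` is asked to close a file that was never opened, which is harmless.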

If the lines for each $2 value are not grouped together, you need to use >> instead of >, because a file reopened with > after close() is truncated:

awk -F '_' '{out="orfs_for_node_"$2} out!=prev{close(prev)} {print >> out; prev=out}' file

Note that in this second case you need to empty or remove any pre-existing output files (e.g. from a previous run) before running the command, since it always appends to the output files.
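
A minimal sketch of this second case, assuming interleaved input and a leftover output file from an earlier run. The scratch directory, the `workdir2` variable, `input.txt`, and the "stale line" content are invented for the demo, and `rm -f` is only one possible way to do the clean-up.

```shell
# Scratch directory for the demo (illustrative name).
workdir2=$(mktemp -d)
(
  cd "$workdir2" || exit 1

  # Records for nodes 212 and 213 are interleaved rather than grouped.
  printf '%s\n' \
    'NODE_212_lenght.._1' \
    'NODE_213_lenght.._1' \
    'NODE_212_lenght.._2' \
    'NODE_213_lenght.._2' > input.txt

  # Simulate an output file left over from a previous run.
  echo 'stale line' > orfs_for_node_212

  # Remove old outputs first, because >> always appends to whatever exists.
  rm -f orfs_for_node_*

  # Each time the NODE number changes, close the previous file;
  # reopening it later with >> appends instead of truncating.
  awk -F '_' '{out="orfs_for_node_"$2} out!=prev{close(prev)} {print >> out; prev=out}' input.txt

  # Show the records collected for node 212.
  cat orfs_for_node_212
)
```

With > in place of >> here, each reopen would truncate the file, so `orfs_for_node_212` would end up holding only its last record. Skipping the `rm -f` step would leave the stale line at the top of that file.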
