Change text in argument for xargs (or GNU Parallel)


Problem description

I have a program that I can run in two ways: single-end or paired-end mode. Here's the syntax:

program <output-directory-name> <input1> [input2]

The output directory and at least one input are required. If I wanted to run this on three files, say, samples A, B, and C, I would use something like find with xargs or parallel:

user@host:~/single$ ls
sampleA.txt  sampleB.txt  sampleC.txt

user@host:~/single$ find . -name "sample*" | xargs -I {} echo program {}-out {}
program ./sampleA.txt-out ./sampleA.txt
program ./sampleB.txt-out ./sampleB.txt
program ./sampleC.txt-out ./sampleC.txt

user@host:~/single$ find . -name "sample*" | parallel --dry-run program {}-out {}
program ./sampleA.txt-out ./sampleA.txt
program ./sampleB.txt-out ./sampleB.txt
program ./sampleC.txt-out ./sampleC.txt

But when I want to run the program in "paired-end" mode, I need to give it two inputs. These are related files, but they can't simply be concatenated - you have to run the program with both as inputs. Files are named sensibly, e.g., sampleA_1.txt and sampleA_2.txt.

I want to be able to create this easily on the command line with something like xargs (or preferably parallel):

user@host:~/paired$ ls
sampleA_1.txt  sampleB_1.txt  sampleC_1.txt
sampleA_2.txt  sampleB_2.txt  sampleC_2.txt

user@host:~/paired$ find . -name "sample*_1.txt" | sed/awk? | parallel ?
program ./sampleA-out ./sampleA_1.txt ./sampleA_2.txt
program ./sampleB-out ./sampleB_1.txt ./sampleB_2.txt
program ./sampleC-out ./sampleC_1.txt ./sampleC_2.txt

Ideally, the command would strip off the _1.txt to create the output directory name (sampleA-out, etc), but I really need to be able to take that argument and change the _1 to a _2 for the second input.

I know this is dead simple with a script - I did this in Perl with a quick regular expression substitution. But I would love to be able to do this with a quick one-liner.

Thanks in advance.

Solution

"I did this in Perl with a quick regular expression substitution. But I would love to be able to do this with a quick one-liner."

Perl has one-liners, too, just as sed and awk do. You can write:

find . -name "sample*_1.txt" | perl -pe 's/_1\.txt$//' | parallel program {}-out {}_1.txt {}_2.txt

(The -e flag means "the next argument is the program text"; the -p flag means "the program should be run in a loop: for each line of input, set $_ to that line, then run the program, then print $_".)
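Since the question also mentions sed and xargs, the same idea can be sketched without Perl or parallel. This is only an illustration under a few assumptions: strip_suffix is a hypothetical helper name, printf stands in for the find command so the snippet is self-contained, and echo plays the role of a dry run. File names containing whitespace or quotes would need extra care with xargs.

```shell
# Strip the trailing "_1.txt" from each path on stdin, leaving the sample stem.
# (strip_suffix is a hypothetical helper name, not part of any tool.)
strip_suffix() {
  sed 's/_1\.txt$//'
}

# printf stands in for: find . -name "sample*_1.txt"
# echo stands in for a dry run; remove it to actually invoke program.
printf '%s\n' ./sampleA_1.txt ./sampleB_1.txt ./sampleC_1.txt |
  strip_suffix |
  xargs -I {} echo program {}-out {}_1.txt {}_2.txt
```

With -I {}, xargs runs the command once per input line and substitutes the stem for every {} in the arguments, so each line yields the output directory name plus both input names.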
