While read line, awk $line and write to variable


Question


I am trying to split a file into smaller files depending on the value of the fifth field. A very nice way to do this was already suggested, and also here (http://stackoverflow.com/questions/16404666/while-read-line-awk-line-with-multiple-delimiters).

However, I am trying to incorporate this into a .sh script for qsub, without much success.

The problem is in the part that specifies the file each line is written to, i.e.,

f = "Alignments_" $5 ".sam"
print > f

There, I need to use a variable declared earlier in the script, which specifies the directory where the file should be written. The variable is built for each task when I send out the array job for multiple files.

So say output_path=./Sample1

I need to write something like

f = $output_path "/Alignments_" $5 ".sam"
print > f

But it does not seem to like having a $variable that is not a $field belonging to awk. I don't even think it likes having two "strings" before and after the $5.

The error I get back is that it takes the first line of the file to be split (little.sam) and tries to use it as the name f, followed by "/Alignments_" $5 ".sam" (those last three put together correctly). It says, naturally, that the name is too long.

How can I write this so it works?

Thanks!

awk -F '[:\t]' '
    # read the list of numbers in Tile_Number_List
    FNR == NR {
        num[$1]
        next
    }

    # process each line of the .BAM file
    # any lines with an "unknown" $5 will be ignored
    $5 in num {
        f = "Alignments_" $5 ".sam"
        print > f
    }
' Tile_Number_List.txt little.sam

Update, after adding -v to awk and declaring the variable opath:

input=$1
outputBase=${input%.bam}

mkdir -v $outputBase\_TEST

newdir=$outputBase\_TEST

samtools view -h $input | awk 'NR >= 18' | awk -F '[\t:]' -v opath="$newdir" '

FNR == NR {
    num[$1]
    next
}

$5 in num {
    f = newdir"/Alignments_"$5".sam";
    print > f
} ' Tile_Number_List.txt -

mkdir: created directory `little_TEST'
awk: cmd. line:10: (FILENAME=- FNR=1) fatal: can't redirect to `/Alignments_1101.sam' (Permission denied)

Solution

To pass the value of a shell variable such as $output_path to awk, you need to use the -v option.

$ output_path=./Sample1/

$ awk -F '[:\t]' -v opath="$output_path" '
    # read the list of numbers in Tile_Number_List
    FNR == NR {
        num[$1]
        next
    }

    # process each line of the .BAM file
    # any lines with an "unknown" $5 will be ignored
    $5 in num {
        f = opath"Alignments_"$5".sam"
        print > f
    } ' Tile_Number_List.txt little.sam
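
For anyone who wants to try the -v approach in isolation, here is a minimal sketch. The temporary directory, the file names tiles.txt and reads.txt, and their contents are made up for illustration; only the awk program follows the answer above.

```shell
#!/bin/sh
# Hypothetical sample data: names and contents are invented for this demo.
workdir=$(mktemp -d)
output_path="$workdir/Sample1"
mkdir -p "$output_path"

# Stand-in for Tile_Number_List.txt: one tile number per line.
printf '1101\n1102\n' > "$workdir/tiles.txt"
# Stand-in for the alignment file: the fifth ":"-separated field is the tile number.
printf 'a:b:c:d:1101:x\na:b:c:d:9999:x\na:b:c:d:1102:x\n' > "$workdir/reads.txt"

awk -F '[:\t]' -v opath="$output_path" '
    FNR == NR { num[$1]; next }               # first file: remember the tile numbers
    $5 in num {
        f = opath "/Alignments_" $5 ".sam"    # opath is an awk variable, so no $ prefix
        print > f
    }
' "$workdir/tiles.txt" "$workdir/reads.txt"

ls "$output_path"
```

The shell expands "$output_path" before awk starts, and inside the program the value is available as the plain awk variable opath. The line with tile 9999 is skipped because it is not in the list.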

Also, you still have the error from your previous question left in your script.

EDIT:

The awk variable created with -v is opath, but you use newdir. What you want is:

input=$1
outputBase=${input%.bam}
mkdir -v $outputBase\_TEST
newdir=$outputBase\_TEST

samtools view -h "$input" | awk -F '[\t:]' -v opath="$newdir" '
FNR == NR {                 # first file: the tile number list
    num[$1]
    next
}
FNR >= 18 && $5 in num {    # stdin: skip its first 17 lines
    f = opath"/Alignments_"$5".sam"   # <-- opath is the awk variable not newdir
    print > f
}' Tile_Number_List.txt -
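
Both error messages in the question can be reproduced with a small sketch. The input line is made up; the point is only how awk treats a name that was never assigned inside the program.

```shell
#!/bin/sh
line='a:b:c:d:1101:x'   # hypothetical record; field 5 is the tile number

# newdir is never set inside awk (only opath is), so it is an empty string
# and the resulting path starts at the filesystem root.
unset_result=$(printf '%s\n' "$line" | awk -F '[:\t]' -v opath="little_TEST" '
    { print newdir "/Alignments_" $5 ".sam" }')

# $newdir means "the field whose number is newdir"; an unset variable counts
# as 0, so this is $0, the whole input line.
field_result=$(printf '%s\n' "$line" | awk -F '[:\t]' '
    { print $newdir "/Alignments_" $5 ".sam" }')

printf '%s\n%s\n' "$unset_result" "$field_result"
```

The first result is a path directly under /, which is why the redirect fails with "Permission denied". The second is the entire line followed by the file name, which is why the first attempt complained about the name being too long.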

You should also move NR >= 18 into the second awk script.
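
One way to fold the line skipping into a single awk process is to test FNR, which restarts at 1 for every input, including stdin given as "-". A rough sketch with invented data follows; the awk variable skip and the two fake header lines stand in for the 17 lines skipped in the question.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf '1101\n1102\n' > "$workdir/tiles.txt"   # hypothetical tile list

# Two stand-in header lines followed by three records, piped in on stdin.
result=$(printf '@HD\n@SQ\na:b:c:d:1101:x\na:b:c:d:9999:x\na:b:c:d:1102:x\n' |
    awk -F '[:\t]' -v skip=2 '
        FNR == NR   { num[$1]; next }   # true only while reading the first file
        FNR <= skip { next }            # FNR counts from 1 again for stdin
        $5 in num   { print $5 }
    ' "$workdir/tiles.txt" -)

printf '%s\n' "$result"
```

NR keeps counting across inputs while FNR does not, so a condition on NR alone would also depend on how long the tile list is.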
