Printing columns from a string in bash

Question

UPDATED QUESTION: OK, so I have a file with lines like this:

44:)   2.884E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  9.990E+02
45:)   2.884E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  9.990E+02
1:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05
2:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05

The numbers in the first column run from 1 to x (45 in this case) and then start over at 1, many times. I want to move some of the columns to separate files. The indexes of the columns I want to move are stored in the variable/array $selected_columns (in this case 2, 5 and 8), and the number of columns I want to move is stored in $number_of_columns (in this case 3).

I then want to create 45 files: one holding the selected columns from every 1:) line, one holding the selected columns from every 2:) line, and so forth. I want to make this as general as possible, since both the number of columns and the value of x will change. The number x is always known, and the columns to extract are chosen by the user.

Original question:

I have a string fetched by egrep, and I want to print some of the columns (words) in that string. The positions (column indexes) are held in a list in my bash script. Currently it looks like this:

line=$(egrep " ${i}:\)" $1)

for ((j=1; j<=$number_of_columns; j++))
do
    awk $line -v current_column=${selected_columns[$j]} '{printf $(current_column)}' > "history_files/history${i}"
done

where number_of_columns is the number of columns to print and selected_columns contains the indexes of those columns. For example, with number_of_columns = 3 and selected_columns = [2 5 8], I want to print words 2, 5 and 8 from the string line to the file history${i}.

I am not sure what is wrong; I got this far by trial and error. The current error is awk: cannot open 0.000E+00 (No such file or directory).

Any help is appreciated!

Recommended answer

awk $line -v ...

$line holds the output of egrep, which is not something awk expects to see on its command line: awk treats the words in it as arguments, which is why it ends up trying to open 0.000E+00 as a file. Also, this:

for ((j=1; j<=$number_of_columns; j++))
do
    anything > "history_files/history${i}"
done

will overwrite the history file on every pass through the loop, because > truncates the file each time. I don't know what you really wanted there.
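
As a minimal sketch of the two points above (not the asker's final script), the string can be handed to awk on standard input, and the redirection can be moved after the loop so the file is opened only once. The file name demo_columns.txt and the sample line are assumptions made for illustration. Since bash arrays are zero-based, the sketch also iterates over the array elements directly rather than indexing from 1.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the values used in the question.
line='1:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05'
selected_columns=(2 5 8)
outfile="demo_columns.txt"   # assumed name, not the question's history file

# Bash arrays are zero-based, so loop over the elements themselves
# instead of indexing from 1 to number_of_columns.
for col in "${selected_columns[@]}"; do
    # Pass the string on stdin; given as bare arguments, awk would
    # treat its words as program text and file names.
    printf '%s\n' "$line" | awk -v c="$col" '{ print $c }'
done > "$outfile"   # redirect once, after the loop, so nothing is overwritten

cat "$outfile"
```

This prints each selected field on its own line. Starting one awk process per column is wasteful, which is part of why doing the whole job inside a single awk program is preferable.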

You have a slew of other issues with your script, though. You said: "As an example number_of_columns = 3 and selected_columns = [2 5 8], so I want to print word number 2, 5 and 8 from the string line to the file history${i}."

That's trivial to do entirely in awk, and you don't need to run grep outside of awk either, so you could do the whole thing as:

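# "[*]" joins the bash array into one space-separated string for awk;
# "[)]" matches a literal ")" without relying on backslash escapes in -v.
awk -v pat=" ${i}:[)]" -v selected_columns="${selected_columns[*]}" '

# Split the column list into an array; split() returns the element count.
BEGIN { number_of_columns = split(selected_columns,selected_columnsA) }

# For every line matching the pattern, print the selected fields tab-separated.
$0 ~ pat {
    sep=""
    for (j=1;j<=number_of_columns;j++) {
        current_column = selected_columnsA[j]
        printf "%s%s",sep,$current_column
        sep = "\t"
    }
    print ""
}
' "$1" > "history_files/history${i}"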

If that doesn't work for you, let's fix THAT instead of trying to fix the original script. It sounds like you have an enclosing loop around the above; chances are that could be part of the awk script as well.

Edit based on the OP's updated question:

I've added lots of comments, but let me know if you have questions:

$ cat file
44:)   2.884E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  9.990E+02
45:)   2.884E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  9.990E+02
1:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05
2:)   3.593E-02  0.000E+00  0.000E+00  2.780E+02  0.000E+00  0.000E+00  1.000E+05
$
$ cat tst.sh
selected_columns=(2 5 8)

selCols="${selected_columns[*]}"

awk -v selCols="$selCols" '

BEGIN { # Executed before the first line of the input file is read

    # Split the string of selected column numbers, selCols, into
    # an array selColsA where selColsA[1] has the value of the
    # first space-separated sub-string of selCols (i.e. the number
    # of the first column to print). Note that we don't need the
    # number of columns passed into the script, since split()
    # returns the count of elements it put into the array.
    numCols = split(selCols,selColsA)
}

{ # Executed once for every line of the input file

    # Create a numeric suffix like "45" from the first column
    # in the current line of the input file, e.g. "45:)" by
    # just getting rid of all non-digit characters.
    sfx = $1
    gsub(/[^[:digit:]]/,"",sfx)

    # Create the name of the output file by attaching that
    # numeric suffix to the base value for all output files.
    #histfile = "history_files/history" sfx
    histfile = "tmp" sfx


    # Loop through every column we want printed. selColsA[<index>]
    # gives us a column number which we can then use to access the
    # columns of the current line. Awk uses the builtin variable $0
    # to hold the current line, and it automatically splits it so
    # that $1 holds the first column, $2 is the second, etc. So
    # if selColsA[1] has the value 3, then $(selColsA[1]) would be
    # the value of the 3rd column of the current input line.
    sep=""
    for (i=1;i<=numCols;i++) {
        curCol = selColsA[i]

        # Print the current column, prefixed by a tab for all but
        # the first column, and without a terminating newline so the
        # next column gets appended to the end of the current output line.
        # Note that in awk "> file" has different semantics from shell
        # and opens the file for writing the first time the line is hit
        # like "> file" in shell, but then appends to it every time it's
        # hit afterwards, like ">> file" in shell.
        printf "%s%s",sep,$curCol > histfile
        sep = "\t"
    }
    # Add a newline to the end of the current output line
    print "" > histfile
}

' "$1"
$
$ ./tst.sh file
$
$ cat tmp1
3.593E-02       2.780E+02       1.000E+05
$ cat tmp2
3.593E-02       2.780E+02       1.000E+05
$ cat tmp44
2.884E-02       2.780E+02       9.990E+02
$ cat tmp45
2.884E-02       2.780E+02       9.990E+02

By the way, I used the words "column" and "line" above for your benefit since you're just learning, but FYI the awk terminology is actually "field" and "record".
