Sorting by date with variable number of columns

Question

I want to sort lines that each end in a date, but I'm having trouble figuring out how to sort the lines and keep each line whole. I also don't understand how to use a pipe to sort the lines.

For example, my script receives this as a text file:

asdsa 24 asdsa 3 3000 054217542 30.3.2016
asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
dsasda 23 dsada 4 3200 537358234 6.3.2016

I want to read it line by line:

while read line; do

done < "$1"

And inside the loop, sort the lines by their dates. How can I sort the lines as they are in the file, while I read them one by one?

If I do something like this:

#!/bin/bash

PATH=${PATH[*]}:.
#filename: testScript


while read line; do
    arr=( $line )
    num_of_params=`echo ${#arr[*]}`
    echo $line | sort -n -k$num_of_params

    num_of_params=0
done < "$1"

My problem with this is that I actually send each line on its own to sort, and not all the lines together, but I don't know any other way to do it (without using temp files; I'm not looking to use any of those).

Output:

asdsa 24 asdsa 3 3000 054217542 30.3.2016
asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
dsasda 23 dsada 4 3200 537358234 6.3.2016

Desired output:

asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
dsasda 23 dsada 4 3200 537358234 6.3.2016
asdsa 24 asdsa 3 3000 054217542 30.3.2016

As you can see, it didn't work.

How can I fix this?

Recommended answer

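Here is a solution that uses a Schwartzian transform (decorate each line with a sortable key, sort, then strip the key again) with awk, sort and cut: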

awk '{split($NF,arr,"."); printf("%d%02d%02d\t%s\n",arr[3],arr[2],arr[1],$0)}' infile |
sort -k 1,1 | cut -f 2-

The awk part first splits the last field of the record, $NF (the date), at the periods into an array arr:

split($NF,arr,".")

The second part prints the line with the reformatted date prepended: first the year, then the month and the day, the latter two with zero padding to two digits:

printf("%d%02d%02d\t%s\n",arr[3],arr[2],arr[1],$0)

The output of this looks as follows:

20160330        asdsa 24 asdsa 3 3000 054217542 30.3.2016
20140102        asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
20160306        dsasda 23 dsada 4 3200 537358234 6.3.2016
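
To see why the zero padding matters, the key can be built for single dates with and without it. This is only a sketch: make_key is a hypothetical helper, not part of the answer, and it takes the printf format as its second argument so the two variants can be compared.

```shell
#!/bin/bash

# Hypothetical helper: build the sort key for one D.M.YYYY date.
# $1 is the date, $2 is the printf format used for year, month and day.
make_key() {
    awk -v d="$1" -v fmt="$2" 'BEGIN {
        split(d, arr, ".")
        printf(fmt "\n", arr[3], arr[2], arr[1])
    }'
}

# Padded keys always have eight digits, so they sort correctly as text
make_key 6.3.2016  '%d%02d%02d'   # 20160306
make_key 30.3.2016 '%d%02d%02d'   # 20160330

# Unpadded keys are ambiguous: both of these dates give 2016111
make_key 1.11.2016 '%d%d%d'
make_key 11.1.2016 '%d%d%d'
```

Without padding, 1 November and 11 January of the same year produce the same key, so no sort option could tell them apart afterwards.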

Now we can just pipe to sort and use the first field:

sort -k 1,1

resulting in

20140102        asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
20160306        dsasda 23 dsada 4 3200 537358234 6.3.2016
20160330        asdsa 24 asdsa 3 3000 054217542 30.3.2016

And finally, we remove our inserted field again with cut, leaving only everything from the second field on:

cut -f 2-

resulting in

asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
dsasda 23 dsada 4 3200 537358234 6.3.2016
asdsa 24 asdsa 3 3000 054217542 30.3.2016

A Bash solution

If instead of awk we want to use just Bash, we can do this:

#!/bin/bash

# Read each line into an array 'line'
while read -r -a line; do

    # Find the number of array elements
    nel=${#line[@]}

    # Assign the last element of the array to 'date'
    date=${line[nel-1]}

    # Extract the month from the date with parameter expansion
    month=${date#*.}
    month=${month%.*}

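    # Year and day need only one expansion step, which is done here directly.
    # The 10# prefix forces base 10, so a zero-padded value such as 08 is not
    # rejected by printf as an invalid octal number.
    printf "%d%02d%02d\t%s\n" "${date##*.}" "$((10#$month))" "$((10#${date%%.*}))" "${line[*]}"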

# Pipe result to sort, then remove the first column with cut
done < infile | sort -k 1,1 | cut -f 2-
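
The parameter expansions used in the loop can be checked on a single date. The following is a small sketch using one of the sample dates; the variable names year and day are introduced here only for illustration.

```shell
#!/bin/bash

date=30.3.2016

# Strip the shortest prefix matching "*." (the day), leaving month and year
month=${date#*.}      # 3.2016
# Strip the shortest suffix matching ".*" (the year), leaving the month
month=${month%.*}     # 3

# Strip the longest prefix matching "*.", leaving only the year
year=${date##*.}      # 2016
# Strip the longest suffix matching ".*", leaving only the day
day=${date%%.*}       # 30

printf '%s %s %s\n' "$year" "$month" "$day"
```

The single # and % remove the shortest match and the doubled forms remove the longest, which is why the month needs two steps and the year and day need one each.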

The general idea is exactly the same: we add an extra column containing the reformatted date, sort by that and then remove it again.
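
If the sorted lines then need to be handled one at a time, as in the question's while read loop, the pipeline can feed such a loop directly, with no temp file involved. The sketch below assumes that is the goal; sorted_by_date and number_by_date are hypothetical names, and numbering the lines merely stands in for whatever per-line work is needed.

```shell
#!/bin/bash

# Hypothetical helper wrapping the awk | sort | cut pipeline from the answer.
# Reads the files given as arguments, or standard input if there are none.
sorted_by_date() {
    awk '{split($NF,arr,"."); printf("%d%02d%02d\t%s\n",arr[3],arr[2],arr[1],$0)}' "$@" |
        sort -k 1,1 | cut -f 2-
}

# Hypothetical consumer: read the sorted lines one by one and number them.
number_by_date() {
    sorted_by_date "$@" | {
        n=0
        while IFS= read -r line; do
            n=$((n + 1))
            printf '%d: %s\n' "$n" "$line"
        done
    }
}

# Demo with the sample data from the question, fed on standard input
number_by_date <<'EOF'
asdsa 24 asdsa 3 3000 054217542 30.3.2016
asdsadsa 25 asdsadsaa 5 4500 534215365 2.1.2014
dsasda 23 dsada 4 3200 537358234 6.3.2016
EOF
```

In a script that receives the file name as its first argument, the call would be number_by_date "$1". Because the loop sits on the right-hand side of a pipe, it runs in a subshell, so variables set inside it are not visible after the pipeline ends.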