Get the newest file based on timestamp

Problem description

I am new to shell scripting, so I need some help with how to approach this problem.

I have a directory that contains files in the following format. The files are in a directory called /incoming/external/data

AA_20100806.dat
AA_20100807.dat
AA_20100808.dat
AA_20100809.dat
AA_20100810.dat
AA_20100811.dat
AA_20100812.dat

As you can see, each filename includes a timestamp, i.e. [RANGE]_[YYYYMMDD].dat

What I need to do is find out which of these files has the newest date, using the timestamp in the filename rather than the system timestamp. I then need to store that filename in a variable, move that file to one directory, and move the rest to a different directory.

Recommended answer

For those who just want an answer, here it is:

ls | sort -n -t _ -k 2 | tail -1

Here's the thought process that led me here.

I'm going to assume the [RANGE] portion could be anything.
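One caveat to that assumption: the one-liner above splits on every underscore, so it only sorts by date when [RANGE] itself contains no underscore. The following is a minimal sketch of one possible workaround, not part of the original answer. It assumes every name ends in _[YYYYMMDD].dat, and the sample names are hypothetical; printf stands in for ls.

```shell
# Hypothetical sample names; the first two have underscores inside [RANGE].
names=$(printf '%s\n' X_Y_20100815.dat AA_20100812.dat B_C_D_20100810.dat)

# Sorting on field 2 sees "Y_20100815.dat" as field 2 of the first name,
# which is not a number, so that name is not ranked by its date.
naive=$(printf '%s\n' "$names" | sort -n -t _ -k 2 | tail -1)

# Workaround: copy the 8-digit date to the front of each line, sort on it,
# then cut the prefix off again. Names that do not match are dropped by -n/p.
robust=$(printf '%s\n' "$names" \
    | sed -n 's/^\(.*\)_\([0-9]\{8\}\)\.dat$/\2 &/p' \
    | sort -n \
    | tail -1 \
    | cut -d' ' -f2-)

echo "naive:  $naive"
echo "robust: $robust"
```

If [RANGE] never contains an underscore in your data, the simpler one-liner is enough.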

Start with what we know.

  • Working directory: /incoming/external/data

  • File format: [RANGE]_[YYYYMMDD].dat

We need to find the most recent [YYYYMMDD] file in the directory, and we need to store that filename.

Available tools (I'm only listing the tools relevant to this problem; identifying them becomes easier with practice):

  • ls
  • sed
  • awk (or nawk)
  • sort
  • tail

I guess we don't need sed, since we can work with the entire output of the ls command. Using ls, awk, sort, and tail, we can get the correct file like so (bear in mind that you'll have to check the syntax against what your OS will accept):

NEWESTFILE=`ls | awk -F_ '{print $1 $2}' | sort -n -k 2,2 | tail -1`

Then it's just a matter of putting the underscore back in, which shouldn't be too hard.

Edit: I had a little time, so I got around to fixing the command, at least for use on Solaris.

Here's the convoluted first pass (this assumes that ALL files in the directory are in the same format: [RANGE]_[YYYYMMDD].dat). I'm betting there are better ways to do this, but this works with my own test data (in fact, I found a better way just now; see below):

ls | awk -F_ '{print $1 " " $2}' | sort -n -k 2 | tail -1 | sed 's/ /_/'

... while writing this out, I discovered that you can just do this:

ls | sort -n -t _ -k 2 | tail -1

I'll break it down into parts.

ls

Simple enough: this gets the directory listing, just the filenames. Now I can pipe that into the next command.

awk -F_ '{print $1 " " $2}'

This is the awk command. It lets you take an input line and modify it in a specific way. Here, all I'm doing is telling awk to split the input wherever there is an underscore (_), which I do with the -F option. This gives me the two halves of each filename. I then tell awk to output the first half ($1), followed by a space (" "), followed by the second half ($2). Note that the space was the part missing from my initial suggestion. This step is also unnecessary, since you can specify a separator in the sort command below.
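To see why the space matters, here is a small sketch that runs both awk variants on two sample names from the question. printf stands in for ls so the example does not depend on any real directory, and the variable names are only for illustration.

```shell
# Without the space, the two halves are glued into a single field,
# so a later "sort -k 2" has no second field to sort on.
glued=$(printf '%s\n' AA_20100806.dat AA_20100812.dat | awk -F_ '{print $1 $2}')

# With the space, each line has two whitespace-separated fields.
split=$(printf '%s\n' AA_20100806.dat AA_20100812.dat | awk -F_ '{print $1 " " $2}')

echo "$glued"
echo "$split"
```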

Now each line of output has the form [RANGE] [YYYYMMDD].dat, and we can sort it:

sort -n -k 2

This takes the input and sorts it numerically on the second field. The sort command uses whitespace as the field separator by default. While writing this update, I read the documentation for sort and found that it lets you specify the separator with -t, so awk and sed are unnecessary. Take the output of ls and pipe it through the following sort:

sort -n -t _ -k 2
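As a quick check of that sort invocation, the sketch below feeds it a few names in mixed order. The BB and CC prefixes are made-up examples of differing [RANGE] values, and printf again stands in for ls.

```shell
# -t _ sets the field separator, -k 2 sorts on the date field,
# and -n compares it numerically (the leading digits of "20100810.dat").
sorted=$(printf '%s\n' AA_20100810.dat BB_20100806.dat AA_20100812.dat CC_20100807.dat \
    | sort -n -t _ -k 2)

echo "$sorted"
```

The oldest date ends up on the first line and the newest on the last, regardless of the prefix.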

This achieves the same result. Now you only want the last file, so:

tail -1

If you used awk to split the filename (which just adds extra complexity, so don't do it), you can replace the space with an underscore again using sed:

sed 's/ /_/'
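The question also asks to store the name in a variable and move the files, which the steps above stop short of. The following is a rough sketch of how that could look, not a tested production script. The directories under a temporary folder are hypothetical stand-ins for /incoming/external/data and the two unnamed destination directories, and the script creates its own sample files so it can be run safely.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real source and destination directories.
work=$(mktemp -d)
incoming="$work/incoming"
latest="$work/latest"
archive="$work/archive"
mkdir -p "$incoming" "$latest" "$archive"

# Create empty sample files named like the ones in the question.
for d in 20100806 20100807 20100812 20100810; do
    : > "$incoming/AA_$d.dat"
done

# Store the name with the newest date (taken from the filename) in a variable.
NEWESTFILE=$(ls "$incoming" | sort -n -t _ -k 2 | tail -1)

# Move the newest file to one directory ...
mv -- "$incoming/$NEWESTFILE" "$latest/"

# ... and everything that is left to the other one.
for f in "$incoming"/*.dat; do
    [ -e "$f" ] || continue   # the glob matched nothing
    mv -- "$f" "$archive/"
done

echo "newest: $NEWESTFILE"
# Remove "$work" once you are done inspecting the result.
```

Parsing ls output like this assumes filenames without newlines, which holds for the naming scheme in the question.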

Some good info here, but I'm sure most people aren't going to read all the way down to the bottom.
