Scripting for file management with a very large amount of files


Problem description


I have a setup of three OS X machines that was using Syncthing to keep shared drives synchronized remotely. Someone made some mistakes and a lot of files ended up getting renamed.

So all throughout this drive I have situations where there's a file of size 0KB named, for example, file.jpg and another file with real size named file.sync-conflict201705-4528.jpg. I need to search the entire drive recursively and, whenever I find a file with the sync-conflict string in its name, check whether there is a file with the same name minus the 'sync-conflict' string and with a size of 0KB. If there is, I need to rename the sync-conflict file to overwrite the 0KB file.

I have considered tackling this with a bash script or a Perl script. Using bash I think just using the 'find' command with -regex would get me started but I don't really know how to process the results and run the next find test. I am studying and working on it.

Same problem with Perl. I can get through the first step using File::Find::find and select what I need using a regex to filter the files, but there again I am stuck getting to the next step, which would be finding the original file in the same directory and performing the necessary file move.

In both of these cases I am willing to put in the time to figure it out, but I wonder what the caveats will be? Can both of these scenarios handle recursing a large number of files without exception? Is there perhaps a better approach anyone can recommend?

Solution

One good tool in Perl for this is File::Find::Rule.

Find all sync-conflict files, then test whether the corresponding files exist and have zero size:

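use warnings;
use strict;
use FindBin qw($RealBin);
use File::Copy qw(move);
use File::Find::Rule;

my $dir = shift || '.';  # top of hierarchy to search (from command line, or ./)

# All regular files under $dir whose name marks them as a sync conflict
my @conflict_files = File::Find::Rule
    ->file->name('*sync-conflict*.jpg')->in($dir);

foreach my $conflict (@conflict_files)
{
    # Drop everything from ".sync-conflict" on to recover the original name
    my ($file) = $conflict =~ m|(.*)\.sync-conflict|;
    next if not defined $file;  # name has no ".sync-conflict" part
    $file .= '.jpg';

    # -z is true only if the file exists and has zero size
    if (-z "$RealBin/$file") {
        print "Rename $conflict to $file\n";
        #move($conflict, $file) or warn "Can't move $conflict to $file: $!";
    }
}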

This builds the original file's name from each sync-conflict file name and applies the -z file test (one of the -X operators), which is true only if the file exists and has zero size. The rename itself is done with move from the core module File::Copy.
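The name derivation and the -z test can be tried in isolation on a scratch directory. The following is only a sketch: the helper name original_name is hypothetical, and unlike the code above it keeps whatever extension the conflict file has instead of hard-coding '.jpg', which is an assumption about how the conflict names are formed.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: derive the original name from a conflict name.
# Assumes the form "<base>.sync-conflict<anything>.<ext>"; returns undef otherwise.
sub original_name {
    my ($conflict) = @_;
    my ($base, $ext) = $conflict =~ m{^(.*)\.sync-conflict[^/]*(\.[^./]+)$}
        or return;
    return $base . $ext;
}

# Scratch directory standing in for the shared drive; removed at exit.
my $tmp      = tempdir(CLEANUP => 1);
my $conflict = "$tmp/photo.sync-conflict201705-4528.jpg";
my $original = original_name($conflict);

# Zero-byte placeholder, as left behind by the failed sync.
open my $fh, '>', $original or die "Can't create $original: $!";
close $fh;

# Conflict copy holding the real data.
open $fh, '>', $conflict or die "Can't create $conflict: $!";
print {$fh} "real image data";
close $fh;

print "original exists and is empty\n" if -z $original;
print "conflict copy is not empty\n"   if -s $conflict;
```

Because -z is false for a missing file as well as for a non-empty one, a single test covers both conditions from the question.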

Note that a file-test operator resolves a relative path against the current working directory, while File::Find::Rule returns paths built on the $dir it was given, so they are relative whenever $dir is. I use $RealBin, provided by FindBin, which is the directory containing the script with all links resolved, to build the full path for -z.
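Why a relative path is fragile can be seen in a small experiment. This sketch uses File::Spec->rel2abs from the core distribution as one possible alternative to $RealBin; the file name is made up for the demonstration.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;
use File::Temp qw(tempdir);

# Scratch directory with one zero-byte file (hypothetical name).
my $root = tempdir(CLEANUP => 1);
my $name = 'placeholder-demo.jpg';
open my $fh, '>', "$root/$name" or die "Can't create $root/$name: $!";
close $fh;

my $start = getcwd();
chdir $root or die "Can't chdir to $root: $!";

# rel2abs resolves against the current working directory, which is $root here.
my $absolute = File::Spec->rel2abs($name);

chdir $start or die "Can't chdir back to $start: $!";

# Back in the starting directory the relative name no longer points at the
# file, while the absolute one still does.
print "relative: ", (-z $name     ? "found" : "not found"), "\n";
print "absolute: ", (-z $absolute ? "found" : "not found"), "\n";
```

Making the search root absolute once, before searching, has the same effect for every path the search returns.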

Uncomment the move line after sufficient testing (and after making a backup first).

The code makes some assumptions about file names; please adjust as needed. The $dir supplied on the command line is expected to be relative to the script's directory.
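On the question of very large trees: the in() call above returns the complete list of matches before any file is handled. A variant with the core File::Find module handles each file inside the callback as it is found. This is a sketch under assumptions, not a tested replacement: the name fix_conflicts and its dry-run argument are hypothetical, the search root is made absolute with rel2abs rather than $RealBin, and any extension is accepted.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Find;
use File::Spec;

# Hypothetical helper: walk $top and handle each conflict file as it is found.
# With $dry_run true (the default) it only reports what it would do.
# Returns the list of files that were actually replaced.
sub fix_conflicts {
    my ($top, $dry_run) = @_;
    $dry_run //= 1;
    my @replaced;

    find({
        no_chdir => 1,    # $File::Find::name stays usable as a path as-is
        wanted   => sub {
            my $conflict = $File::Find::name;
            return unless -f $conflict;

            # Assumed name form: "<base>.sync-conflict<anything>.<ext>"
            my ($base, $ext) = $conflict =~ m{^(.*)\.sync-conflict[^/]*(\.[^./]+)$}
                or return;
            my $original = $base . $ext;

            # The original must be a regular file and empty.
            return unless -f $original && -z _;

            if ($dry_run) {
                print "Would rename $conflict to $original\n";
            }
            elsif (move($conflict, $original)) {
                push @replaced, $original;
            }
            else {
                warn "Can't move $conflict to $original: $!";
            }
        },
    }, File::Spec->rel2abs($top));

    return @replaced;
}

# Dry run over the directory named on the command line, if any.
fix_conflicts($ARGV[0]) if @ARGV;
```

Saved under a name such as fix_conflicts.pl (hypothetical), it would be run as perl fix_conflicts.pl /path/to/drive; passing a false second argument to fix_conflicts performs the renames.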
