Shell script: iterate through directories and split filenames


Problem description

I need to extract 2 things from filenames - the extension and a number.

I have a folder "/var/www/html/MyFolder/" that contains several subfolders, and each subfolder holds some files. The files are named like "a_X_mytest.jpg" or "a_X_mytest.png". The "a_" prefix is fixed and is the same in every folder; I need the "X" and the file extension.

My script looks like this:

#!/bin/bash
for dir in /var/www/html/MyFolder/*/
do
  dir=${dir%*/}
  find "/var/www/html/MyFolder/${dir##*/}/a_*.*" -maxdepth 1 -mindepth 1 -type f
done

That's only the beginning of my script.

The script produces these errors:

find: `/var/www/html/MyFolder/first/a_*.*': No such file or directory
find: `/var/www/html/MyFolder/sec/a_*.*': No such file or directory
find: `/var/www/html/MyFolder/test/a_*.*': No such file or directory

Does anybody know where the mistake is? The next step, when the lines above are working, is to split the found files and get the two parts.

To split them, I would use something like this:

arrFIRST=(${IN//_/ })
echo ${arrFIRST[1]}
arrEXT=(${IN//./ })
echo ${arrEXT[1]}

Can anybody help me with my problem?

Recommended answer

TL;DR:

Your script can be simplified to the following:

for file in /var/www/html/MyFolder/*/a_*.*; do
  [[ -f $file ]] || continue
  [[ "${file##*/}" =~ _(.*)_.*\.(.*)$ ]] && 
    x=${BASH_REMATCH[1]} ext=${BASH_REMATCH[2]}
  echo "$x"
  echo "$ext"
done


    • A single glob (filename pattern, wildcard pattern) is sufficient in your case, because a glob can contain wildcards at several levels of the hierarchy: /var/www/html/MyFolder/*/a_*.* finds files matching a_*.* in any immediate subfolder (*/) of the folder /var/www/html/MyFolder.
      You only need find to match files located on different levels of a subtree (though you may also need it for more complex matching).
    • [[ -f $file ]] || continue ensures that only regular files are processed. It also covers the case where nothing matches: bash then leaves the pattern unexpanded (unless nullglob is set), the literal pattern is not an existing file, and the loop body is skipped.
    • [[ ... =~ ... ]] uses bash's regex-matching operator, =~, to extract the tokens of interest from the filename part of each matching file (${file##*/}).
    • The results of the regex matching are stored in the reserved array variable BASH_REMATCH: element 0 holds the entire match, element 1 holds what the 1st parenthesized subexpression ((...), a.k.a. capture group) captured, and so on.
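
To make the points above concrete, here is a minimal, self-contained sketch that runs the same glob and regex against a throwaway directory tree. The helper name split_name and the sample files are assumptions made for illustration; they are not part of the original answer.

```bash
#!/usr/bin/env bash
# Sketch only: split_name and the sample tree are hypothetical.

split_name() {
  local name=${1##*/}               # strip the directory part
  local re='_(.*)_.*\.(.*)$'        # same regex as in the answer, kept in a variable
  if [[ $name =~ $re ]]; then
    # BASH_REMATCH[0] is the whole match; [1] and [2] are the capture groups
    printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    return 1                        # name does not have the a_X_name.ext shape
  fi
}

# Build a small sample tree instead of touching /var/www/html/MyFolder.
root=$(mktemp -d)
mkdir -p "$root/first" "$root/sec"
touch "$root/first/a_1_mytest.jpg" "$root/sec/a_22_mytest.png"

for file in "$root"/*/a_*.*; do
  [[ -f $file ]] || continue        # skips non-files and an unexpanded pattern
  read -r x ext < <(split_name "$file") || continue
  echo "x=$x ext=$ext"
done

rm -rf "$root"
```

Returning a non-zero status for names that do not match avoids reusing x and ext from a previous iteration, which the inline form would do silently.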


      • Alternatively, you could have used read with an array to parse matching filenames into their components:

      IFS='_.' read -ra tokens <<<"${file##*/}"
      x="${tokens[1]}"          # tokens[0] is the fixed "a" prefix
      ext="${tokens[@]: -1}"    # last token is the extension
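
As a quick check of the read-based approach, the following sketch wraps it in a function and runs it on one sample path. The function name parse_tokens and the sample filename are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: parse_tokens is a hypothetical helper name.

parse_tokens() {
  local name=${1##*/} tokens
  # Split on every "_" and "."; for a_X_mytest.jpg this yields: a, X, mytest, jpg
  IFS='_.' read -ra tokens <<<"$name"
  printf '%s %s\n' "${tokens[1]}" "${tokens[@]: -1}"
}

parse_tokens "/var/www/html/MyFolder/first/a_42_mytest.jpg"
```

Because the name is split on every underscore and dot, the token positions are only reliable when "X" itself contains neither character.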
      


    • As for why what you tried didn't work:


        • find does NOT expand globs in its path arguments, and because the pattern is inside double quotes the shell does not expand it either, so find looks for a file literally named "/var/www/html/MyFolder/${dir##*/}/a_*.*".
        • Also, you have to separate the root folder of your search from the filename pattern to look for on any level of the root folder's subtree:
          • the root folder becomes find's path (starting-point) argument
          • the filename pattern is passed (always quoted) via the -name or -iname (for case-insensitive matching) options
          • Ergo: find "/var/www/html/MyFolder/${dir##*/}" -name 'a_*.*' ..., analogous to @konsolebox's answer.
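
If you do want find to do the matching, the separation of root folder and -name pattern described above could look like the sketch below. It assumes a find that supports -mindepth, -maxdepth and -print0 (GNU and BSD find do; they are not POSIX), and it splits the names with plain parameter expansion rather than =~. The function name list_matches and the sample tree are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: list_matches and the sample tree are hypothetical.

list_matches() {
  local root=$1 file name x ext
  # Depth 2 = files directly inside the immediate subfolders of $root.
  find "$root" -mindepth 2 -maxdepth 2 -type f -name 'a_*.*' -print0 |
    while IFS= read -r -d '' file; do
      name=${file##*/}
      x=${name#a_}; x=${x%%_*}      # text between "a_" and the next underscore
      ext=${name##*.}               # text after the last dot
      printf '%s %s\n' "$x" "$ext"
    done
}

root=$(mktemp -d)
mkdir -p "$root/first" "$root/sec"
touch "$root/first/a_1_mytest.jpg" "$root/sec/a_22_mytest.png" "$root/sec/other.txt"
list_matches "$root" | sort         # find's output order is not guaranteed
rm -rf "$root"
```

Using -print0 with read -d '' keeps filenames containing spaces or newlines intact, which a plain for loop over find's output would not.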
