Extract specific file extensions from multiple 7-zip files

Problem description

I have a RAR file and a ZIP file. Each of them contains a folder. Inside the folder there are several 7-zip (.7z) files. Every 7z file contains multiple files that share the same extension but have different names.

RAR or ZIP file
  |___folder
        |_____Multiple 7z
                  |_____Multiple files with same extension and different name

I want to extract just the ones I need from thousands of files. I need the files whose names include a certain substring. For example, a compressed file should be extracted if its name contains '[!]', '(U)' or '(J)'.

I can extract the folder without problems, so I have this structure:

folder
   |_____Multiple 7z
                |_____Multiple files with same extension and different name

I'm in a Windows environment, but I have Cygwin installed. How can I extract the files I need painlessly, ideally with a single command line?

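Some refinements to the question:

  • The inner 7z files and the files inside them can have spaces in their names.
  • Some 7z files contain only a single file, and that file does not match the given criteria. Since it is the only candidate, it must be extracted as well.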

Thanks to everyone. The bash solution was the one that helped me out. I wasn't able to test the Python 3 solutions because I had problems installing libraries with pip. I don't use Python, so I'll have to study and work through the errors I hit with those solutions. For now, I've found a suitable answer.

Recommended answer

This solution is based on bash, grep and awk. It works on Cygwin and on Ubuntu.

Since the requirement is to search for (X) [!].ext files first and, only if there are none, to look for (X).ext files, I don't think a single expression can handle this logic.

The solution needs some if/else conditional logic that tests the list of files inside each archive and decides which files to extract.
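
As a rough illustration of that conditional logic, the decision can be isolated in a small function that only looks at a list of names, one per line. This is a sketch under assumptions, not the answer's own code: `select_files` is a hypothetical helper name, and the patterns follow the (X) [!].ext and (X).ext naming described above.

```shell
#!/bin/bash

# select_files (hypothetical helper): read one file name per line on stdin
# and print the names that should be extracted.
select_files() {
  local list picked
  list=$(cat)

  # 1. Prefer names containing the literal substring "[!]".
  picked=$(grep -F '[!]' <<< "$list" || true)

  # 2. Otherwise fall back to names like "(J).txt": one capital letter in
  #    parentheses, followed directly by the dot of the extension.
  if [[ -z $picked ]]; then
    picked=$(grep '([A-Z])\.' <<< "$list" || true)
  fi

  # 3. Otherwise, if the list holds exactly one name (non-empty, no newline),
  #    take that single file.
  if [[ -z $picked && -n $list && $list != *$'\n'* ]]; then
    picked=$list
  fi

  # Print the selection, if there is one.
  if [[ -n $picked ]]; then
    printf '%s\n' "$picked"
  fi
}

# Example: no "[!]" names here, so the (X).ext fallback applies.
printf '%s\n' '(J) [b1].txt' '(J) [o1].txt' '(J).txt' | select_files
```

Keeping the decision separate from the 7z calls makes it possible to check the rules against plain lists of names before touching any archive.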

Here is the initial structure inside the zip/rar archive I tested my script on (I made a script to prepare this structure):

folder
├── 7z_1.7z
│   ├── (E).txt
│   ├── (J) [!].txt
│   ├── (J).txt
│   ├── (U) [!].txt
│   └── (U).txt
├── 7z_2.7z
│   ├── (J) [b1].txt
│   ├── (J) [b2].txt
│   ├── (J) [o1].txt
│   └── (J).txt
├── 7z_3.7z
│   ├── (E) [!].txt
│   ├── (J).txt
│   └── (U).txt
└── 7z 4.7z
    └── test.txt

The output looks like this:

output
├── 7z_1.7z           # This is a folder, not an archive
│   ├── (J) [!].txt   # Here we extracted only files with [!]
│   └── (U) [!].txt
├── 7z_2.7z
│   └── (J).txt       # Here there are no [!] files, so we extracted (J)
├── 7z_3.7z
│   └── (E) [!].txt   # We had here both [!] and (J), extracted only file with [!]
└── 7z 4.7z
    └── test.txt      # We had only one file here, extracted it

Here is the extraction script:

#!/bin/bash

# Remove the output (if it's left from previous runs).
rm -rf output
mkdir -p output

# Unzip the zip archive.
unzip data.zip -d output
# For rar use
#  unrar x data.rar output
# OR
#  7z x -ooutput data.rar

for archive in output/folder/*.7z
do
  # See https://stackoverflow.com/questions/7148604
  # Get the list of file names, remove the extra output of "7z l":
  # print the Name column of the lines between the two dashed separators.
  list=$(7z l "$archive" | awk '
      /----/ {p = !p; next}
      $NF == "Name" {pos = index($0,"Name")}
      p {print substr($0,pos)}
  ')
  # Get the list of files with a literal "[!]" in the name.
  # -F is needed: as a regular expression, [!] would match any "!".
  extract_list=$(echo "$list" | grep -F '[!]')
  if [[ -z $extract_list ]]; then
    # If we don't have files with [!], then look for the ([A-Z]). pattern
    # to get files with a single letter in parentheses right before the
    # extension. The dot is escaped so "(J) [b1].txt" does not match.
    extract_list=$(echo "$list" | grep '([A-Z])\.')
  fi
  if [[ -z $extract_list ]]; then
    # If we only have one file - extract it. $list is a plain string, so
    # "one file" means it is non-empty and contains no newline.
    if [[ -n $list && $list != *$'\n'* ]]; then
      extract_list=$list
    fi
  fi
  if [[ -n $extract_list ]]; then
    # If we have files to extract, then do the extraction.
    # Output path is output/7zip_archive_name/
    out_path=output/$(basename "$archive")
    mkdir -p "$out_path"
    # -d '\n' (GNU xargs) treats each line as one name, keeping spaces and quotes.
    echo "$extract_list" | xargs -d '\n' -I {} 7z x -o"$out_path" "$archive" {}
  fi
done

The basic idea here is to loop over the 7zip archives and get the list of files in each of them using the 7z l command (list of files).

The output of the command is quite verbose, so we use awk to clean it up and get the list of file names.
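
To see what that cleanup does without needing an archive, the awk filter can be run on a hand-written stand-in for a listing. This is only a sketch: `row` and `sample_listing` are hypothetical helpers, and the column layout they produce (a header ending in "Name", a dashed line above and below the entries, a summary line at the end) is an assumption about how `7z l` formats its table, so real output may differ in detail.

```shell
#!/bin/bash

# Print one line in the column layout assumed for "7z l" listings.
row() { printf '%-19s %5s %12s %12s  %s\n' "$@"; }

# Hand-written stand-in for the output of: 7z l 7z_1.7z
sample_listing() {
  echo 'Listing archive: 7z_1.7z'
  echo
  row '   Date      Time' 'Attr' 'Size' 'Compressed' 'Name'
  row '-------------------' '-----' '------------' '------------' '------------------------'
  row '2021-01-01 10:00:00' '....A' '0' '0' '(J) [!].txt'
  row '2021-01-01 10:00:00' '....A' '0' '' '(J).txt'
  row '-------------------' '-----' '------------' '------------' '------------------------'
  row '2021-01-01 10:00:00' '' '0' '0' '2 files'
}

# Keep only the Name column of the lines between the two dashed separators.
names=$(sample_listing | awk '
  /----/        { inside = !inside; next }    # each separator toggles the state
  $NF == "Name" { pos = index($0, "Name") }   # column where the names start
  inside        { print substr($0, pos) }     # the name, spaces preserved
')

printf '%s\n' "$names"
```

Cutting the name out by column position rather than by field number is what keeps names with spaces, such as "(J) [!].txt", in one piece.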

After that we filter this list with grep to get either the list of [!] files or the list of (X) files. Then we just pass this list to 7zip to extract the files we need.
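
One detail of the grep step is worth checking in isolation: as a regular expression, `[!]` is a bracket expression that matches a single `!`, not the three characters `[!]`. The sketch below compares a regular-expression match with a fixed-string match (`grep -F`). The name `Wow!.txt` is a made-up example, not a file from the question.

```shell
#!/bin/bash

# Three sample names; only the first contains the literal substring "[!]".
list=$'(J) [!].txt\nWow!.txt\n(J).txt'

# Regular expression: [!] means "one character from the set {!}",
# so every name containing "!" is selected.
regex_hits=$(grep '[!]' <<< "$list")

# Fixed string (-F): the pattern is taken literally, so only names
# containing "[!]" are selected.
fixed_hits=$(grep -F '[!]' <<< "$list")

printf 'regex match:\n%s\n\n' "$regex_hits"
printf 'fixed-string match:\n%s\n' "$fixed_hits"
```

For the test structure shown earlier both forms select the same files, because no other name contains an exclamation mark. The fixed-string form is the safer choice when names are not known in advance.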
