Get the list of metadata associated with a file using Python in Ubuntu

Question

I'm trying to get the list of metadata associated with a file, using Python on Ubuntu.

Outside Python, the "extract" command works very well, but I don't know how to use it from Python: I always get a message saying that "extract" is not defined.
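The "not defined" message is most likely a NameError: "extract" is a shell command, not a Python name, so it has to be launched as an external process. Below is a minimal sketch of that idea using the standard subprocess module. It assumes Python 3.7+ and that the "extract" tool is installed and on the PATH; the function name read_metadata and its tool parameter are hypothetical, introduced only for illustration.

```python
import shutil
import subprocess
import sys


def read_metadata(path, tool="extract"):
    """Run an external metadata tool on `path` and return its raw text output.

    `tool` defaults to "extract" (assumed to be installed and on PATH).
    """
    if shutil.which(tool) is None:
        raise FileNotFoundError("%r was not found on PATH" % tool)
    # Pass the arguments as a list so the path is never interpreted by a shell.
    result = subprocess.run(
        [tool, path], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    print(read_metadata(sys.argv[1]))
```

The function returns whatever the tool prints, unparsed, because the exact output format depends on the installed version.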

Recommended answer

I assume you're asking about the metadata that appears in the Windows "Properties" dialog under the "Summary" tab. (If not, just disregard this.) Here's how I managed it.

  1. Download and install the Python win32 extensions (pywin32). This puts win32, win32com, etc. into your Python[ver]/Lib/site-packages folder and provides the win32api, win32com, etc. modules. For some reason, I couldn't get the build 216 version for Python 2.6 to work. I updated my system to Python 2.7 and used build 216 for Python 2.7, and it worked. (To download and install: follow the link for the extensions, click the link reading 'pywin32', click the link for the latest build (currently 216), then click the link for the .exe file that matches your system and Python installation; for me, it was pywin32-216.win32-py2.7.exe. Run the .exe file.)
  2. Copy and paste the code from the "Get document summary information" page of Tim Golden's tutorial into a .py file on your own computer.
  3. Tweak the code. You don't strictly have to, but if you run Tim's script as your main module without supplying a pathname as the first argument in sys.argv, you'll get an error. To make the tweak, scroll down to the bottom of the code and omit the final block, which starts with if __name__ == '__main__':.
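As an alternative to deleting the final block in step 3, the script could stay runnable from the command line if the entry point checks for a missing path first. The following is only a sketch in Python 3 syntax: main and its reader parameter are hypothetical names, and reader stands in for the tutorial's property_sets function, which is not reproduced here.

```python
import sys


def main(argv, reader):
    """Print each property set of the file named in argv[1].

    `reader` is any callable that takes a path and yields
    (set_name, properties_dict) pairs. Returns a process exit code.
    """
    if len(argv) < 2:
        # No path supplied: report usage instead of failing with an IndexError.
        print("usage: property_reader.py <path>", file=sys.stderr)
        return 2
    for set_name, properties in reader(argv[1]):
        print(set_name)
        for key, value in properties.items():
            print("   %s => %s" % (key, value))
    return 0
```

Inside the saved module, the final block would then call something like sys.exit(main(sys.argv, property_sets)).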

Save your file as something like property_reader.py, and call its property_sets(filepath) function. It returns a generator object that you can iterate over to see all the properties and their values. You could use it like this:

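# Assuming 'property_reader.py' is the name of the module/file in which you saved Tim Golden's code...
import property_reader

propgenerator = property_reader.property_sets('[your file path]')
# Each item is a (property set name, dict of properties) pair.
for name, properties in propgenerator:
    print(name)
    for k, v in properties.items():
        # Formatting the line as one string keeps the output identical on Python 2 and 3.
        print("   %s => %s" % (k, v))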

The output of the above code will look something like this:

DocSummaryInformation
   PIDDSI_CATEGORY => qux
SummaryInformation
   PIDSI_TITLE => foo
   PIDSI_COMMENTS => flam
   PIDSI_AUTHOR => baz
   PIDSI_KEYWORDS => flim
   PIDSI_SUBJECT => bar
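
If you would rather keep the values than print them, the (name, properties) pairs can be gathered into a nested dictionary. This is a minimal sketch: collect_property_sets is a hypothetical helper, and the sample list only imitates the kind of pairs shown in the output above so that the example runs without pywin32.

```python
def collect_property_sets(pairs):
    """Turn (set_name, properties) pairs into {set_name: {property: value}}."""
    collected = {}
    for set_name, properties in pairs:
        # Merge into a fresh dict per set name; repeated names are combined.
        collected.setdefault(set_name, {}).update(properties)
    return collected


# Stand-in data shaped like the generator's items (not real file metadata).
sample = [
    ("DocSummaryInformation", {"PIDDSI_CATEGORY": "qux"}),
    ("SummaryInformation", {"PIDSI_TITLE": "foo", "PIDSI_AUTHOR": "baz"}),
]
summary = collect_property_sets(sample)
print(summary["SummaryInformation"]["PIDSI_TITLE"])  # prints "foo"
```

With the real module, the same helper could be called as collect_property_sets(property_reader.property_sets(path)).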
