Read EXE, MSI, and ZIP file metadata in Python in Linux

Question

I am writing a Python script to index a large set of Windows installers into a database.

I would like to know how to read the metadata (Company, Product Name, Version, etc.) from EXE, MSI, and ZIP files using Python running on Linux.

I am using Python 2.6.5 on Ubuntu 10.04 64-bit with Django 1.2.1.

What I have found so far:

  • Windows command-line utilities that can extract EXE metadata (like filever from SysUtils), and other individual command-line utilities that only work on Windows. I've tried running these through Wine, but they have problems, and it hasn't been worth the effort to track down the libraries and frameworks they depend on and install those in Wine/Crossover.
  • Win32 modules for Python, which can do some of this but won't run on Linux (right?)

Obviously, changing a file's metadata would change the file's MD5 hash. Is there a general method of hashing a file independently of its metadata, other than locating the metadata and excluding it (for example, by skipping the first 1024 bytes)?
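The "skip the first 1024 bytes" idea from the question can be sketched with the standard library alone. This only illustrates the mechanics of hashing a file from an offset; it is not a reliable metadata-independent hash, since version information in a Windows executable is typically stored in its resource section rather than at a fixed position near the start of the file. The function name `md5_skipping_prefix` and its parameters are hypothetical, and the sketch assumes Python 3.

```python
import hashlib


def md5_skipping_prefix(path, skip=1024, chunk_size=65536):
    """Return the MD5 hex digest of a file, ignoring its first `skip` bytes.

    Illustration only: the offset is an assumption taken from the
    question, not a rule about where any format keeps its metadata.
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Seeking past the end of a short file is allowed; read() then
        # returns b"" and the digest is that of empty input.
        fh.seek(skip)
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files that differ only within the skipped prefix produce the same digest; any difference after the offset still changes it.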

This is my first post on Stack Overflow. Since starting my latest job as a new Python developer, I've been incredibly impressed with Stack Overflow: it consistently shows up at the top of Google searches for my Python/Django queries and has high-quality answers. Kudos to this community.

Recommended answer

Take a look at the Hachoir library: http://bitbucket.org/haypo/hachoir/wiki/Home. For an example program that uses the Hachoir binary file manipulation library to parse metadata, see hachoir-metadata: http://pypi.python.org/pypi/hachoir-metadata/1.3.3.

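The library can handle the following formats:

  • Archives: bzip2, gzip, zip, tar
  • Audio: MPEG audio ("MP3"), WAV, Sun/NeXT audio, Ogg/Vorbis (OGG), MIDI, AIFF, AIFC, Real Audio (RA)
  • Image: BMP, CUR, EMF, ICO, GIF, JPEG, PCX, PNG, TGA, TIFF, WMF, XCF
  • Misc: Torrent
  • Program: EXE
  • Video: ASF format (WMV video), AVI, Matroska (MKV), Quicktime (MOV), Ogg/Theora, Real Media (RM)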
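A minimal sketch of calling the library from a script might look like the following. It assumes Python 3 and the current `hachoir` package on PyPI, which exposes `createParser` and `extractMetadata`; the Python 2 era releases linked above were split into separate `hachoir_core`, `hachoir_parser`, and `hachoir_metadata` packages with different import paths. The helper name `read_metadata` is hypothetical, and which fields come back depends on the file and on what the parser recognizes.

```python
def read_metadata(path):
    """Return metadata lines such as "- Title: ..." for a file, or [].

    Assumes the third-party `hachoir` package is installed
    (pip install hachoir). It is imported lazily so that this module
    can still be loaded on machines without it.
    """
    try:
        from hachoir.parser import createParser
        from hachoir.metadata import extractMetadata
    except ImportError as exc:
        raise RuntimeError("the hachoir package is required") from exc

    parser = createParser(path)
    if parser is None:
        # hachoir could not recognize the file format.
        return []
    with parser:
        metadata = extractMetadata(parser)
    if metadata is None:
        return []
    # exportPlaintext() returns None when there is nothing to report.
    return metadata.exportPlaintext() or []
```

For an indexing script, the returned lines could be split on the first ": " and stored as key/value pairs, keeping in mind that not every installer carries every field.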

Additionally, Hachoir can do some file manipulation operations, which I would assume include some primitive metadata manipulation.
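For the ZIP part of the question specifically, Python's standard library `zipfile` module can read archive-level metadata without any extra dependency. The sketch below is a complement to the Hachoir suggestion, not part of it; the function name `zip_metadata` and the shape of the returned dictionary are hypothetical choices. ZIP archives do not carry fields like Company or Product Name, so only the archive comment and per-member details are collected.

```python
import zipfile


def zip_metadata(path):
    """Collect the archive comment and basic per-member info from a ZIP."""
    with zipfile.ZipFile(path) as archive:
        return {
            # The comment is stored as bytes; decode defensively.
            "comment": archive.comment.decode("utf-8", errors="replace"),
            "members": [
                {
                    "name": info.filename,
                    "size": info.file_size,
                    "compressed_size": info.compress_size,
                    # (year, month, day, hour, minute, second)
                    "modified": info.date_time,
                }
                for info in archive.infolist()
            ],
        }
```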
