Python 2.7中子流程模块输出的编码是什么? [英] what is the encoding of the subprocess module output in Python 2.7?

查看:66
本文介绍了Python 2.7中子流程模块输出的编码是什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用python2.7在64位Windows Vista上检索压缩存档的内容。我尝试通过使用子进程模块对7zip(我最喜欢的存档管理器)进行系统调用:

I'm trying to retrieve the content of a zipped archive with python2.7 on 64bit windows vista. I tried by making a system call to 7zip (my favourite archive manager) using the subprocess module:

# -*- coding: utf-8 -*-
import sys, os, subprocess

Extractor = r'C:\Program Files\7-Zip\7z.exe'
ArchiveName = r'C:\temp\bla.zip'
output = subprocess.Popen([Extractor,'l','-slt',ArchiveName],stdout=subprocess.PIPE).stdout.read()

只要存档内容仅包含ascii文件名,此方法就可以正常工作,但是当我尝试使用非ASCII我得到一个编码的输出字符串变量,其中ä,ë,ö,ü被\x84,\x89,\x94,\x81(等)代替。我已经尝试过各种解码/编码调用,但是我对python太缺乏经验了(通常太愚蠢了),无法用umlaut重现原始字符(如果我想在后续步骤中添加例如,

This works fine as long as the archive content contains only ascii filenames, but when I try it with non-ascii I get an encoded output string variable where ä, ë, ö, ü have been replaced by \x84, \x89, \x94, \x81 (etcetera). I've tried all kinds of decode/encode calls but I'm just too inexperienced with python (and generally too stupid) to reproduce the original characters with umlaut (which is required if I would like to follow-up this step with e.g. an extraction subprocess call to 7z).

简单地说,我的问题是:如何使该方法也适用于非ascii内容的档案?

Simply put my question is: How do I get this to work also for archives with non-ascii content?

...还是用更复杂的方式表示:子进程的输出是否始终采用固定编码?
在前一种情况下->它是哪种编码?
在后一种情况下->如何控制或发现子流程输出的编码?受此博客上类似问题的启发,我尝试添加

... or to put it in a more convoluted way: Is the output of subprocess always of a fixed encoding or not? In the former case -> Which encoding is it? In the latter case -> How can I control or uncover the encoding of the output of subprocess? Inspired by similar questions on this blog I've tried adding

import codecs
sys.stdout = codecs.getwriter('utf8')(sys.stdout)

我也尝试过

my_env = os.environ
my_env['PYTHONIOENCODING'] = 'utf-8'
output = subprocess.Popen([Extractor,'l','-slt',ArchiveName],stdout=subprocess.PIPE,env=my_env).stdout.read()

,但似乎都没有改变输出变量的编码(或再现变音符号)。

but neither seems to alter the encoding of the output variable (or to reproduce the umlaut).

推荐答案

您可以尝试使用7zip中的-sccUTF-8开关将输出强制为utf-8。
这是参考页: http: //en.helpdoc-online.com/7-zip_9.20/source/cmdline/switches/scc.htm

You can try using the -sccUTF-8 switch from 7zip to force the output in utf-8. Here is ref page: http://en.helpdoc-online.com/7-zip_9.20/source/cmdline/switches/scc.htm

这篇关于Python 2.7中子流程模块输出的编码是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆