subprocess.Popen command (antiword) produces different output in shell vs. web application

Question

I have Django running on a standard WSGI/Apache httpd combo.

I noticed that file output was different when I ran code in the shell vs. from the browser. I've isolated out everything else and am still getting the same problem.

Here is the code:

def test_antiword(filename):
    import subprocess
    with open(filename, 'w') as writefile:
        subprocess.Popen(["antiword", '/tmp/test.doc'], stdout=writefile)
    p = subprocess.Popen(["antiword", '/tmp/test.doc'], stdout=subprocess.PIPE)
    out, _ = p.communicate()
    ords = []
    for kk in out:
        ords.append(ord(kk))
    return out, ords

def test_antiword_view(request):
    from django.http import HttpResponse
    return HttpResponse(repr(test_antiword('/tmp/web.txt')))

When I open the URL in the browser, this is the output:


('\n"I said good day sir. Good day!" shouted Sh\xe9rlo\xe7k H\xf8lme\xa3.\n\n "Why not Zoidberg?" queried Zoidberg.\n', [10, 34, 73, 32, 115, 97, 105, 100, 32, 103, 111, 111, 100, 32, 100, 97, 121, 32, 115, 105, 114, 46, 32, 71, 111, 111, 100, 32, 100, 97, 121, 33, 34, 32, 115, 104, 111, 117, 116, 101, 100, 32, 83, 104, 233, 114, 108, 111, 231, 107, 32, 72, 248, 108, 109, 101, 163, 46, 10, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 34, 87, 104, 121, 32, 110, 111, 116, 32, 90, 111, 105, 100, 98, 101, 114, 103, 63, 34, 32, 113, 117, 101, 114, 105, 101, 100, 32, 90, 111, 105, 100, 98, 101, 114, 103, 46, 10])

This is the corresponding output when I call test_antiword('/tmp/shell.txt') in the shell:


('\n\xe2\x80\x9cI said good day sir. Good day!\xe2\x80\x9d shouted Sh\xc3\xa9rlo\xc3\xa7k H\xc3\xb8lme\xc2\xa3.\n\n \xe2\x80\x9cWhy not Zoidberg?\xe2\x80\x9d queried Zoidberg.\n', [10, 226, 128, 156, 73, 32, 115, 97, 105, 100, 32, 103, 111, 111, 100, 32, 100, 97, 121, 32, 115, 105, 114, 46, 32, 71, 111, 111, 100, 32, 100, 97, 121, 33, 226, 128, 157, 32, 115, 104, 111, 117, 116, 101, 100, 32, 83, 104, 195, 169, 114, 108, 111, 195, 167, 107, 32, 72, 195, 184, 108, 109, 101, 194, 163, 46, 10, 10, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 226, 128, 156, 87, 104, 121, 32, 110, 111, 116, 32, 90, 111, 105, 100, 98, 101, 114, 103, 63, 226, 128, 157, 32, 113, 117, 101, 114, 105, 101, 100, 32, 90, 111, 105, 100, 98, 101, 114, 103, 46, 10])

As you can see, the output is very different. For one thing, the shell output maintains the whitespace that was in the original file; it's lost in the web version.

As you can see in the code, I also output the documents to files. The generated output is below:

web.txt

"I said good day sir. Good day!" shouted Sh?rlo?k H?lme?.

             "Why not Zoidberg?" queried Zoidberg.

shell.txt

"I said good day sir. Good day!" shouted Shérloçk Hølme£.

             "Why not Zoidberg?" queried Zoidberg.

In the web version, the characters are unrecognized and the encoding is identified by file as ISO-8859. In the shell version, the characters display correctly and the encoding is identified by file as UTF-8.
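The byte values in the two outputs are consistent with what file reports. As a rough check (a Python 3 sketch; the two byte strings are copied from the outputs above, and the variable names are only illustrative), the accented characters are the same in both cases and differ only in how they are encoded:

```python
# Accented fragment as it appears in the web output (one byte per character).
web_bytes = b"Sh\xe9rlo\xe7k H\xf8lme\xa3"
# The same fragment as it appears in the shell output (multi-byte sequences).
shell_bytes = b"Sh\xc3\xa9rlo\xc3\xa7k H\xc3\xb8lme\xc2\xa3"

# Decode each with the encoding that file identified for it.
web_text = web_bytes.decode("iso-8859-1")
shell_text = shell_bytes.decode("utf-8")

# Both decode to the same text, so only the output encoding differs.
print(web_text == shell_text)

# The web bytes are not valid UTF-8, which is why they show up as "?".
try:
    web_bytes.decode("utf-8")
    web_is_utf8 = True
except UnicodeDecodeError:
    web_is_utf8 = False
print(web_is_utf8)
```

The curly quotes in the shell output (\xe2\x80\x9c and \xe2\x80\x9d) have no ISO-8859-1 equivalent, which would fit with the web output containing plain ASCII quotes (byte 34) instead.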

I am at a loss as to why this could be happening. I've checked and both processes are using the same version of antiword. In addition, I've verified that they are both using the same Python module file for subprocess. The version of Python being used in both cases also matches exactly.

Can anyone explain what might be going on?

Recommended answer

The difference is likely due to an environment variable. According to the man page:

Antiword uses the environment variables LC_ALL, LC_CTYPE and LANG (in that order) to get the current locale and uses this information to select the default mapping file.
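One way to compare the two environments is to log which of these variables each process actually sees. The following is a minimal Python 3 sketch; effective_locale is a hypothetical helper (not part of antiword, Django, or the standard library), and skipping empty values is an assumption about how the lookup behaves:

```python
import os


def effective_locale(env=None):
    """Return (name, value) for the first set variable among
    LC_ALL, LC_CTYPE and LANG, checked in that order, or None."""
    env = os.environ if env is None else env
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # assumption: an empty value is treated as unset
            return name, value
    return None


# Run this once in the shell and once inside the Django view,
# then compare the two results.
print(effective_locale())
```

If the shell reports something like a UTF-8 locale while the web process reports a different value or None, that would support the explanation below.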

I suspect that what's happening is that when you run it from your shell, your shell is in a UTF-8 locale, but when you run it from Django, it's in a different locale, and it can't properly convert the Unicode characters. Try switching into a UTF-8 locale when running the subprocess like this:

import os
import subprocess

new_env = dict(os.environ)  # Copy current environment
new_env['LANG'] = 'en_US.UTF-8'
p = subprocess.Popen(..., env=new_env)
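Because the man page lists LC_ALL first, a value of LC_ALL or LC_CTYPE already set by the web server would presumably still win over LANG. A slightly more defensive variant sets LC_ALL as well. This is a Python 3 sketch under stated assumptions: run_with_utf8_locale is a hypothetical helper name, and en_US.UTF-8 is assumed to be a locale that exists on the server.

```python
import os
import subprocess


def run_with_utf8_locale(cmd, locale_name="en_US.UTF-8"):
    """Run cmd with LC_ALL and LANG forced to locale_name and
    return its standard output as bytes."""
    env = dict(os.environ)        # copy, so the parent process is unaffected
    env["LC_ALL"] = locale_name   # checked first, per the man page quote
    env["LANG"] = locale_name     # checked last; set for consistency
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, env=env)
    out, _ = proc.communicate()   # wait for the child and collect stdout
    return out


# Usage with the command from the question (requires antiword):
# out = run_with_utf8_locale(["antiword", "/tmp/test.doc"])
```

Only the child process sees the changed variables, so the rest of the Django application keeps whatever locale Apache gave it.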
