Why do printed characters disappear in this python code?


Problem description

(This question arose from an attempt to get around this problem.)

I'm trying to print a list of dictionaries in python. Since I can't find a real function which is able to convert a python object to a string (no, json.dumps doesn't work), I thought to write a simple printing script.

Unfortunately, characters at the beginning of the line simply disappear... Now, I'm probably no expert in python, but this behavior looks like nonsense to me.

# The out object is returned by a library (rekall) 
# and it is a list of dictionaries.
import rekall
out = rekall.a_modified_module.calculate()

print '[',
for ps in out:
    first = True
    print '{',
    for info in ps:
        if first:
            first = False
        else:
            print '\'%s\':\'%s\',' % (info, ps[info]),
    print '}',
print ']'

I would expect the output to be:

[{'pid':'2040', 'name':'leon.exe', 'offset':'2234185984',}]

Instead I get this:

'pid':'2040', 'name':'leon.exe', 'offset':'2234185984',}]

Can you please explain to me what's happening here? (I'm skipping the first line in the loop because it contains another dictionary and the output gets even crazier, with mixed parts of the output.)

P.S.: if you have a valid option for printing a generic python object (something comparable to JSON.stringify in javascript, but without having to deal with JSON objects) please tell me.

EDIT: My question aims at explaining this strange (to me) behavior, where the output depends on what is printed after the brackets. In fact, if I remove the inner for loop ("for info in ps"), the initial brackets are printed correctly. Also, if I create a pipe to send the output to another program, that program will receive the output correctly, starting from the brackets.

EDIT: To help in understanding the nature of the problem and the type of the 'out' object, this is the output using the 'pprint' module:

[{'name':  [String:ImageFileName]: 'leon.exe\x00',
  'offset': 2236079360,
  'pid':  [unsigned int:UniqueProcessId]: 0x000007FC,
  'psscan': {'CSRSS': False,
             'Handles': False,
             'PsActiveProcessHead': True,
             'PspCidTable': True,
             'Sessions': True}}]

Solution

Python objects have two methods used to get a quick human-readable representation of their data: str, which gives a nicely printable representation of an object, and repr, which attempts to give a string that could be used to rebuild the object. As the documentation for repr puts it: "For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval()." Heavy emphasis on "attempts". Classes are free to override the default implementation with their own __str__ and __repr__ methods.

Your example output:

'name':  [String:ImageFileName]: 'leon.exe\x00'

is interesting. It shows that the rekall module is overriding __repr__ to give a more complex view of its data types ([String:ImageFileName]:). But that's not valid python - the implementors were just giving a more typeful description. It also shows that its strings 'leon.exe\x00' have non-printable characters in them. It means that, in this instance, a NUL \x00 is emitted when printing the string value of the data. I would call this a bug - but it could be that the module is supposed to emit raw binary data.
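To make the str/repr split concrete, here is a minimal sketch. TypedString is a made-up class that only imitates the output shown in the question; it is not rekall's actual implementation, and how rekall really builds these strings is an assumption.

```python
class TypedString(object):
    """Hypothetical stand-in for a library type that wraps a string value."""

    def __init__(self, type_name, value):
        self.type_name = type_name
        self.value = value

    def __str__(self):
        # str() hands back the raw value, control characters included.
        return self.value

    def __repr__(self):
        # repr() gives a descriptive view that is not valid Python source.
        return " [String:%s]: %r" % (self.type_name, self.value)


name = TypedString("ImageFileName", "leon.exe\x00")
print(repr(name))        # the descriptive view, as pprint would show it
print(repr(str(name)))   # the raw value with the NUL escaped
```

Printing str(name) directly would send the NUL byte to the console, while wrapping it in repr shows it as a visible \x00 escape.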

Non-printable characters may be used for formatting by your console. For instance, \r (carriage return) tells the console to reposition at the start of the line and overwrite characters:

>>> print 'foo\rbar'
bar

On my console, this escape sequence

>>> print '\x1b[0;31;40m hello'
hello

makes "hello" print in red.

If rekall is putting out raw binary data, the strings you are trying to print have non-printable characters that mess up your console display. To complicate things further, the rekall module may be checking whether its stdout is a terminal and changing its output to add fancy terminal-oriented formatting to its strings.

Assuming that rekall is putting raw binary data in strings, you could use str to get rid of the rekall metadata and then repr to escape the troublesome characters:

def mystr(s):
    return repr(str(s))

for ps in out:
    first = True
    for info in ps:
        if first:
            first = False
        else:
            # repr() already adds the quotes, so the format string needs none
            print '%s:%s' % (mystr(info), mystr(ps[info]))

Or write your own function to filter out characters you don't want. This is kinda difficult in Unicode, but for ASCII text we could take a subset of the characters you would find in string.printable.

printable = set(
    '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$'
    '%&\\\'()*+,-./:;<=>?@[\\]^_`{|}~ \t')

def mystr(s):
    return ''.join(filter(printable.__contains__, str(s)))

for ps in out:
    first = True
    for info in ps:
        if first:
            first = False
        else:
            print '\'%s\':\'%s\'' % (mystr(info), mystr(ps[info]))
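For what it's worth, on Python 3 the Unicode case gets easier because str has an isprintable() method. The following is a sketch under that assumption (the code above is Python 2, where this method doesn't exist), and strip_unprintable is just an illustrative name.

```python
def strip_unprintable(value):
    """Return str(value) with every non-printable character removed."""
    return "".join(ch for ch in str(value) if ch.isprintable())


print(strip_unprintable("leon.exe\x00"))
print(strip_unprintable("\x1b[0;31;40m hello"))
```

Note that isprintable() treats tab and newline as non-printable, so they are dropped too, unlike in the hand-built set above, which keeps the tab. It also removes only the ESC byte of a colour sequence and leaves the rest ("[0;31;40m") behind as ordinary text.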
