ctypes in python crashes with memset


Question

I am trying to erase a password string from memory, as suggested here.

I wrote this small snippet:

import ctypes, sys

def zerome(string):
    location = id(string) + 20
    size     = sys.getsizeof(string) - 20
    #memset =  ctypes.cdll.msvcrt.memset
    # For Linux, use the following. Change the 6 to whatever it is on your computer.
    print ctypes.string_at(location, size)
    memset =  ctypes.CDLL("libc.so.6").memset
    memset(location, 0, size)
    print "Clearing 0x%08x size %i bytes" % (location, size)
    print ctypes.string_at(location, size)

a = "asdasd"

zerome(a)

Oddly enough, this code works fine with IPython:

[7] oz123@yenitiny:~ $ ipython a.py 
Clearing 0x02275b84 size 23 bytes

But it crashes with Python:

[8] oz123@yenitiny:~ $ python a.py 
Segmentation fault
[9] oz123@yenitiny:~ $

Any ideas why?

I tested on Debian Wheezy, with Python 2.7.3.

The code works on CentOS 6.2 with Python 2.6.6. The code crashed on Debian with Python 2.6.8. I tried to work out why it works on CentOS and not on Debian. The only difference that came to mind immediately is that my Debian is multiarch, while CentOS is running on my older laptop with an i686 CPU.

Hence, I rebooted my CentOS laptop and loaded Debian Wheezy on it. The code works on Debian Wheezy, which is not multiarch. Hence, I suspect my configuration on Debian is somewhat problematic ...

Recommended answer

ctypes already has a memset function, so you don't have to make a function pointer for the libc/msvcrt function. Also, 20 bytes is for common 32-bit platforms. On 64-bit systems it's probably 36 bytes. Here's the layout of a PyStringObject:

typedef struct {
    Py_ssize_t ob_refcnt;         // 4|8 bytes
    struct _typeobject *ob_type;  // 4|8 bytes
    Py_ssize_t ob_size;           // 4|8 bytes
    long ob_shash;                // 4|8 bytes (4 on 64-bit Windows)
    int ob_sstate;                // 4 bytes
    char ob_sval[1];
} PyStringObject; 

So it could be 5*4 = 20 bytes on a 32-bit system, 8*4 + 4 = 36 bytes on 64-bit Linux, or 8*3 + 4*2 = 32 bytes on 64-bit Windows. Since a string isn't tracked with a garbage collection header, you can use sys.getsizeof. In general, if you don't want the GC header size included (in memory it actually sits before the object's base address that you get from id), use the object's __sizeof__ method. At least that's a general rule in my experience.
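The arithmetic above can be written out as a small sketch. The helper name string_header_size and its parameters are hypothetical, introduced only for illustration; each argument is the size in bytes of the corresponding C type on the platform in question.

```python
def string_header_size(ssize_t, pointer, c_long, c_int):
    """Bytes that precede ob_sval in the PyStringObject layout shown above."""
    # ob_refcnt + ob_type + ob_size + ob_shash + ob_sstate
    return ssize_t + pointer + ssize_t + c_long + c_int


print(string_header_size(4, 4, 4, 4))  # 32-bit platform: 20
print(string_header_size(8, 8, 8, 4))  # 64-bit Linux: 36
print(string_header_size(8, 8, 4, 4))  # 64-bit Windows (4-byte long): 32
```

A hard-coded 20 therefore only matches the 32-bit case, which is consistent with the snippet working on the i686 machine and crashing elsewhere.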

What you want is to simply subtract the buffer size from the object size. The string in CPython is null-terminated, so simply add 1 to its length to get the buffer size. For example:

>>> a = 'abcdef'
>>> bufsize = len(a) + 1
>>> offset = sys.getsizeof(a) - bufsize
>>> ctypes.memset(id(a) + offset, 0, bufsize)
3074822964L
>>> a
'\x00\x00\x00\x00\x00\x00'
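As a read-only sanity check of the same subtraction, here is a minimal sketch for Python 3. It assumes CPython (where id returns the object's address) and uses a bytes object, which as far as I know is the closest Python 3 counterpart of the Python 2 str discussed here. The helper name payload_offset is hypothetical.

```python
import ctypes
import sys


def payload_offset(obj):
    """Offset of the character buffer, found by subtracting the buffer size
    (length plus the trailing NUL) from the total object size."""
    bufsize = len(obj) + 1
    return sys.getsizeof(obj) - bufsize


data = bytes(bytearray(b"abcdef"))  # built at runtime
offset = payload_offset(data)

# Nothing is written here: the memory is only read back to confirm
# that the payload really starts at id(data) + offset.
print(ctypes.string_at(id(data) + offset, len(data)))
```

If the printed bytes equal the original object, the computed offset is right for the running interpreter, without hard-coding any header size.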






编辑

A better alternative is to define the PyStringObject structure. This makes it convenient to check ob_sstate. If it's greater than 0, that means the string is interned, and the sane thing to do is raise an exception. Single-character strings are interned, along with string constants in code objects that consist of only ASCII letters and underscores, and also strings used internally by the interpreter for names (variable names, attributes).

from ctypes import *

class PyStringObject(Structure):
    _fields_ = [
      ('ob_refcnt', c_ssize_t),
      ('ob_type', py_object),
      ('ob_size', c_ssize_t),
      ('ob_shash', c_long),
      ('ob_sstate', c_int),
      # ob_sval varies in size
      # zero with memset is simpler
    ]

def zerostr(s):
    """zero a non-interned string"""
    if not isinstance(s, str):
        raise TypeError(
          "expected str object, not %s" % type(s).__name__)

    s_obj = PyStringObject.from_address(id(s))
    if s_obj.ob_sstate > 0:
        raise RuntimeError("cannot zero interned string")

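    s_obj.ob_shash = -1  # not hashed yet
    # ob_sval starts right after ob_sstate; sizeof(PyStringObject) may
    # include trailing padding (e.g. 40 instead of 36 on 64-bit Linux)
    field = PyStringObject.ob_sstate
    offset = field.offset + field.size
    memset(id(s) + offset, 0, len(s))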
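One detail worth checking when locating the start of ob_sval: ctypes.sizeof on a Structure includes any trailing alignment padding, so it can be larger than the end of the last declared field. The following is a small sketch for Python 3; Header is a hypothetical stand-in with the same field types as the structure above, and c_void_p replaces py_object only to keep the example inert.

```python
import ctypes


class Header(ctypes.Structure):
    # Stand-in for the fixed part of PyStringObject (everything before ob_sval).
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
        ("ob_size", ctypes.c_ssize_t),
        ("ob_shash", ctypes.c_long),
        ("ob_sstate", ctypes.c_int),
    ]


# End of the last declared field, which is where ob_sval would begin.
end_of_fields = Header.ob_sstate.offset + Header.ob_sstate.size
# Total structure size, rounded up to the structure's alignment.
padded_size = ctypes.sizeof(Header)

print(end_of_fields, padded_size)
```

On a platform where the two numbers differ, the field-based value is the one that matches the 20/36/32 byte figures given earlier.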

For example:

>>> s = 'abcd' # interned by code object
>>> zerostr(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 10, in zerostr
RuntimeError: cannot zero interned string

>>> s = raw_input() # not interned
abcd
>>> zerostr(s)
>>> s
'\x00\x00\x00\x00'
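The reason for refusing interned strings is that an interned string is a single shared object, so zeroing it would affect every other user of the same text. Here is a minimal sketch of that sharing, written for Python 3, where the function is sys.intern (in Python 2 it is the builtin intern). The variable names are only illustrative.

```python
import sys

# Two equal strings built at runtime, so they are not compile-time constants.
a = "".join(["pass", "word"])
b = "".join(["pass", "word"])
print(a == b)  # True: same contents

# After interning, both names refer to one shared object.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True: overwriting one would overwrite the other
```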
