How to work with UTF-16 in Python ctypes?

Question

I have a foreign C library that uses UTF-16 throughout its API: in function arguments, return values and structure members.

On Windows this works with ctypes.c_wchar_p, but on OS X ctypes uses a 4-byte c_wchar (UCS-4/UTF-32), and I could not find a way to support UTF-16.
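The platform difference can be checked directly. The following is a minimal sketch, not part of the original question: the helper name utf16le_buffer is hypothetical, and it assumes the library expects little-endian UTF-16 with a two-byte NUL terminator.

```python
import ctypes

# Size of wchar_t as seen by ctypes: 2 on Windows (UTF-16 code units),
# typically 4 on OS X and Linux (UTF-32), which is why c_wchar_p only
# matches a UTF-16 API on Windows.
WCHAR_SIZE = ctypes.sizeof(ctypes.c_wchar)


def utf16le_buffer(text):
    """Return a ctypes char array holding text as NUL-terminated UTF-16LE.

    Hypothetical helper for illustration; the explicit b'\\x00\\x00' is the
    two-byte terminator a UTF-16 C API would normally expect.
    """
    data = text.encode('utf-16-le') + b'\x00\x00'
    return (ctypes.c_char * len(data)).from_buffer_copy(data)


buf = utf16le_buffer(u'WB')
print(WCHAR_SIZE, buf.raw)
```

A buffer built this way has the same layout on every platform, regardless of how wide wchar_t is.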

Here is what I have found so far:

  1. Subclass _SimpleCData and redefine _check_retval_.

  • It allows a transparent conversion of UTF-16 to a Python string.
  • It can be placed in a C structure as a member.
  • But it cannot handle strings passed as arguments, because its from_param() method is never called (why?): func('str', b'W\x00B\x00\x00\x00') # passed without conversion

  2. Use a custom type with a from_param() method.

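  • Pros: an instance can be built with the constructor, or a plain Python string can be passed to the function and encoded on the fly.
  • Cons: it cannot be used as a function return type or as a structure member.

For example: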

ustr = myutf16('hello')
func(ustr)
func('hello')   # calls myutf16.from_param('hello')
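The question does not show how myutf16 is defined. The following is a minimal sketch of such an argument-only type, assuming Python 3 and little-endian UTF-16. The class name Utf16Arg and its raw attribute are hypothetical. It relies on two documented ctypes hooks: any object with a from_param() class method may appear in argtypes, and an object with an _as_parameter_ attribute may be passed to a foreign function.

```python
import ctypes


class Utf16Arg(object):
    """Hypothetical argument-only type that encodes str as UTF-16LE."""

    def __init__(self, text):
        # Keep the encoded bytes alive for as long as this object lives.
        self.raw = text.encode('utf-16-le') + b'\x00\x00'
        # ctypes passes this pointer when the object is used as an argument.
        self._as_parameter_ = ctypes.c_char_p(self.raw)

    @classmethod
    def from_param(cls, obj):
        # Called by ctypes for each argument when cls is listed in argtypes.
        if isinstance(obj, cls):
            return obj
        if isinstance(obj, str):
            return cls(obj)
        raise TypeError('expected str or Utf16Arg, got %r' % type(obj))


converted = Utf16Arg.from_param('WB')
print(converted.raw)
```

Because this class is not a ctypes data type, it shares the limitation listed under the cons: it cannot serve as a restype or as a structure field.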

Recommended answer

You can override from_param in a c_char_p subclass to encode a unicode string as UTF-16, and add a _check_retval_ method to decode a UTF-16 result back into a unicode string. For struct fields, use a descriptor class that handles getting and setting the attribute: make the field itself a private _name of type c_char_p, and expose the descriptor as the public name. For example:

import sys
import ctypes

if sys.version_info[0] > 2:
    unicode = str

def decode_utf16_from_address(address, byteorder='little',
                              c_char=ctypes.c_char):
    if not address:
        return None
    if byteorder not in ('little', 'big'):
        raise ValueError("byteorder must be either 'little' or 'big'")
    chars = []
    while True:
        c1 = c_char.from_address(address).value
        c2 = c_char.from_address(address + 1).value
        if c1 == b'\x00' and c2 == b'\x00':
            break
        chars += [c1, c2]
        address += 2
    if byteorder == 'little':
        return b''.join(chars).decode('utf-16le')
    return b''.join(chars).decode('utf-16be')

class c_utf16le_p(ctypes.c_char_p):
    def __init__(self, value=None):
        super(c_utf16le_p, self).__init__()
        if value is not None:
            self.value = value

    @property
    def value(self,
              c_void_p=ctypes.c_void_p):
        addr = c_void_p.from_buffer(self).value
        return decode_utf16_from_address(addr, 'little')

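    @value.setter
    def value(self, value,
              c_char_p=ctypes.c_char_p):
        # A bytes object already ends with one implicit NUL byte, so a
        # single explicit b'\x00' completes the 2-byte UTF-16 terminator.
        value = value.encode('utf-16le') + b'\x00'
        c_char_p.value.__set__(self, value)

    @classmethod
    def from_param(cls, obj):
        # Encode unicode arguments; anything else goes to c_char_p as-is.
        if isinstance(obj, unicode):
            obj = obj.encode('utf-16le') + b'\x00'
        return super(c_utf16le_p, cls).from_param(obj)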

    @classmethod
    def _check_retval_(cls, result):
        return result.value

class UTF16LEField(object):
    def __init__(self, name):
        self.name = name

    def __get__(self, obj, cls,
                c_void_p=ctypes.c_void_p,
                addressof=ctypes.addressof):
        if obj is None:
            # Accessed on the class rather than on an instance.
            return self
        field_addr = addressof(obj) + getattr(cls, self.name).offset
        addr = c_void_p.from_address(field_addr).value
        return decode_utf16_from_address(addr, 'little')

    def __set__(self, obj, value):
        value = value.encode('utf-16le') + b'\x00'
        setattr(obj, self.name, value)

Example:

if __name__ == '__main__':
    class Test(ctypes.Structure):
        _fields_ = (('x', ctypes.c_int),
                    ('y', ctypes.c_void_p),
                    ('_string', ctypes.c_char_p))
        string = UTF16LEField('_string')

    print('test 1: structure field')
    t = Test()
    t.string = u'eggs and spam'
    print(t.string)

    print('test 2: parameter and result')
    result = None

    @ctypes.CFUNCTYPE(c_utf16le_p, c_utf16le_p)
    def testfun(string):
        global result
        print('parameter: %s' % string.value)
        # callbacks leak memory except for simple return
        # values such as an integer address, so return the
        # address of a global variable.
        result = c_utf16le_p(string.value + u' and eggs')
        return ctypes.c_void_p.from_buffer(result).value

    print('result: %s' % testfun(u'spam'))

Output:

test 1: structure field
eggs and spam
test 2: parameter and result
parameter: spam
result: spam and eggs
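The decoding helper in the answer reads one byte at a time. As a possible variation (not from the original answer), the sketch below scans 16-bit code units for the terminator and then copies the whole string with ctypes.string_at. The function name utf16le_at is hypothetical, and the sketch assumes the address is 2-byte aligned and the data is little-endian.

```python
import ctypes


def utf16le_at(address):
    """Decode a NUL-terminated UTF-16LE string stored at address.

    Hypothetical alternative to the byte-wise loop; returns None for a
    NULL pointer.
    """
    if not address:
        return None
    units = 0
    # Comparing a 16-bit unit with 0 does not depend on byte order.
    while ctypes.c_uint16.from_address(address + 2 * units).value != 0:
        units += 1
    # Copy exactly the bytes before the terminator, then decode them.
    return ctypes.string_at(address, 2 * units).decode('utf-16-le')


# Round trip, including a character outside the BMP (a surrogate pair).
sample = u'spam \U0001F600'
data = sample.encode('utf-16-le') + b'\x00\x00'
buf = (ctypes.c_char * len(data)).from_buffer_copy(data)
decoded = utf16le_at(ctypes.addressof(buf))
```

Scanning whole code units means a surrogate pair is never mistaken for a terminator, since neither half of a pair is zero.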
