How to wrap UTF-8 encoded C++ std::strings with Swig in C#?


Problem description


My question is nearly identical to this question, except that the linked question deals with char*, whereas I'm using std::string in my code. Like the linked question, I'm also using C# as my target language.

I have a class written in C++:

class MyClass
{
public:
    const std::string get_value() const; // returns utf8-string
    void set_value(const std::string &value); // sets utf8-string
private:
    // ...
};
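For context, here is a minimal sketch of what such a class might look like once filled in. The private member `value_` and the inline bodies are assumptions for illustration only, not the original author's code. The point is that `std::string` is just a byte container: the class stores and returns the UTF-8 bytes unchanged, so any UTF-8 to UTF-16 conversion has to happen at the C++/C# boundary.

```cpp
#include <string>

// Hypothetical completion of MyClass, for illustration only.
class MyClass
{
public:
    // Returns the stored bytes unchanged (UTF-8 by convention).
    const std::string get_value() const { return value_; }

    // Stores the bytes as given; no transcoding happens here.
    void set_value(const std::string &value) { value_ = value; }

private:
    std::string value_;  // assumed member name; holds raw UTF-8 bytes
};
```

On the C++ side a round trip is lossless: a two-character string such as "hé" goes in and comes out as the same three bytes.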

And this gets wrapped by SWIG in C# as follows:

public class MyClass
{
    public string get_value();
    public void set_value(string value);
}

SWIG does everything for me, except that it doesn't perform a UTF-8 to UTF-16 string conversion during the calls to MyClass. My strings come through fine if they are representable in ASCII, but if I pass a string with non-ASCII characters on a round trip through "set_value" and "get_value", I end up with unintelligible characters.
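To make the failure concrete, here is a minimal C++ sketch (not part of the original question) of what happens when UTF-8 bytes are read one byte per character. The helper names `count_code_points` and `misread_as_latin1` are hypothetical, and Latin-1 is only an assumed stand-in for whatever single-byte code page the default string marshaling happens to use on a given platform.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by skipping
// continuation bytes (those of the form 10xxxxxx).
std::size_t count_code_points(const std::string &utf8)
{
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

// Treats every byte as its own Latin-1 character and re-encodes the
// result as UTF-8. This imitates a single-byte interpretation of
// data that was really UTF-8.
std::string misread_as_latin1(const std::string &utf8)
{
    std::string out;
    for (unsigned char c : utf8) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));  // ASCII is unaffected
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

With this sketch, "é" (bytes C3 A9) counts as one code point but occupies two bytes, and misreading it yields the two characters "Ã©". ASCII text passes through unchanged, which matches the symptom described above.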

How can I make SWIG wrap UTF-8 encoded C++ strings in C#? n.b. I'm using std::string, not std::wstring, and not char*.

There's a partial solution on the SWIG sourceforge site, but it deals with char* not std::string, and it uses a (configurable) fixed length buffer.

Solution

With the help (read: genius!) of David Jeske in the linked Code Project article, I have finally been able to answer this question.

You'll need this class (from David Jeske's code) in your C# library.

using System;
using System.Runtime.InteropServices;
using System.Text;

// Marshals C# strings (UTF-16) to and from null-terminated UTF-8 buffers.
// MarshalNativeToManaged uses pointers, so compile with /unsafe.
public class UTF8Marshaler : ICustomMarshaler {
    static UTF8Marshaler static_instance;

    public IntPtr MarshalManagedToNative(object managedObj) {
        if (managedObj == null)
            return IntPtr.Zero;
        if (!(managedObj is string))
            throw new MarshalDirectiveException(
                   "UTF8Marshaler must be used on a string.");

        // GetBytes does not append a terminating null
        byte[] strbuf = Encoding.UTF8.GetBytes((string)managedObj);
        IntPtr buffer = Marshal.AllocHGlobal(strbuf.Length + 1);
        Marshal.Copy(strbuf, 0, buffer, strbuf.Length);

        // write the terminating null
        Marshal.WriteByte(buffer, strbuf.Length, 0);
        return buffer;
    }

    public unsafe object MarshalNativeToManaged(IntPtr pNativeData) {
        // a null pointer maps to a null string
        if (pNativeData == IntPtr.Zero)
            return null;

        byte* walk = (byte*)pNativeData;

        // find the end of the string
        while (*walk != 0) {
            walk++;
        }
        int length = (int)(walk - (byte*)pNativeData);

        // copy the bytes, excluding the terminating null
        byte[] strbuf = new byte[length];
        Marshal.Copy(pNativeData, strbuf, 0, length);
        return Encoding.UTF8.GetString(strbuf);
    }

    public void CleanUpNativeData(IntPtr pNativeData) {
        // frees the buffer allocated in MarshalManagedToNative
        Marshal.FreeHGlobal(pNativeData);
    }

    public void CleanUpManagedData(object managedObj) {
    }

    public int GetNativeDataSize() {
        return -1;
    }

    public static ICustomMarshaler GetInstance(string cookie) {
        if (static_instance == null) {
            static_instance = new UTF8Marshaler();
        }
        return static_instance;
    }
}
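For readers more at home on the C++ side, the native-to-managed direction can be sketched in C++ as well. This is only an illustration of the idea behind `MarshalNativeToManaged` (walk to the terminating null, then decode UTF-8 into UTF-16), not the code the marshaler actually runs. The function name `utf8_to_utf16` is hypothetical, and the sketch assumes well-formed input, whereas `Encoding.UTF8.GetString` handles malformed sequences itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Decodes a null-terminated UTF-8 buffer into UTF-16 code units.
// Assumes well-formed UTF-8; an unexpected lead byte becomes U+FFFD.
std::u16string utf8_to_utf16(const char *native)
{
    std::u16string out;
    if (native == nullptr)
        return out;  // nothing to decode

    // Find the end of the string, like the `walk` loop in the marshaler.
    const std::size_t length = std::strlen(native);

    std::size_t i = 0;
    while (i < length) {
        const unsigned char lead = static_cast<unsigned char>(native[i]);
        std::uint32_t cp = 0;   // code point being assembled
        std::size_t extra = 0;  // continuation bytes still expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else                            { cp = 0xFFFD;      extra = 0; }
        ++i;

        // Fold in six payload bits from each continuation byte.
        for (; extra > 0 && i < length; --extra, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(native[i]) & 0x3F);
        }

        if (cp >= 0x10000) {
            // Outside the BMP: emit a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

The length of the result is measured in UTF-16 code units, so it generally differs from the byte length of the input. This is why the marshaler computes the byte length from the null terminator before decoding.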

Then, in SWIG's C# "std_string.i", on line 24 (the exact line number may vary between SWIG versions) replace this line:

%typemap(imtype) string "string"

with this line:

%typemap(imtype, inattributes="[MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]", outattributes="[return: MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]") string "string"

and on line 61, replace this line:

%typemap(imtype) const string & "string"

with this line:

%typemap(imtype, inattributes="[MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]", outattributes="[return: MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]") const string & "string"

Lo and behold, everything works. Read the linked article for a good understanding of how this works.
