Conversion in .NET: Native UTF-8 <-> Managed String


Question

I created these two methods to convert native UTF-8 strings (char*) into managed strings and vice versa. The following code does the job:

public IntPtr NativeUtf8FromString(string managedString)
{
    byte[] buffer = Encoding.UTF8.GetBytes(managedString); // not null terminated
    Array.Resize(ref buffer, buffer.Length + 1);
    buffer[buffer.Length - 1] = 0; // terminating 0
    IntPtr nativeUtf8 = Marshal.AllocHGlobal(buffer.Length);
    Marshal.Copy(buffer, 0, nativeUtf8, buffer.Length);
    return nativeUtf8;
}

string StringFromNativeUtf8(IntPtr nativeUtf8)
{
    int size = 0;
    byte[] buffer = {};
    do
    {
        ++size;
        Array.Resize(ref buffer, size);
        Marshal.Copy(nativeUtf8, buffer, 0, size);
    } while (buffer[size - 1] != 0); // till 0 termination found

    if (1 == size)
    {
        return ""; // empty string
    }

    Array.Resize(ref buffer, size - 1); // remove terminating 0
    return Encoding.UTF8.GetString(buffer);
}

While NativeUtf8FromString is OK, StringFromNativeUtf8 is a mess, but it is the only safe code I could get to run. With unsafe code I could use a byte*, but I do not want unsafe code. Can anyone think of another way, one where I do not have to re-copy the string for every byte it contains just to find the 0 terminator?

For reference, here is the unsafe version:

public unsafe string StringFromNativeUtf8(IntPtr nativeUtf8)
{
    byte* bytes = (byte*)nativeUtf8.ToPointer();
    int size = 0;
    while (bytes[size] != 0)
    {
        ++size;
    }
    byte[] buffer = new byte[size];
    Marshal.Copy((IntPtr)nativeUtf8, buffer, 0, size);
    return Encoding.UTF8.GetString(buffer);
}

As you can see, it is not ugly; it just needs unsafe.

Recommended answer

Just perform the same operation that strlen() performs: read one byte at a time with Marshal.ReadByte until you reach the terminating 0, then copy the bytes once. Do consider keeping the buffer around, since this code generates garbage in a hurry.

    public static IntPtr NativeUtf8FromString(string managedString) {
        int len = Encoding.UTF8.GetByteCount(managedString);
        byte[] buffer = new byte[len + 1]; // the extra byte stays 0 and serves as the terminator
        Encoding.UTF8.GetBytes(managedString, 0, managedString.Length, buffer, 0);
        IntPtr nativeUtf8 = Marshal.AllocHGlobal(buffer.Length); // caller must release with Marshal.FreeHGlobal
        Marshal.Copy(buffer, 0, nativeUtf8, buffer.Length);
        return nativeUtf8;
    }

    public static string StringFromNativeUtf8(IntPtr nativeUtf8) {
        int len = 0;
        while (Marshal.ReadByte(nativeUtf8, len) != 0) ++len; // same scan as strlen()
        byte[] buffer = new byte[len];
        Marshal.Copy(nativeUtf8, buffer, 0, buffer.Length); // single copy, without the terminator
        return Encoding.UTF8.GetString(buffer);
    }
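
The advice about keeping the buffer around could look something like the sketch below. This is not part of the original answer: the class name Utf8Interop, the Rent helper, the 256-byte minimum size and the use of [ThreadStatic] are all assumptions made for illustration. The idea is simply to reuse one scratch array per thread instead of allocating a new byte[] on every call.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper class; the name and layout are illustrative only.
public static class Utf8Interop
{
    // Scratch buffer reused across calls. [ThreadStatic] gives each thread
    // its own copy, so concurrent callers do not overwrite each other.
    [ThreadStatic] private static byte[] scratch;

    // Returns a buffer of at least minLength bytes, growing it only when needed.
    private static byte[] Rent(int minLength)
    {
        if (scratch == null || scratch.Length < minLength)
            scratch = new byte[Math.Max(minLength, 256)];
        return scratch;
    }

    public static IntPtr NativeUtf8FromString(string managedString)
    {
        int len = Encoding.UTF8.GetByteCount(managedString);
        byte[] buffer = Rent(len + 1);
        Encoding.UTF8.GetBytes(managedString, 0, managedString.Length, buffer, 0);
        buffer[len] = 0; // the reused buffer may hold stale data, so terminate explicitly
        IntPtr nativeUtf8 = Marshal.AllocHGlobal(len + 1);
        Marshal.Copy(buffer, 0, nativeUtf8, len + 1);
        return nativeUtf8; // caller must release with Marshal.FreeHGlobal
    }

    public static string StringFromNativeUtf8(IntPtr nativeUtf8)
    {
        int len = 0;
        while (Marshal.ReadByte(nativeUtf8, len) != 0) ++len; // strlen-style scan
        byte[] buffer = Rent(len);
        Marshal.Copy(nativeUtf8, buffer, 0, len);
        // Decode only the first len bytes; the rest of the buffer is scratch space.
        return Encoding.UTF8.GetString(buffer, 0, len);
    }
}
```

A caller would pair each NativeUtf8FromString call with Marshal.FreeHGlobal, ideally in a finally block, since the returned pointer refers to unmanaged memory that the garbage collector never reclaims. Because the scratch buffer is no longer freshly zeroed, the terminator has to be written explicitly, and the decode has to be limited to the first len bytes.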
