What is the encoding of e6 b5 8b e8 af 95


Problem description

I have a source of text data that includes the byte sequence e6 b5 8b e8 af 95. From the context, I believe it should be the Chinese text "测试".

My Perl source code is supposed to pick up this byte sequence (unfortunately it is not in UTF-8, and I cannot encode it to UTF-8 and decode it back), but under some circumstances the sequence becomes c3 a6 c2 b5 c2 8b c3 a8 c2 af c2 95.

I am trying to figure out the likely reasons why c3 and c2 are added. Is it the double-conversion problem mentioned in a similar question?

Recommended answer

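06d4b 08bd5 are the Unicode code points of 测试.

e6 b5 8b e8 af 95 is the UTF-8 encoding of 测试.

c3 a6 c2 b5 c2 8b c3 a8 c2 af c2 95 is the UTF-8 encoding of the UTF-8 encoding of 测试. In other words, the six UTF-8 bytes were treated as six characters (U+00E6, U+00B5, and so on) and encoded to UTF-8 a second time.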

$ perl -e'
    use strict;
    use warnings;
    use utf8;
    use open ":std", ":encoding(UTF-8)";

    my $s = "测试";
    print "$s\n";
    printf "%v05X\n", $s;

    utf8::encode($s);
    printf "%v02X\n", $s;

    utf8::encode($s);
    printf "%v02X\n", $s;
'
测试
06D4B.08BD5
E6.B5.8B.E8.AF.95
C3.A6.C2.B5.C2.8B.C3.A8.C2.AF.C2.95
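
To see where the extra c3 and c2 bytes come from, the second encoding step can be examined one byte at a time. The sketch below is only illustrative: hex_dump is a hypothetical helper, not part of the original answer, and the last few lines assume the data was encoded exactly twice, in which case two decodes recover the characters.

```perl
use strict;
use warnings;

# The six bytes from the question, written as a plain byte string.
my $bytes = "\xE6\xB5\x8B\xE8\xAF\x95";

# Hypothetical helper: format a string as space-separated hex values.
sub hex_dump { join ' ', map { sprintf '%02x', ord } split //, shift }

# Encoding an already-encoded byte string treats each byte as a code point.
# Bytes 80..BF gain a C2 prefix; bytes C0..FF become C3 plus (byte - 0x40).
for my $byte (split //, $bytes) {
    my $again = $byte;
    utf8::encode($again);
    printf "%s -> %s\n", hex_dump($byte), hex_dump($again);
}

# Undoing it: the first decode recovers the original six bytes,
# the second decode turns those bytes into the two characters.
my $double = $bytes;
utf8::encode($double);
my $chars = $double;
utf8::decode($chars);
utf8::decode($chars);
printf "%s\n%s\n", hex_dump($double), hex_dump($chars);
```

Decoding twice is a repair for data that is already damaged; the cleaner fix is to find the place where the string is encoded a second time and remove it.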


$ perl -MJSON -e'
    use strict;
    use warnings;
    use utf8;
    use open ":std", ":encoding(UTF-8)";

    my $s = "测试";
    printf "%1\$s (%1\$v05X)\n", $s;
    my $data = [ $s ];
    my $json_utf8 = JSON->new->utf8->encode($data);
    printf "%v02X\n", $json_utf8;
    $data = JSON->new->utf8->decode($json_utf8);
    $s = $data->[0];
    printf "%1\$s (%1\$v05X)\n", $s;
'
测试 (06D4B.08BD5)
5B.22.E6.B5.8B.E8.AF.95.22.5D
测试 (06D4B.08BD5)
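
One common way to end up with the doubled sequence is to write bytes that are already UTF-8 through a file handle that has an :encoding(UTF-8) layer. Whether that is what happens in the asker's program is an assumption; the sketch below only reproduces the effect, using an in-memory file in place of STDOUT and a hypothetical helper named through_utf8_layer.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: write a string through a UTF-8 encoding layer
# into an in-memory file and return the resulting bytes as hex.
sub through_utf8_layer {
    my ($string) = @_;
    my $buffer = '';
    open my $out, '>:encoding(UTF-8)', \$buffer or die "open failed: $!";
    print {$out} $string;
    close $out or die "close failed: $!";
    return join ' ', map { sprintf '%02x', ord } split //, $buffer;
}

my $chars = "\x{6D4B}\x{8BD5}";         # the two characters, as code points
my $bytes = encode('UTF-8', $chars);    # already-encoded UTF-8 bytes

# Characters in, encoded once on the way out.
print through_utf8_layer($chars), "\n";

# Encoded bytes in, encoded a second time on the way out.
print through_utf8_layer($bytes), "\n";
```

The general rule this illustrates is to decode input once at the boundary, work with characters inside the program, and encode once on output, either explicitly or through a layer, but not both.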
