What is the encoding of e6 b5 8b e8 af 95
Problem description
I have a source of text data that includes the byte sequence e6 b5 8b e8 af 95. In context, I believe it should be the Chinese text "测试".
My Perl source code is supposed to pick up this byte sequence (unfortunately this is not in UTF-8 and I cannot encode it to UTF-8 and decode back), but under some circumstances the sequence becomes c3 a6 c2 b5 c2 8b c3 a8 c2 af c2 95.
I am trying to figure out the likely reasons why the c3 and c2 bytes are added. Is it the double-conversion problem mentioned in similar questions?
Recommended answer
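06D4B 08BD5 (U+6D4B U+8BD5) are the Unicode code points of 测试.
E6 B5 8B E8 AF 95 is the UTF-8 encoding of 测试.
C3 A6 C2 B5 C2 8B C3 A8 C2 AF C2 95 is the UTF-8 encoding of the UTF-8 encoding of 测试.
So yes, this is double encoding: the UTF-8 bytes were treated as characters (U+00E6, U+00B5, ...) and encoded as UTF-8 a second time.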
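$ perl -e'
use strict;
use warnings;
use utf8;                              # The source code itself is UTF-8.
use open ":std", ":encoding(UTF-8)";   # Encode output on STDOUT/STDERR as UTF-8.
my $s = "测试";
print "$s\n";
printf "%v05X\n", $s;   # Code points of the decoded string.
utf8::encode($s);       # First encoding: characters to UTF-8 bytes.
printf "%v02X\n", $s;
utf8::encode($s);       # Second encoding: each byte is treated as a character.
printf "%v02X\n", $s;
'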
测试
06D4B.08BD5
E6.B5.8B.E8.AF.95
C3.A6.C2.B5.C2.8B.C3.A8.C2.AF.C2.95
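To see where the C3 and C2 bytes come from, it can help to re-encode the original bytes one at a time. The following is a minimal sketch; `reencode_byte` is a hypothetical helper introduced here for illustration, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: treat one byte as a character and encode it as UTF-8.
sub reencode_byte {
    my ($byte) = @_;
    my $copy = $byte;
    utf8::encode($copy);    # A code point in 0x80..0xFF becomes two UTF-8 bytes.
    return join " ", map { sprintf "%02X", ord($_) } split //, $copy;
}

# The UTF-8 bytes from the question.
my $bytes = "\xE6\xB5\x8B\xE8\xAF\x95";

for my $byte (split //, $bytes) {
    printf "%02X -> %s\n", ord($byte), reencode_byte($byte);
}
```

Bytes in the range 80..BF come out as C2 followed by the same byte, and bytes in the range C0..FF come out as C3 followed by the byte minus 0x40, which is why E6 becomes C3 A6 and B5 becomes C2 B5.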
$ perl -MJSON -e'
use strict;
use warnings;
use utf8;
use open ":std", ":encoding(UTF-8)";
my $s = "测试";
printf "%1\$s (%1\$v05X)\n", $s;
my $data = [ $s ];
my $json_utf8 = JSON->new->utf8->encode($data);   # Characters to UTF-8 encoded JSON.
printf "%v02X\n", $json_utf8;
$data = JSON->new->utf8->decode($json_utf8);      # UTF-8 encoded JSON back to characters.
$s = $data->[0];
printf "%1\$s (%1\$v05X)\n", $s;
'
测试 (06D4B.08BD5)
5B.22.E6.B5.8B.E8.AF.95.22.5D
测试 (06D4B.08BD5)
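If the data has already arrived double-encoded, decoding twice should recover the text. This is only a sketch under the assumption that the whole string really went through exactly two rounds of UTF-8 encoding; `undo_double_encoding` is a hypothetical helper name, and fixing the place where the second encoding happens is usually preferable to repairing afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a string that was UTF-8 encoded twice.
sub undo_double_encoding {
    my ($bytes) = @_;
    my $s = $bytes;
    # The first pass removes the extra layer and leaves the original UTF-8 bytes.
    utf8::decode($s) or die "not valid UTF-8 (outer layer)\n";
    # The second pass turns those bytes into characters.
    utf8::decode($s) or die "not valid UTF-8 (inner layer)\n";
    return $s;
}

# The double-encoded sequence from the question.
my $double = "\xC3\xA6\xC2\xB5\xC2\x8B\xC3\xA8\xC2\xAF\xC2\x95";

my $text = undo_double_encoding($double);
printf "%vX\n", $text;    # prints 6D4B.8BD5
```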