Perl中的JSON编码/解码utf8字符串 [英] JSON encode/decode utf8 strings in Perl
问题描述
我从命令行将 utf8 编码的字符串传递到Perl程序中:
I pass a utf8 encoded string from my command line into a Perl program:
> ./test.pl --string='ḷet ūs try ṭhiñgs'
似乎可以正确识别字符串:
which seems to recognize the string correctly:
use utf8;
GetOptions(
'string=s' => \$string,
) or die;
print Dumper($string);
print Dumper(utf8::is_utf8($string));
print Dumper(utf8::valid($string));
打印
$VAR1 = 'ḷet ūs try ṭhiñgs';
$VAR1 = '';
$VAR1 = 1;
当我将此字符串存储到哈希中并调用 encode_json 在其上,该字符串似乎再次被 编码,而
When I store this string into a hash and call encode_json on it, the string seems to be again encoded whereas to_json seems to work (if I read the output correctly):
my %a = ( 'nāme' => $string ); # Note the Unicode character
print Dumper(\%a);
print Dumper(encode_json(\%a));
print Dumper(to_json(\%a));
打印
$VAR1 = {
"n\x{101}me" => 'ḷet ūs try ṭhiñgs'
};
$VAR1 = '{"nāme":"ḷet Å«s try á¹hiñgs"}';
$VAR1 = "{\"n\x{101}me\":\"\x{e1}\x{b8}\x{b7}et \x{c5}\x{ab}s try \x{e1}\x{b9}\x{ad}hi\x{c3}\x{b1}gs\"}";
但是,将其转换回原始哈希值似乎不适用于这两种方法,并且在两种情况下都无法实现哈希,字符串和破损:
Turning this back into the original hash, however, doesn't seem to work with either methods and in both cases hash and string and broken:
print Dumper(decode_json(encode_json(\%a)));
print Dumper(from_json(to_json(\%a)));
打印
$VAR1 = {
"n\x{101}me" => "\x{e1}\x{b8}\x{b7}et \x{c5}\x{ab}s try \x{e1}\x{b9}\x{ad}hi\x{c3}\x{b1}gs"
};
$VAR1 = {
"n\x{101}me" => "\x{e1}\x{b8}\x{b7}et \x{c5}\x{ab}s try \x{e1}\x{b9}\x{ad}hi\x{c3}\x{b1}gs"
};
哈希查找$a{'nāme'}
现在失败.
问题:如何在Perl中正确处理utf8编码和字符串以及JSON编码/解码?
Question: How do I handle utf8 encoding and strings and JSON encode/decode correctly in Perl?
推荐答案
您需要对输入内容进行解码:
You need to decode your input:
use Encode;
my $string;
GetOptions('string=s' => \$string) or die;
$string = decode('UTF-8', $string);
将所有内容放在一起,我们得到:
Putting it all together, we get:
use strict;
use warnings;
use 5.012;
use utf8;
use Encode;
use Getopt::Long;
use JSON;
my $string;
GetOptions('string=s' => \$string) or die;
$string = decode('UTF-8', $string);
my %hash = ('nāme' => $string);
my $json = encode_json(\%hash);
my $href = decode_json($json);
binmode(STDOUT, ':encoding(utf8)');
say $href->{nāme};
示例:
$ perl test.pl --string='ḷet ūs try ṭhiñgs'
ḷet ūs try ṭhiñgs
确保您的源文件实际上已编码为UTF-8!
Make sure your source file is actually encoded as UTF-8!
这篇关于Perl中的JSON编码/解码utf8字符串的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!