Convert utf8 to latin1 in PHP. All characters above 255 convert to char references
Problem description
I need to convert UTF-8 text into text encoded in ISO-8859-1, such that any character that is not part of the ISO-8859-1 set is turned into a numeric character reference (e.g. &#946; for β).
Example: I want to turn
hello é β 水
into
hello é &#946; &#27700;
I am doing all of this in PHP. I tried built-in functions, iconv, tidy, and combinations of those, and I still can't get a reliable solution.
This is what I have so far:
// convert any characters found in the entity table into HTML entities
// do not double encode entities, do not mess with quotes
// use UTF-8 as character encoding because the page submits UTF-8
$str = htmlentities($str,ENT_NOQUOTES,'UTF-8',false);
//print $str."\n";
// convert text from UTF-8 to ISO-8859-1,
// characters that cannot be converted will be converted to ?
$str = utf8_decode($str);
//print $str."\n";
// make string XML valid.
// mainly it converts text entities into numeric entities.
$opts = array( "output-xhtml" => true,
"output-xml" => true,
"show-body-only" => true,
"numeric-entities" => true,
"wrap" => 0,
"indent" => false,
"char-encoding" => 'latin1'
);
$tidy = tidy_parse_string($str, $opts,'latin1');
tidy_clean_repair($tidy);
$str = tidy_get_output($tidy);
//print $str."\n";
Recommended answer
You'll need multibyte support. In particular, mb_encode_numericentity():
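// convmap entries are (start, end, offset, mask): encode U+0100..U+FFFF as &#N;
$convmap = array(0x0100, 0xFFFF, 0, 0xFFFF);
$encutf = mb_encode_numericentity($utf, $convmap, 'UTF-8');
// the remaining characters (up to U+00FF) map directly onto ISO-8859-1
$iso = utf8_decode($encutf);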
(This doesn't touch <, &, " etc., so you may also need htmlspecialchars() beforehand.)
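To make the order of the two steps concrete, here is a minimal sketch that wraps the approach in one function. The function name utf8_to_latin1_refs and the $escapeMarkup flag are hypothetical and not part of the original answer. Two deviations from the answer's snippet are assumptions of this sketch: the convmap is widened to U+10FFFF so that characters outside the Basic Multilingual Plane are also turned into references, and mb_convert_encoding() stands in for utf8_decode(), which is deprecated as of PHP 8.2. The mbstring extension is assumed to be loaded.

```php
<?php
// Hypothetical helper (name and signature are illustrative only).
// Converts UTF-8 text to ISO-8859-1, replacing every character above
// U+00FF with a decimal numeric character reference such as &#946;.
function utf8_to_latin1_refs(string $utf8, bool $escapeMarkup = true): string
{
    if ($escapeMarkup) {
        // Escape <, >, &, and quotes first, so that the "&" of the
        // references generated below is not escaped a second time.
        $utf8 = htmlspecialchars($utf8, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    }

    // convmap entries are (start, end, offset, mask).
    // Assumption of this sketch: cover U+0100..U+10FFFF, not only the BMP.
    $convmap = [0x0100, 0x10FFFF, 0, 0x1FFFFF];
    $encoded = mb_encode_numericentity($utf8, $convmap, 'UTF-8');

    // Only U+0000..U+00FF is left, and each of those code points has a
    // single-byte ISO-8859-1 equivalent.
    return mb_convert_encoding($encoded, 'ISO-8859-1', 'UTF-8');
}

// Usage: the result is a Latin-1 byte string, so send or store it with
// an ISO-8859-1 charset declaration.
$iso = utf8_to_latin1_refs("hello \u{E9} \u{3B2} \u{6C34}");
```

With $escapeMarkup left on, character references that are already present in the input are escaped as well (their & becomes &amp;). Pass false if the input is known to be markup that should go through untouched.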