Is it possible to convert a string containing "high" unicode chars to an array consisting of dec values derived from utf-32 ("real") codes?

Views: 112 | Posted: 2018/6/21 12:37:50 | Tags: javascript, html, utf-8, character-encoding, utf-32


Question

Please look at this script, which operates on a (theoretically possible) string:

<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title></title>
<script src="jquery.js"></script>
<script>
$(function () {
    $("#click").click(function () {
        var txt = $('#high-unicode').text();
        var codes = '';
        for (var i = 0; i < txt.length; i++) {
            if (i > 0) codes += ',';
            codes += txt.charCodeAt(i);
        }
        alert(codes);
    });
});
</script>
</head>
<body>
<span id="click">click</span><br />
<span id="high-unicode">&#x1D465;<!-- mathematical italic small x -->&#xF31E0;<!-- some character from Supplementary Private Use Area-A -->A<!-- character A -->&#x108171;<!-- some character from Supplementary Private Use Area-B --></span>
</body>
</html>

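Instead of "55349,56421,56204,56800,65,56288,56689", is it possible to get "119909,995808,65,1081713"? I've read more-utf-32-aware-javascript-string and the FAQ entries "Q: What's the algorithm to convert from UTF-16 to character codes?" and "Q: Isn't there a simpler way to do this?" from unicode.org/faq/utf_bom, but I'm not sure how to use this information.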

Answer

It looks like you have to decode the surrogate pairs manually. For example:

function decodeUnicode(str) {
    var r = [], i = 0;
    while (i < str.length) {
        var chr = str.charCodeAt(i++);
        if (chr >= 0xD800 && chr <= 0xDBFF) {
            // surrogate pair
            var low = str.charCodeAt(i++);
            r.push(0x10000 + ((chr - 0xD800) << 10) | (low - 0xDC00));
        } else {
            // ordinary character
            r.push(chr);
        }
    }
    return r;
}
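
To check the arithmetic against the numbers in the question, here is a minimal sketch that combines one high/low pair by hand. The helper name `combineSurrogates` is made up for illustration and is not part of the original answer; it assumes both arguments already form a valid surrogate pair.

```javascript
// Hypothetical helper (not from the original answer): combine one valid
// high/low surrogate pair into a single code point.
function combineSurrogates(high, low) {
  // Each surrogate carries 10 bits of (code point - 0x10000):
  // the high surrogate holds the upper 10 bits, the low one the lower 10.
  return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}

console.log(combineSurrogates(55349, 56421)); // 119909  (U+1D465)
console.log(combineSurrogates(56204, 56800)); // 995808  (U+F31E0)
console.log(combineSurrogates(56288, 56689)); // 1081713 (U+108171)
```

The three pairs are the six surrogate values that `charCodeAt` reported in the question; the lone 65 in the middle is the ordinary character A and needs no decoding.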

Complete code: http://jsfiddle.net/twQWU/
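
If an ES2015 or later engine can be assumed (which the original answer does not rely on), the same result may be obtained without manual arithmetic, because `for...of` walks a string by code point and `String.prototype.codePointAt` returns the full code point. The sketch below is only an alternative under that assumption; `toCodePoints` and `sample` are illustrative names.

```javascript
// Sketch assuming an ES2015+ engine; toCodePoints is an illustrative name.
function toCodePoints(str) {
  var result = [];
  // The string iterator yields one code point per step, so a valid
  // surrogate pair arrives as a single two-unit string.
  for (var ch of str) {
    result.push(ch.codePointAt(0));
  }
  return result;
}

// The test string from the question, written with code point escapes.
var sample = '\u{1D465}\u{F31E0}A\u{108171}';
console.log(toCodePoints(sample).join(',')); // 119909,995808,65,1081713
```

One difference in behavior: the manual loop above assumes every high surrogate is followed by a low surrogate, whereas this version returns an unpaired surrogate as its own 16-bit value.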

