Count number of characters present in foreign language

Views: 129 | Published: 2019/5/24 19:57:10 | Tags: javascript character-encoding


Problem description

Is there an optimal way to implement a character count for non-English letters? For example, the word "Mother" in English is a 6-letter word. But if you type the same word (மதர்) in Tamil, it is a three-letter word (ம + த + ர்), yet the last letter (ர்) is treated by the system as two characters (ர + ் = ர்). So is there any way to count the number of real characters?

One clue is that if we move the keyboard cursor through the word (மதர்), it steps through only 3 letters, not the 4 characters the system counts. Is there any way to build a solution on this? Any help would be greatly appreciated...

Recommended answer

Update

Back from lunch =) I'm afraid the previous approach won't work this well with every foreign language, so I added another fiddle with a possible way.

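// Holds all 1280 escaped Unicode nonspacing marks (category Mn); the entries are elided here.
var UnicodeNsm = [/* Array of 1280 entries */];

// Counts the characters in str that are not nonspacing marks.
function countNSMString(str) {
    var chars = str.split("");
    var count = 0;
    for (var i = 0, ilen = chars.length; i < ilen; i++) {
        // escape() yields e.g. "%u0BCD" for non-Latin characters
        if (UnicodeNsm.indexOf(escape(chars[i])) == -1) {
            count++;
        }
    }
    return count;
}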

var English = "Mother";  
var Tamil = "மதர்";
var Vietnamese = "mẹ"
var Hindi = "मां"

function logL(str) {
    console.log(str + " has " + countNSMString(str) + " visible Characters and " + str.length + " normal Characters"); //"மதர் has 3 visible Characters"
}

logL(English) //"Mother has 6 visible Characters and 6 normal Characters"
logL(Tamil) //"மதர் has 3 visible Characters and 4 normal Characters"
logL(Vietnamese) //"mẹ has 2 visible Characters and 3 normal Characters"
logL(Hindi) //"मां has 1 visible Characters and 3 normal Characters"

So this just checks whether any character in the string is a Unicode nonspacing mark (NSM) and leaves it out of the count. This should work for most languages, not only Tamil, and an array with 1280 elements shouldn't be much of a performance issue.
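As a rough sketch of the same idea without the 1280-element lookup array, modern JavaScript engines can test the nonspacing-mark category directly with a Unicode property escape. The helper name `countWithoutMarks` is made up for illustration and is not part of the original fiddle; it assumes an engine that supports `\p{...}` with the `u` flag (ES2018 or later).

```javascript
// Hypothetical helper (not from the original answer): counts code points
// that are NOT in Unicode general category Mn (nonspacing mark).
function countWithoutMarks(str) {
  let count = 0;
  // for...of iterates by code point, so surrogate pairs stay together
  for (const ch of str) {
    if (!/\p{Mn}/u.test(ch)) {
      count++;
    }
  }
  return count;
}

console.log(countWithoutMarks("Mother")); // 6
console.log(countWithoutMarks("\u0BAE\u0BA4\u0BB0\u0BCD")); // 3 (மதர்)
```

Like the array-based version, this only skips category Mn. Spacing combining marks (category Mc, such as the Devanagari vowel sign ा) are still counted, so the result may differ from what a reader perceives as one letter in some scripts.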

Here is a list of the Unicode NSMs: http://www.fileformat.info/info/unicode/category/Mn/list.htm

Here is the corresponding JSBin.

After experimenting a bit with string operations, it turns out that String.indexOf returns the same index for "ர்" and for "ர", because "ர்" is stored as the two characters "ர" + "்". That means

"ர்ரர".indexOf("ர்") == "ர்ரர".indexOf("ர" + "்") //true

but

"ர்ரர".indexOf("ர") == "ர்ரர".indexOf("ர" + "ர") //false

I took this opportunity and tried something like this:

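//ர்
var char = "ரர்ர்ரர்்";
var char2 = "ரரர்ர்ரர்்";
var char3 = "ர்ரர்ர்ரர்்";

function countStr(str) {
    var chars = str.split("");
    var count = 0;
    for (var i = 0, ilen = chars.length; i < ilen; i++) {
        var chars2 = chars[i] + chars[i + 1];
        // If the pair is found at the same index as the single character,
        // treat both as one letter and skip the next character.
        // Note: indexOf always reports the first occurrence in str, so this is only a heuristic.
        if (str.indexOf(chars[i]) == str.indexOf(chars2))
            i += 1;
        count++;
    }
    return count;
}

console.log("--");
console.log(countStr(char)); //6
console.log(countStr(char2)); //7
console.log(countStr(char3)); //7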

This seems to work for the strings above. It may need some adjustments, as I don't know a thing about encoding, but maybe it's a point you can start from.
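One possible adjustment, offered here as a sketch and not as part of the original answer, is to let the engine split the string into grapheme clusters, which roughly correspond to the cursor stops mentioned in the question. This assumes a runtime that provides `Intl.Segmenter` (recent browsers, Node.js 16 or later); the helper name `countGraphemes` and its optional `locale` parameter are hypothetical.

```javascript
// Hypothetical helper (not from the original answer): counts
// user-perceived characters (grapheme clusters) with Intl.Segmenter.
function countGraphemes(str, locale) {
  // locale may be undefined, in which case the runtime default is used
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  let count = 0;
  // segment() returns an iterable with one entry per grapheme cluster
  for (const _segment of segmenter.segment(str)) {
    count++;
  }
  return count;
}

console.log(countGraphemes("Mother")); // 6
console.log(countGraphemes("\u0BAE\u0BA4\u0BB0\u0BCD", "ta")); // 3 (மதர்)
```

Unlike the indexOf comparison, this does not depend on where a character first appears in the string, so repeated letters are handled consistently.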

Here's the JSBin.
