Why do Arabic characters behave as separate characters when styling a single Arabic character?


Problem description



Basically, what I am trying to build is a highlighter for misused Arabic characters.

To make it easier to understand, I will explain the equivalent functionality for English.

Imagine a string with wrong capitalization that has to be rewritten correctly. The user rewrites the string in an input box and submits it; the JS checks whether any character was left uncorrected, then displays the whole string with those letters corrected and highlighted in red,

i.e. [test] becomes [Test]

To do so, I check those characters, and if a faulty character is detected it gets wrapped in a span so that it is colored red.

So far so good. Now, when I try to replicate this for Arabic, the faulty character gets separated from the rest of the word, making it unreadable.


Demo: jsfiddle

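function check1() {
  // Replace the first lowercase "t" with a red, capitalized "T".
  englishanswer.innerHTML = englishWord.value.replace(/t/, '<span style="color:red">T</span>');
}

function check2() {
  // First preview: replace U+0647 (HEH) with U+0629 (TEH MARBUTA) wrapped in a red span.
  // Second preview: the same replacement without the span, for comparison.
  arabicanswer.innerHTML =
    arabicWord.value.replace(/\u0647/, '<span style="color:red">\u0629</span>') +
    '<br>' + arabicWord.value.replace(/\u0647/, '\u0629');
}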

fieldset {
  border: 2px groove threedface;
  border-image: initial;
  width: 75%;
}
input {
  padding: 5px;
  margin: 5px;
  font-size: 1.25em;
}
p {
  padding: 5px;
  font-size: 2em;
}

<fieldset>
  <legend>English:</legend>
  <input id='englishWord' value='test' />
  <input type='submit' value='Check' onclick='check1()' />
  <p id='englishanswer'></p>
</fieldset>

<fieldset style="direction:rtl">
  <legend>عربي</legend>
  <input id='arabicWord' value='بطله' />
  <input type='submit' value='Check' onclick='check2()' />
  <p id='arabicanswer'></p>
</fieldset>

Notice that when testing the Arabic word, the character wrapped in a span [first preview] is separated from the rest of the word, while the unwrapped character [second preview] appears normally.


Edit: Preview for the problem [Chrome UA]

Solution

This is a longstanding bug in WebKit browsers (Chrome, Safari): HTML markup breaks joining behavior. Explicit use of ZWJ (zero-width joiner) used to help (see question Partially colored arabic word in HTML), but it seems that the bug has become worse.
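For reference, the ZWJ approach mentioned above could be sketched roughly as follows. This is only an illustration: `highlightWithZwj` is a hypothetical helper, not code taken from the linked question, and as noted it may no longer have any effect in affected browsers.

```javascript
// U+200D ZERO WIDTH JOINER asks the renderer to treat the adjacent letter as
// if it were joined to a neighbour on that side.
const ZWJ = '\u200D';

// Hypothetical helper: wrap word[index] in a red span and place a ZWJ on both
// sides of every tag boundary that has a neighbouring letter.
function highlightWithZwj(word, index) {
  const before = word.slice(0, index);
  const after = word.slice(index + 1);
  const joinBefore = before ? ZWJ : '';
  const joinAfter = after ? ZWJ : '';
  return before + joinBefore +
    '<span style="color:red">' + joinBefore + word[index] + joinAfter + '</span>' +
    joinAfter + after;
}
```

It would be called as `arabicanswer.innerHTML = highlightWithZwj(correctedWord, i)`, where `correctedWord` and `i` stand for the corrected string and the position of the replaced letter.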

As a clumsy (but probably the only) workaround, you could use contextual forms for Arabic letters. This can be tested first using just static HTML markup and CSS, e.g.

بطﻠ<span style="color:red">ﺔ</span>

Here I am using, inside the span element, ﺔ U+FE94 ARABIC LETTER TEH MARBUTA FINAL FORM instead of the normal U+0629 ARABIC LETTER TEH MARBUTA and ﻠ U+FEE0 ARABIC LETTER LAM MEDIAL FORM instead of U+0644 ARABIC LETTER LAM.

To implement this in JavaScript, when inserting markup into a word written in Arabic letters, you would need to change the characters before and after the break (caused by the markup) to their initial, medial, or final presentation forms, according to their position in the word.
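A minimal sketch of that idea is shown below. It is an illustration under several assumptions rather than a complete shaper: the `FORMS` table covers only a handful of letters (a real one would need the whole Arabic Presentation Forms-B block), diacritics and lam-alef ligatures are ignored, and the names `FORMS`, `joinsForward`, `shapeAt`, and `highlightShaped` are hypothetical.

```javascript
// Presentation forms for a few letters only (illustrative, not complete).
// Each entry lists [isolated, final, initial, medial]; letters that never
// connect to the following letter, such as TEH MARBUTA, have only two forms.
const FORMS = {
  '\u0628': ['\uFE8F', '\uFE90', '\uFE91', '\uFE92'], // BEH
  '\u0629': ['\uFE93', '\uFE94'],                     // TEH MARBUTA
  '\u0637': ['\uFEC1', '\uFEC2', '\uFEC3', '\uFEC4'], // TAH
  '\u0644': ['\uFEDD', '\uFEDE', '\uFEDF', '\uFEE0'], // LAM
  '\u0647': ['\uFEE9', '\uFEEA', '\uFEEB', '\uFEEC'], // HEH
};

// A letter can connect to the following letter only if it has all four forms.
function joinsForward(ch) {
  return ch in FORMS && FORMS[ch].length === 4;
}

// Pick the presentation form of word[i] from its neighbours in the unmarked word.
function shapeAt(word, i) {
  const forms = FORMS[word[i]];
  if (!forms) return word[i]; // unmapped character: leave it unchanged
  const joinsPrev = i > 0 && joinsForward(word[i - 1]);
  const joinsNext = forms.length === 4 && i + 1 < word.length && word[i + 1] in FORMS;
  if (joinsPrev && joinsNext) return forms[3]; // medial
  if (joinsPrev) return forms[1];              // final
  if (joinsNext) return forms[2];              // initial
  return forms[0];                             // isolated
}

// Wrap word[index] in a red span, pre-shaping the highlighted letter and the
// letters directly before and after it.
function highlightShaped(word, index) {
  let html = '';
  for (let i = 0; i < word.length; i++) {
    const nearBreak = Math.abs(i - index) <= 1;
    const ch = nearBreak ? shapeAt(word, i) : word[i];
    html += i === index ? '<span style="color:red">' + ch + '</span>' : ch;
  }
  return html;
}
```

For the word in the question, `highlightShaped('\u0628\u0637\u0644\u0629', 3)` produces the same markup as the static example above: LAM in its medial form followed by a span containing TEH MARBUTA in its final form. If letters further from the break also render incorrectly, one possible variation is to shape every letter in the word instead of only those next to the span.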
