Android: compare a UTF-8 string with the UTF-8 input string of an EditText


Question

In my Android application, I want to compare a UTF-8 string, for example "bãi", with the string that the user types into an EditText.
However, if I type "bãi" into the EditText and read the input with edittext.getText().toString(), the returned string looks like "bãi" on screen but does not compare equal to "bãi".

I also tried:

String input = new String(input.getBytes("UTF-8"), "UTF-8");

but it does not work: input.equals("bãi") still returns false.

Does anyone know how to solve this problem? Thanks for any help.

Recommended answer

In Unicode, certain characters can be represented in more than one way. For example, in the word "bãi" the middle character can be represented in two ways:

  1. A single code point, U+00E3 (LATIN SMALL LETTER A WITH TILDE)
  2. Two code points, U+0061 (LATIN SMALL LETTER A) followed by U+0303 (COMBINING TILDE)

For display, both should look the same.
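To see which representation a given string actually contains, it can help to print its code points. The following is a minimal sketch; the class name CodePointDump and the method dump are made up for illustration, and String.codePoints() assumes Java 8 or, on Android, API level 24 or later.

```java
import java.util.stream.Collectors;

class CodePointDump {
    // Formats every code point of s as U+XXXX, separated by single spaces.
    static String dump(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Precomposed form: three code points.
        System.out.println(dump("b\u00e3i"));   // U+0062 U+00E3 U+0069
        // Decomposed form: four code points.
        System.out.println(dump("ba\u0303i"));  // U+0062 U+0061 U+0303 U+0069
    }
}
```

Running the same dump on the string returned by the EditText should show whether the keyboard produced the precomposed or the decomposed form.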

For string comparison, this poses a problem: the two representations contain different code points, so equals() treats them as different strings. The solution is to normalize both strings first according to Unicode Standard Annex #15, Unicode Normalization Forms.

Normalization is supported in Java, including Android, by the java.text.Normalizer class.
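Applied to the original question, the comparison could be wrapped in a small helper that normalizes both sides before calling equals(). This is only a sketch: the class name NormalizedCompare and the method equalsNormalized are hypothetical, and the choice of NFC is an assumption (either form works as long as both strings use the same one). The helper takes CharSequence so that the value returned by getText() on an EditText could be passed directly.

```java
import java.text.Normalizer;

class NormalizedCompare {
    // Returns true if a and b are equal after both are normalized to NFC.
    static boolean equalsNormalized(CharSequence a, CharSequence b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        String expected = "b\u00e3i";  // precomposed literal in the app
        String typed = "ba\u0303i";    // stands in for the text read from the EditText
        System.out.println(expected.equals(typed));            // false
        System.out.println(equalsNormalized(expected, typed)); // true
    }
}
```

Normalizing both operands, rather than only the user input, avoids relying on how the source file or resource happened to encode the expected string.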

The following code shows the result:

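import java.text.Normalizer;
import java.text.Normalizer.Form;

// s1 uses the precomposed character, s2 uses "a" plus a combining tilde.
String s1 = "b\u00e3i";
String s2 = "ba\u0303i";
System.out.println(String.format("Before normalization: %s == %s => %b", s1, s2, s1.equals(s2)));

// Normalize both strings to the same form (NFD) before comparing.
String n1 = Normalizer.normalize(s1, Form.NFD);
String n2 = Normalizer.normalize(s2, Form.NFD);
System.out.println(String.format("After normalization:  %s == %s => %b", n1, n2, n1.equals(n2)));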

It prints:

Before normalization: bãi == bãi => false
After normalization:  bãi == bãi => true

BTW: Form.NFD decomposes the strings, i.e. it produces the longer representation with two code points for "ã". Form.NFC would produce the shorter, precomposed form.
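The difference between the two forms is visible in the string length. A small sketch, with the class name FormLengths and the method lengthIn invented for illustration:

```java
import java.text.Normalizer;

class FormLengths {
    // Length, in UTF-16 code units, of s after normalizing it to the given form.
    static int lengthIn(String s, Normalizer.Form form) {
        return Normalizer.normalize(s, form).length();
    }

    public static void main(String[] args) {
        // NFD splits the precomposed character into base letter plus combining mark.
        System.out.println(lengthIn("b\u00e3i", Normalizer.Form.NFD));  // 4
        // NFC merges base letter and combining mark into one code point.
        System.out.println(lengthIn("ba\u0303i", Normalizer.Form.NFC)); // 3
    }
}
```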
