Java - Count exactly 60 characters from a string with a mixture of UTF-8 and non UTF-8 characters
Problem description
I have a string that I want to save in an Oracle database that only supports UTF-8 characters. If the string is longer than 60 characters, I want to truncate it and store only the first 60 characters.
Using String.substring(0,59) in Java returns 60 characters, but when I save it in the database it gets rejected because the database claims that the string is longer than 60 characters.
Is there a way to find out whether a particular string contains non-UTF-8 characters? One option I found is:
try {
    bytes = returnString.getBytes("UTF-8");
} catch (UnsupportedEncodingException e) {
    // Do something
}
Is there a way to truncate it to exactly x characters (loss of data is not an issue) and make sure that only x characters are saved in the database? For example, if I have the string §8§8§8§8§8§8§8 and I say truncate and save only 5 characters, it should only save §8§
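The mismatch described in the question can be made visible by comparing the Java character count with the encoded size. The sketch below assumes that the database limit is measured in UTF-8 bytes rather than characters, which the question does not state outright; the class and method names (Utf8Length, utf8Length) are made up for illustration. It uses the getBytes(Charset) overload, which does not throw UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Utf8Length {

    /** Returns the number of bytes the string occupies once encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "§8" repeated seven times; § (U+00A7) is written as an escape so the
        // source file encoding does not matter.
        String sample = "\u00A78\u00A78\u00A78\u00A78\u00A78\u00A78\u00A78";

        System.out.println(sample.length());    // 14 UTF-16 chars
        System.out.println(utf8Length(sample)); // 21 bytes: each § needs 2 bytes, each 8 needs 1
    }
}
```

A Java String holds UTF-16 code units, so length() and substring() count those units. A prefix of 60 chars can therefore need more than 60 bytes once it is encoded.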
Recommended answer
As far as I understand, you want to limit the String length so that its encoded UTF-8 representation does not exceed 60 bytes. You can do it this way:
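import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

String s=…;
CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
ByteBuffer bb = ByteBuffer.allocate(60); // note the limit
CharBuffer cb = CharBuffer.wrap(s);
// Encode until the input is exhausted or the 60-byte buffer is full
CoderResult r = enc.encode(cb, bb, true);
if (r.isOverflow()) {
    System.out.println(s + " is too long for "
        + bb.capacity() + " " + enc.charset() + " bytes");
    // cb's position marks the end of the characters that fit; keep only that prefix
    s = cb.flip().toString();
    System.out.println("truncated to " + s);
}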
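The same technique can be wrapped in a reusable method. The following is only a sketch under stated assumptions, not a definitive implementation: the names Utf8Truncator and truncateToUtf8Bytes are invented here, the column is assumed to be limited in bytes (for example an Oracle VARCHAR2 declared with BYTE semantics), and unpaired surrogates are assumed to be acceptable as a one-byte replacement character.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Utf8Truncator {

    /**
     * Returns the longest prefix of s whose UTF-8 encoding fits in maxBytes.
     * A character is never split: if its bytes do not all fit, it is dropped.
     */
    static String truncateToUtf8Bytes(String s, int maxBytes) {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder()
                // Assumption: an unpaired surrogate is counted as its replacement ('?').
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);

        ByteBuffer out = ByteBuffer.allocate(maxBytes); // the byte limit
        CharBuffer in = CharBuffer.wrap(s);

        // Encoding stops when the input is exhausted or the output buffer is full.
        enc.encode(in, out, true);

        // in.position() is the number of chars whose bytes were written.
        return s.substring(0, in.position());
    }

    public static void main(String[] args) {
        String sample = "\u00A78\u00A78\u00A78\u00A78\u00A78\u00A78\u00A78"; // "§8" x 7

        String five = truncateToUtf8Bytes(sample, 5);  // "§8§": 2 + 1 + 2 bytes
        String four = truncateToUtf8Bytes(sample, 4);  // "§8": the next § would need 2 bytes

        System.out.println(five.length()); // 3
        System.out.println(four.length()); // 2
    }
}
```

With a limit of 5 bytes the method yields §8§ for the string from the question, which matches the result the asker expects. Characters outside the Basic Multilingual Plane are kept or dropped as a whole surrogate pair, because the encoder only consumes a pair when all four of its bytes fit.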