Java - Count exactly 60 characters from a string with a mixture of UTF-8 and non-UTF-8 characters


Problem description

I have a string which I want to save in a database that only supports UTF-8 characters. If the string is longer than 60 characters, I want to truncate it and store only the first 60 characters. The Oracle database in use only supports UTF-8 characters.

Using String.substring(0, 60) in Java returns 60 characters, but when I save the result in the database it gets rejected because the database claims that the string is longer than 60 characters.

  • Is there a way to find out whether a particular string contains non-UTF-8 characters? One option I found is:

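byte[] bytes;
try {
    // Encodes the string as UTF-8; this does not report "non UTF-8" characters
    bytes = returnString.getBytes("UTF-8");
} catch (UnsupportedEncodingException e) {
    // Thrown only when the charset name is not supported
    // Do something
}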

  • Is there a way I can truncate it to exactly x characters (loss of data is not an issue) and make sure that only x characters are saved in the database? For example, if I have the string §8§8§8§8§8§8§8 and I ask to truncate and save only 5 characters, it should save only §8§.
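The gap between the two counts in this example can be checked directly. The following is a minimal sketch, assuming the database limit is measured in UTF-8 bytes rather than in Java chars; the class and method names (LengthDemo, charCount, utf8ByteCount) are illustrative only and not part of the original question.

```java
import java.nio.charset.StandardCharsets;

public class LengthDemo {
    // Number of UTF-16 code units, which is what String.length() and substring() count
    static int charCount(String s) {
        return s.length();
    }

    // Number of bytes the string occupies once encoded as UTF-8
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The example string from the question: "§8" repeated 7 times (§ is U+00A7)
        String s = "\u00a78\u00a78\u00a78\u00a78\u00a78\u00a78\u00a78";
        System.out.println(charCount(s));     // 14
        System.out.println(utf8ByteCount(s)); // 21
    }
}
```

Each § takes two bytes in UTF-8 while each 8 takes one, so the first five bytes of the example cover exactly §8§, which matches the expected result stated above.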

Recommended answer

As far as I understand, you want to limit the String length so that its encoded UTF-8 representation does not exceed 60 bytes. You can do it this way:

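import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

String s = …;
CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
ByteBuffer bb = ByteBuffer.allocate(60); // note the limit
CharBuffer cb = CharBuffer.wrap(s);
// Encode until the input is exhausted or the 60-byte buffer is full
CoderResult r = enc.encode(cb, bb, true);
if (r.isOverflow()) {
    System.out.println(s + " is too long for "
                      + bb.capacity() + " " + enc.charset() + " bytes");
    // cb's position is just past the last character that fit completely
    s = cb.flip().toString();
    System.out.println("truncated to " + s);
}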
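The same idea can be wrapped in a reusable method. This is only a sketch under the answer's assumption that the limit is a UTF-8 byte count; the class name Utf8Truncate and the method truncateToUtf8Bytes are hypothetical names introduced here. It does not treat malformed input (such as an unpaired surrogate) specially and returns the string unchanged in that case.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class Utf8Truncate {
    // Hypothetical helper: returns the longest prefix of s whose
    // UTF-8 encoding fits into maxBytes bytes.
    static String truncateToUtf8Bytes(String s, int maxBytes) {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        ByteBuffer bb = ByteBuffer.allocate(maxBytes);
        CharBuffer cb = CharBuffer.wrap(s);
        CoderResult r = enc.encode(cb, bb, true);
        if (r.isOverflow()) {
            // cb.position() is the number of chars that were encoded completely
            return s.substring(0, cb.position());
        }
        return s; // everything fit, nothing to cut
    }

    public static void main(String[] args) {
        // The example from the question: "§8" repeated 7 times, limited to 5 bytes
        String s = "\u00a78\u00a78\u00a78\u00a78\u00a78\u00a78\u00a78";
        String cut = truncateToUtf8Bytes(s, 5);
        System.out.println(cut.equals("\u00a78\u00a7")); // true
        System.out.println(truncateToUtf8Bytes("abc", 60)); // abc
    }
}
```

Because the encoder stops before a character whose bytes no longer fit, the result never ends in the middle of a multi-byte sequence.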
