What is the better approach to trim unprintable characters from a string


Question

I am reading data from XML. When I checked the output in the Eclipse console, I found that the data comes with some square boxes in it. For example, if the Excel sheet contains 123, I get 123 together with some square boxes. I used trim() to avoid this, but without success, because trim() only removes whitespace. I found that those characters have values such as -17 and -20 when read as bytes. I don't want to trim only whitespace; I want to trim those square boxes as well.

So I used the following method to trim those characters, and it worked.

Is there a more appropriate way of trimming a string?

Trimming a string:

String trimData(String accessNum) {
    StringBuffer sb = new StringBuffer();
    try {
        if ((accessNum != null) && (accessNum.length() > 0)) {
            // Log.i("Settings", accessNum + "Access Number length....." + accessNum.length());
            accessNum = accessNum.trim();
            byte[] b = accessNum.getBytes();
            for (int i = 0; i < b.length; i++) {
                System.out.println(i + "....." + b[i]);
                // Keep only bytes with a positive value; negative bytes are dropped.
                if (b[i] > 0) {
                    sb.append((char) (b[i]));
                }
            }
            // Log.i("Settings", accessNum + "Trimming....");
        }
    } catch (Exception ex) {
        // Exceptions are silently ignored.
    }
    return sb.toString();
}

Answer

Edited

Use Normalizer (available since Java 6):

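import java.text.Normalizer;
import java.util.regex.Pattern;

public static final Pattern DIACRITICS_AND_FRIENDS
        = Pattern.compile("[\\p{InCombiningDiacriticalMarks}\\p{IsLm}\\p{IsSk}]+");

private static String stripDiacritics(String str) {
    // NFD decomposes accented characters into a base character plus combining marks.
    str = Normalizer.normalize(str, Normalizer.Form.NFD);
    // Remove the combining marks, modifier letters and modifier symbols.
    str = DIACRITICS_AND_FRIENDS.matcher(str).replaceAll("");
    return str;
}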
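To make the effect of the Normalizer approach concrete, here is a minimal, self-contained sketch. The class name DiacriticsDemo and the sample input are assumptions chosen for illustration; the pattern and the method body follow the snippet above.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical wrapper class, used only to make the snippet runnable.
class DiacriticsDemo {
    // Same pattern as above: combining diacritical marks,
    // modifier letters (Lm) and modifier symbols (Sk).
    static final Pattern DIACRITICS_AND_FRIENDS =
            Pattern.compile("[\\p{InCombiningDiacriticalMarks}\\p{IsLm}\\p{IsSk}]+");

    static String stripDiacritics(String str) {
        // NFD splits U+00E9 into "e" followed by U+0301 (combining acute accent).
        str = Normalizer.normalize(str, Normalizer.Form.NFD);
        // The combining mark is then matched by the pattern and removed.
        return DIACRITICS_AND_FRIENDS.matcher(str).replaceAll("");
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the last letter.
        System.out.println(stripDiacritics("caf\u00e9")); // prints "cafe"
    }
}
```

Note that this pattern targets accents and modifier characters only. It does not match control characters or a byte order mark, so those remain in the string.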

And here and here are complete solutions.

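If you only want to remove all non-printable characters from a string, use the following. It removes every character outside the printable ASCII range 0x20-0x7E, which includes accented and other non-ASCII letters:

rawString.replaceAll("[^\\x20-\\x7e]", "")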
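The sketch below shows what that one-liner does to a sample string. For comparison, it also shows a variant that is not part of the original answer: replacing `\p{C}` (the Unicode "Other" category, which covers control and format characters) leaves printable non-ASCII text alone. The class name, method names and sample input are assumptions for illustration.

```java
// Hypothetical wrapper class, used only to make the snippet runnable.
class PrintableDemo {
    // From the answer: keep only printable ASCII (space through tilde).
    static String asciiOnly(String raw) {
        return raw.replaceAll("[^\\x20-\\x7e]", "");
    }

    // Assumed alternative: drop only characters in the Unicode "Other"
    // category (control, format, etc.) and keep printable non-ASCII text.
    static String stripInvisible(String raw) {
        return raw.replaceAll("\\p{C}", "");
    }

    public static void main(String[] args) {
        // U+FEFF (byte order mark) and U+0001 are invisible; U+00E9 is an accented "e".
        String raw = "\uFEFF123\u0001 caf\u00e9";
        System.out.println(asciiOnly(raw));      // prints "123 caf"
        System.out.println(stripInvisible(raw)); // prints "123 caf" followed by the accented "e"
    }
}
```

Both variants also remove tabs and line breaks, since those are control characters. Add them to the character class explicitly if they should survive.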

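References: "Replace special characters in string in Java" (http://stackoverflow.com/questions/2608205/replace-special-characters-in-string-in-java) and "How to remove high-ASCII characters from string like ®, ©, ™ in Java" (http://stackoverflow.com/questions/5008422/how-to-remove-high-ascii-characters-from-string-like-in-java)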
