Java doesn't see space in string


Question

I'm trying to parse a text file that contains multiple lines of text. My task is to go through all the words and write them out to a file.

So I read all the lines, loop through them, and split each line on whitespace, like this:

line.split("\\s+");

The problem is that in some cases Java does not see the space between two words.

I also tried looping through a string that contains a space Java doesn't split on, and Character.isSpaceChar(char) returned true for that character.

Now I'm completely confused.

Here is the code:

public void createMap(String inputPath, String outputPath)
        throws IOException {
    File f = new File(inputPath);
    FileWriter fw = new FileWriter(outputPath);
    List<String> lines = Files.readAllLines(f.toPath(),
            StandardCharsets.UTF_8);
    for (String l : lines) {
        for (String w : l.split("\\s+")) {
            if (isNotRubbish(w.trim())) {
                fw.write(w.trim() + "\n");
            }
        }
    }
    fw.close();
}

private boolean isNotRubbish(String w) {
    Pattern p = Pattern.compile("@?\\p{L}+",
            Pattern.UNICODE_CHARACTER_CLASS);
    Matcher m = p.matcher(w);
    return m.matches();
}


Recommended answer

I suspect your text contains a character that looks like a space but is not one of the whitespace characters \s matches by default, something like a non-breaking space. Because \s (written "\\s" in a Java string literal) does not match it, the split never breaks the line at that position.
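One way to check this suspicion is to test a candidate character directly and to dump the code points of a suspicious line. The sketch below assumes the culprit is the non-breaking space U+00A0, which the question does not confirm; the class and method names are made up for illustration.

```java
class NbspCheck {
    // Returns true if Java's default \s (no regex flags) matches the character.
    static boolean matchedByBackslashS(char c) {
        return String.valueOf(c).matches("\\s");
    }

    // Lists every code point of the input so invisible look-alikes become visible.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        char nbsp = '\u00A0'; // assumed culprit: non-breaking space
        System.out.println(Character.isSpaceChar(nbsp));  // true
        System.out.println(Character.isWhitespace(nbsp)); // false
        System.out.println(matchedByBackslashS(nbsp));    // false
        System.out.println(describe("a\u00A0b"));         // U+0061 U+00A0 U+0062
    }
}
```

This matches the symptom in the question: Character.isSpaceChar reports true for such a character while a split on \s leaves it untouched.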

In that case, try using \p{Zs} instead of \s.

As explained at http://www.regular-expressions.info/unicode.html:


\p{Zs} will match any kind of space character.
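As a rough illustration, here is the same split done with both patterns. The sample line and the class name are invented for this sketch, and the non-breaking space in it is only an assumed stand-in for whatever character the real file contains.

```java
import java.util.Arrays;
import java.util.List;

class ZsSplit {
    // Splits on runs of Unicode space separators (general category Zs),
    // which covers the ordinary space as well as U+00A0.
    static List<String> words(String line) {
        return Arrays.asList(line.split("\\p{Zs}+"));
    }

    public static void main(String[] args) {
        String line = "foo\u00A0bar baz"; // hypothetical input
        // Default \s leaves "foo" and "bar" glued together: 2 tokens.
        System.out.println(line.split("\\s+").length); // 2
        // \p{Zs} separates all three words.
        System.out.println(words(line)); // [foo, bar, baz]
    }
}
```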

By the way, if you also want to split on separators other than spaces, such as tabs (\t) or line breaks (\r, \n), you can combine \p{Zs} with \s in a character class: [\p{Zs}\s].
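A minimal sketch of the combined character class, assuming the separators may be a mix of ordinary spaces, non-breaking spaces, tabs, and line breaks. The class name, constant name, and sample input are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class CombinedSplit {
    // One or more characters that are either a Unicode space separator (Zs)
    // or one of the characters matched by the default \s.
    private static final Pattern SEPARATORS = Pattern.compile("[\\p{Zs}\\s]+");

    static List<String> words(String line) {
        return Arrays.asList(SEPARATORS.split(line));
    }

    public static void main(String[] args) {
        // Non-breaking space, tab, and ordinary space as separators.
        System.out.println(words("foo\u00A0bar\tbaz qux")); // [foo, bar, baz, qux]
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex for every line, which can matter when the file is large.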
