How to skip invalid characters in a stream in Java/Scala?


Question

For example, I have the following code:

Source.fromFile(new File(path), "UTF-8").getLines()

and it throws an exception:

Exception in thread "main" java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:319)

I don't care if some lines are not read, but how can I skip the invalid characters and continue reading lines?

Recommended answer

You can influence the way that charset decoding handles invalid input by calling CharsetDecoder.onMalformedInput.
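To make the effect of onMalformedInput concrete, here is a minimal sketch that decodes the same bytes with each of the three CodingErrorAction values. The class name MalformedInputActions, the helper decode, and the sample bytes are made up for illustration; only the java.nio.charset calls come from the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class MalformedInputActions {
    // Hypothetical helper: decodes bytes as UTF-8 using the given action
    // for malformed input.
    static String decode(byte[] bytes, CodingErrorAction action)
            throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        // 'a', then 0xFF (a byte that never occurs in valid UTF-8), then 'b'.
        byte[] bytes = {'a', (byte) 0xFF, 'b'};
        CodingErrorAction[] actions = {
            CodingErrorAction.IGNORE,   // drop the malformed bytes
            CodingErrorAction.REPLACE,  // substitute the replacement string
            CodingErrorAction.REPORT    // the default: throw an exception
        };
        for (CodingErrorAction action : actions) {
            try {
                System.out.println(action + " -> " + decode(bytes, action));
            } catch (CharacterCodingException e) {
                System.out.println(action + " -> " + e);
            }
        }
    }
}
```

IGNORE matches the "skip and continue" behavior asked for in the question. REPLACE may be preferable when you want a visible marker where data was lost.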

Usually you never see a CharsetDecoder object directly, because it is created for you behind the scenes. To get access to it, you need to use an API that lets you specify the CharsetDecoder directly (instead of just the encoding name or the Charset).

The most basic example of such an API is InputStreamReader:

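import java.io.*;
import java.nio.charset.*;

InputStream in = ...;
CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
// Drop malformed byte sequences instead of throwing MalformedInputException
decoder.onMalformedInput(CodingErrorAction.IGNORE);
Reader reader = new InputStreamReader(in, decoder);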

Note that this code uses the Java 7 class StandardCharsets. For earlier versions, you can simply replace it with Charset.forName("UTF-8") (or use the Charsets class from Guava, http://code.google.com/p/guava-libraries/).
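To tie this back to reading lines, as in the question, here is a small self-contained sketch that wraps such a decoder in a BufferedReader. The class name LenientLineReader, the method readLines, and the in-memory sample data are illustrative assumptions; a real program would pass a FileInputStream instead.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LenientLineReader {
    // Hypothetical helper: reads all lines as UTF-8 and silently drops
    // malformed byte sequences instead of throwing.
    static List<String> readLines(InputStream in) throws IOException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.IGNORE);
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, decoder))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a real file: the second line contains a stray 0xFF byte.
        byte[] data = {'o', 'k', '\n', 'b', (byte) 0xFF, 'a', 'd', '\n'};
        for (String line : readLines(new ByteArrayInputStream(data))) {
            System.out.println(line);
        }
    }
}
```

For the Scala call in the question, scala.io.Source takes an implicit scala.io.Codec, which as far as I know offers a similar onMalformedInput method. Treat that as an assumption and check it against the Scala API documentation for your version.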
