MalformedInputException when trying to read entire file


Problem description



I have a 132 KB file (hardly a big one) and I'm trying to read it from the Scala REPL, but I can't read past the first 2048 characters because it throws a java.nio.charset.MalformedInputException.

These are the steps I take:

val it = scala.io.Source.fromFile("docs/categorizer/usig_calles.json") // this is ok
it.take(2048).mkString // this is ok too
it.take(1).mkString // BANG!

java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:338)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
at java.io.InputStreamReader.read(InputStreamReader.java:184)

Any idea what could be wrong?

--

Apparently the problem was that the file was not UTF encoded.

I saved it as UTF and everything works: I just call mkString on the iterator and it retrieves the whole contents of the file.

The strange thing is that the error only arose after reading past the first 2048 chars...

Solution

Cannot be certain without the file, but the documentation on the exception indicates it is thrown "when an input byte sequence is not legal for given charset, or an input character sequence is not a legal sixteen-bit Unicode sequence." (MalformedInputException javadoc)
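To make the "input byte sequence is not legal for given charset" case concrete, here is a minimal sketch that decodes a byte array strictly and reports where decoding first fails. The object and method names (FirstInvalidByte, firstInvalidOffset) are made up for illustration; only the java.nio.charset calls are standard.

```scala
import java.nio.{ByteBuffer, CharBuffer}
import java.nio.charset.{Charset, CodingErrorAction}

object FirstInvalidByte {
  // Hypothetical helper: byte offset of the first sequence that cannot be
  // decoded with `charset`, or None if the whole input decodes cleanly.
  def firstInvalidOffset(bytes: Array[Byte], charset: Charset): Option[Int] = {
    val decoder = charset.newDecoder()
      .onMalformedInput(CodingErrorAction.REPORT)
      .onUnmappableCharacter(CodingErrorAction.REPORT)
    val in  = ByteBuffer.wrap(bytes)
    val out = CharBuffer.allocate(1024)
    var found: Option[Int] = None
    var done = false
    while (!done) {
      val result = decoder.decode(in, out, true)
      if (result.isError) {
        // On error the input position is left at the start of the bad bytes.
        found = Some(in.position())
        done = true
      } else if (result.isOverflow) {
        out.clear() // decoded text is discarded, only the offset matters here
      } else {
        done = true // underflow: all input was consumed without error
      }
    }
    found
  }
}
```

Running it over the file's bytes (for example from java.nio.file.Files.readAllBytes) with the charset you believe the JVM is using should show whether the failure really sits near the offset where the read breaks.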

I suspect the character at position 2049 is the first one encountered that is not valid in whatever the default JVM character encoding is in your environment. Consider explicitly stating the character encoding of the file using one of the overloads of fromFile.
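As a rough sketch of the overload approach: the example below passes the charset name to Source.fromFile instead of relying on the JVM default. The sample content, the temp file, and the assumption that the original file was ISO-8859-1 are all invented for illustration; the real encoding of usig_calles.json is not known from the question.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.io.Source
import scala.util.Try

object ExplicitEncodingDemo {
  // Reads the whole file, decoding with the named charset rather than the JVM default.
  def readAll(path: Path, charsetName: String): String = {
    val source = Source.fromFile(path.toFile, charsetName)
    try source.mkString finally source.close()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data: "\u00fa" is the single byte 0xFA in ISO-8859-1,
    // a byte that can never appear in well-formed UTF-8.
    val tmp = Files.createTempFile("calles", ".json")
    Files.write(tmp, "{\"calle\": \"Per\u00fa 100\"}".getBytes(StandardCharsets.ISO_8859_1))

    // Decoding with the charset the file was written in succeeds.
    println(readAll(tmp, "ISO-8859-1"))

    // Decoding the same bytes as UTF-8 is expected to fail with MalformedInputException.
    println(Try(readAll(tmp, "UTF-8")))

    Files.delete(tmp)
  }
}
```

In the REPL the equivalent one-liner would be scala.io.Source.fromFile("docs/categorizer/usig_calles.json", "ISO-8859-1").mkString, with the charset name replaced by whatever the file actually uses.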

If the application will be cross-platform, be aware that the default character encoding on the JVM varies by platform. If you depend on a specific encoding, either set it explicitly as a command-line parameter when launching your application, or specify it at each call using the appropriate overload.
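One way to avoid repeating the encoding at every call is to put an implicit scala.io.Codec in scope, since Source.fromFile takes the codec as an implicit parameter. This is only a sketch: the object name and the choice of UTF-8 are assumptions, not something stated in the question.

```scala
import java.nio.charset.Charset
import scala.io.{Codec, Source}

object DefaultCodecDemo {
  // One codec for this scope, so reads do not silently depend on the platform default.
  implicit val fileCodec: Codec = Codec.UTF8

  def readAll(path: String): String = {
    val source = Source.fromFile(path) // picks up the implicit fileCodec above
    try source.mkString finally source.close()
  }

  def main(args: Array[String]): Unit = {
    // The JVM default can differ between platforms unless it is overridden
    // at launch, for example with: java -Dfile.encoding=UTF-8 ...
    println(s"JVM default charset: ${Charset.defaultCharset()}")
    println(s"Codec used for reads: ${fileCodec.name}")
    args.headOption.foreach(p => println(readAll(p).length))
  }
}
```

The implicit keeps the encoding decision in one place in the code, while the command-line property changes the default for the whole JVM; which one fits better depends on whether you control how the application is launched.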
