Using the result from Scala's "fromURL" throws an exception
Question
I'm trying to fetch some web pages using Scala's scala.io.Source object. Getting the iterator works fine, but I can't do anything with it without getting an exception:
scala> scala.io.Source.fromURL("http://google.com")
res0: scala.io.BufferedSource = non-empty iterator
scala> scala.io.Source.fromURL("http://google.com").length
java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:338)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:154)
at java.io.BufferedReader.read(BufferedReader.java:175)
at scala.io.BufferedSource$$anonfun$iter$1$$anonfun$apply$mcI$sp$1.apply$mcI$sp(BufferedSource.scala:38)
at scala.io.Codec.wrap(Codec.scala:64)
at scala.io.BufferedSource$$anonfun$iter$1.apply$mcI$sp(BufferedSource.scala:38)
at scala.io.BufferedSource$$anonfun$iter$1.apply(BufferedSource.scala:38)
at scala.io.BufferedSource$$anonfun$iter$1.apply(BufferedSource.scala:38)
at scala.collection.Iterator$$anon$14.next(Iterator.scala:150)
at scala.collection.Iterator$$anon$25.hasNext(Iterator.scala:562)
at scala.collection.Iterator$$anon$19.hasNext(Iterator.scala:400)
at scala.io.Source.hasNext(Source.scala:238)
at scala.collection.Iterator$class.foreach(Iterator.scala:772)
at scala.io.Source.foreach(Source.scala:181)
at scala.collection.TraversableOnce$class.size(TraversableOnce.scala:104)
at scala.io.Source.size(Source.scala:181)
at scala.collection.Iterator$class.length(Iterator.scala:1071)
at scala.io.Source.length(Source.scala:181)
at .<init>(<console>:8)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:704)
at scala.tools.nsc.interpreter.IMain$Request$$anonfun$14.apply(IMain.scala:920)
at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
at java.lang.Thread.run(Thread.java:745)
So as you can see, obtaining the buffer works, and I can do something with it:
scala> scala.io.Source.fromURL("http://google.com").next
res7: Char = <
But it seems I can't iterate over it.
I'm using Scala 2.9.2, but the problem recurs in 2.11.2 as well. Further, I'm running:
java version "1.7.0_75"
OpenJDK Runtime Environment (IcedTea 2.5.4) (7u75-2.5.4-2)
OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)
Any help getting this to work would be greatly appreciated.
Recommended answer
You have an encoding issue here.
The encoding needed to interpret the response is latin1, also known as ISO-8859-1.
Use Source.fromURL("url")("encoding") to solve your problem:
Source.fromURL("http://google.com")("ISO-8859-1").mkString
res4: String =
<!doctype html><html itemscop
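The same call can be wrapped so that the charset is passed explicitly and the Source is always closed. This is only a minimal sketch: the object name FetchWithCodec and the helper fetch are hypothetical (they are not part of scala.io), and a temporary file addressed through a file: URL stands in for a real web page so the example runs without network access.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.{Codec, Source}

object FetchWithCodec {
  // Hypothetical helper (not part of scala.io): reads a whole URL as text
  // with an explicitly named charset and always closes the Source.
  def fetch(url: String, charsetName: String): String = {
    val source = Source.fromURL(url)(Codec(charsetName))
    try source.mkString
    finally source.close()
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for a web page: a temp file holding Latin-1 bytes,
    // addressed through a file: URL so no network is needed.
    val tmp = Files.createTempFile("latin1-page", ".html")
    Files.write(tmp, "<p>caf\u00e9</p>".getBytes(StandardCharsets.ISO_8859_1))
    try {
      val body = fetch(tmp.toUri.toURL.toString, "ISO-8859-1")
      println(body)
    } finally Files.delete(tmp)
  }
}
```

For a real page you would call something like fetch("http://google.com", "ISO-8859-1"), assuming the server really does answer in that charset.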
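A little background: when no encoding is specified in an HTTP exchange, the standard behaviour is to return everything encoded in Latin-1. Source.fromURL, on the other hand, decodes with an implicit Codec that falls back to the platform default charset (often UTF-8) unless you pass one yourself, so bytes that are not valid in that charset raise the MalformedInputException shown in the question.
For in-depth info see http://www.ietf.org/rfc/rfc2045.txt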
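The mismatch can be reproduced without any network access. The sketch below is an illustration only, with hypothetical names (DecodeDemo, latin1Bytes, decode): it decodes the same Latin-1 bytes once as ISO-8859-1 and once as UTF-8, assuming the failing session in the question was using a UTF-8 default codec.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.{MalformedInputException, StandardCharsets}
import scala.io.{Codec, Source}

object DecodeDemo {
  // "café!" encoded as ISO-8859-1: the byte for 'é' (0xE9) followed by '!'
  // is not a valid UTF-8 sequence.
  val latin1Bytes: Array[Byte] =
    "caf\u00e9!".getBytes(StandardCharsets.ISO_8859_1)

  // Decode the sample bytes through scala.io.Source with the given codec.
  def decode(codec: Codec): String =
    Source.fromInputStream(new ByteArrayInputStream(latin1Bytes))(codec).mkString

  def main(args: Array[String]): Unit = {
    // Matching codec: every byte maps to exactly one character.
    println(decode(Codec("ISO-8859-1")))

    // Mismatching codec: the decoder reports the malformed byte
    // instead of silently replacing it.
    val utf8Result =
      try decode(Codec("UTF-8"))
      catch { case e: MalformedInputException => "UTF-8 decoding failed: " + e }
    println(utf8Result)
  }
}
```

Source hands the codec's decoder to the underlying reader, and that decoder reports malformed input rather than substituting a replacement character, which is why the failure surfaces as an exception during iteration rather than as garbled text.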