403 when trying to download a remote image


Question

I am trying to download pictures from some URLs. For some pictures it works fine, but for others I get a 403 error.

For example, this one: http://blog.zenika.com/themes/Zenika/img/zenika.gif

Accessing this picture does not require any authentication. You can click the link yourself and verify that your browser receives it with a 200 status code.

The following code throws an exception: new java.net.URL(url).openStream(). The same happens with org.apache.commons.io.FileUtils.copyURLToFile(new java.net.URL(url), tmp), which uses the same openStream() method under the hood.

java.io.IOException: Server returned HTTP response code: 403 for URL: http://blog.zenika.com/themes/Zenika/img/zenika.gif
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626) ~[na:1.7.0_45]
at java.net.URL.openStream(URL.java:1037) ~[na:1.7.0_45]
at services.impl.DefaultStampleServiceComponent$RemoteImgUrlFilter$class.downloadAsTemporaryFile(DefaultStampleServiceComponent.scala:548) [classes/:na]
at services.impl.DefaultStampleServiceComponent$RemoteImgUrlFilter$class.services$impl$DefaultStampleServiceComponent$RemoteImgUrlFilter$$handleImageUrl(DefaultStampleServiceComponent.scala:523) [classes/:na]






I develop with Scala and the Play Framework. I also tried the built-in AsyncHttpClient (AHC):

// TODO: it might be better to use iteratees for the GET call, because I think AHC loads the whole body into memory
WS.url(url).get.flatMap { res =>
  if (res.status >= 200 && res.status < 300) {
    val bodyStream = res.getAHCResponse.getResponseBodyAsStream
    val futureFile = TryUtils.tryToFuture(createTemporaryFile(bodyStream))
    play.api.Logger.info(s"Successfully downloaded file $filename with status code ${res.status}")
    futureFile
  } else {
    Future.failed(new RuntimeException(s"Download of file $filename returned status code ${res.status}"))
  }
} recover {
  case NonFatal(e) => throw new RuntimeException(s"Could not downloadAsTemporaryFile url=$url", e)
}

With this AHC code, it works fine. Can someone explain this behavior, and why I get a 403 error with the URL.openStream() method?

Recommended answer

As mentioned, some hosts reject requests based on headers such as User-Agent. Unless told otherwise, java.net.URL identifies itself with a User-Agent of the form Java/<version>, which this server appears to refuse.

This does not work:

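import java.io.{BufferedReader, InputStreamReader}
import java.net.URL

val urls = """http://blog.zenika.com/themes/Zenika/img/zenika.gif"""
val url = new URL(urls)
val urlConnection = url.openConnection()
// No User-Agent is set here, so the JVM default is sent and the server answers 403
val inputStream = urlConnection.getInputStream() // throws java.io.IOException
val bufferedReader = new BufferedReader(new InputStreamReader(inputStream))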
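
One way to check what the plain request above actually sends is to point it at a server you control. The following is only a minimal sketch: it assumes the JDK's bundled com.sun.net.httpserver module is available, and the object and method names (DefaultUserAgentProbe, observedUserAgent) are made up for illustration.

```scala
import java.net.{InetSocketAddress, URL}
import java.util.concurrent.atomic.AtomicReference
import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}

// Hypothetical helper, not part of the original question or answer.
object DefaultUserAgentProbe {

  /** Starts a throwaway local server, requests it with URL.openStream(),
    * and returns the User-Agent header that the server received. */
  def observedUserAgent(): String = {
    val seen = new AtomicReference[String]("")
    // Port 0 lets the operating system pick a free port.
    val server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0)
    server.createContext("/", new HttpHandler {
      override def handle(exchange: HttpExchange): Unit = {
        seen.set(String.valueOf(exchange.getRequestHeaders.getFirst("User-Agent")))
        exchange.sendResponseHeaders(204, -1) // 204 No Content, no body
        exchange.close()
      }
    })
    server.start()
    try {
      val port = server.getAddress.getPort
      // Same call as in the question, with no header set explicitly.
      new URL(s"http://127.0.0.1:$port/").openStream().close()
      seen.get()
    } finally server.stop(0)
  }

  def main(args: Array[String]): Unit =
    println(observedUserAgent())
}
```

On a stock JVM the printed value typically has the form Java/<version>. A host that filters on that prefix would reject the request even though a browser gets a 200.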

This works:

import java.io.{BufferedReader, InputStreamReader}
import java.net.URL

val urls = """http://blog.zenika.com/themes/Zenika/img/zenika.gif"""
val url = new URL(urls)
val urlConnection = url.openConnection()
// Override the JVM's default User-Agent before the connection is opened
urlConnection.setRequestProperty("User-Agent", """NING/1.0""")
val inputStream = urlConnection.getInputStream()
val bufferedReader = new BufferedReader(new InputStreamReader(inputStream))
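
Since the goal in the question is to save an image, which is binary data, a byte-level copy is probably a better fit than a BufferedReader. Here is a minimal sketch of the same User-Agent fix applied to a download-to-temporary-file helper. The object name ImageDownloader, the default agent string "my-image-fetcher/1.0" and the 10 second timeouts are assumptions chosen for illustration; only the method name is borrowed from the stack trace in the question.

```scala
import java.io.InputStream
import java.net.{HttpURLConnection, URL}
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical helper; a sketch, not the original poster's implementation.
object ImageDownloader {

  /** Downloads `url` into a new temporary file, sending an explicit User-Agent.
    * The default agent string is a placeholder, not a value any server requires. */
  def downloadAsTemporaryFile(url: String,
                              userAgent: String = "my-image-fetcher/1.0"): Path = {
    // Assumes an http:// or https:// URL, so the cast is safe.
    val connection = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
    connection.setRequestProperty("User-Agent", userAgent)
    connection.setConnectTimeout(10000) // milliseconds
    connection.setReadTimeout(10000)    // milliseconds
    val tmp = Files.createTempFile("download-", ".tmp")
    try {
      // getInputStream throws an IOException for 4xx/5xx responses such as 403.
      val in: InputStream = connection.getInputStream
      try Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING)
      finally in.close()
      tmp
    } catch {
      case e: Exception =>
        Files.deleteIfExists(tmp) // do not leave an empty temp file behind
        throw e
    } finally connection.disconnect()
  }
}
```

A call such as ImageDownloader.downloadAsTemporaryFile("http://blog.zenika.com/themes/Zenika/img/zenika.gif") returns the path of the saved file if the server accepts the agent string. Whether a given host does so depends on its own filtering rules.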
