How to configure the curl package in R with default web proxy settings?

Question

I'm using R in a commercial environment where external connectivity all goes via a web proxy, so we need to specify the proxy server address and ensure we connect to it with Windows authentication.

I already have code that will configure the RCurl and httr packages to use those settings by default - i.e.

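# An empty "user:password" (":") asks libcurl to use the current Windows login.
httr::set_config(httr::config(
  proxy = "my.proxy.address",
  proxyuserpwd = ":",
  proxyauth = 4
))

opts <- list(
  proxy = "my.proxy.address",
  proxyuserpwd = ":",
  proxyauth = 4
)
# RCurl reads its defaults from the base R option "RCurlOptions".
options(RCurlOptions = opts)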

However, in a couple of cases recently, I've found packages that depend on the curl package to make web requests - for instance xml2::read_xml - and I can't find any way to set the same proxy options so they're picked up by default and used by curl.

If I use curl directly myself, I can set the options on a new handle and the following code is sufficient to work successfully:

  # url holds the address to fetch
  h <- curl::new_handle(proxy = "my.proxy.address",
                        proxyuserpwd = ":")
  con <- curl::curl(url, handle = h)
  page <- xml2::read_xml(con)

... but this isn't any help when the use of curl is buried within someone else's function!

Alternatively, I know I can set up an environment variable for the proxy address, like this:

Sys.setenv(https_proxy = "https://my.proxy.address")

... and libcurl picks it up. But if I do just this, then I end up with an HTTP 407 proxy authentication error. Is there a way to specify blank username / password (as the proxyuserpwd setting does), so we authenticate with Windows credentials? It also doesn't seem possible to specify the proxyauth option as an environment variable.
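A minimal sketch of scoping the environment-variable approach to a single call, so that the proxy variables are restored afterwards. The helper name with_proxy_env is hypothetical (it is not part of curl or base R). It only sets http_proxy and https_proxy, so it does not by itself address the 407 authentication error described above.

```r
# Hypothetical helper: set the proxy variables, evaluate `code`, then restore
# whatever values (or absence of values) were there before.
with_proxy_env <- function(proxy, code) {
  vars <- c("http_proxy", "https_proxy")
  old <- Sys.getenv(vars, unset = NA, names = TRUE)
  on.exit({
    for (nm in vars) {
      if (is.na(old[[nm]])) {
        Sys.unsetenv(nm)                       # variable was not set before
      } else {
        do.call(Sys.setenv, as.list(old[nm]))  # put the previous value back
      }
    }
  }, add = TRUE)
  Sys.setenv(http_proxy = proxy, https_proxy = proxy)
  force(code)  # `code` is evaluated lazily, i.e. while the proxy is set
}

# Usage: the expression sees the proxy only for the duration of the call.
seen_inside <- with_proxy_env("my.proxy.address", Sys.getenv("https_proxy"))
```

Any function that lets libcurl read the environment could be passed as the second argument in place of the Sys.getenv() call.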

Can anyone offer a solution or any suggestions, please?

Recommended answer

I was having similar issues. Here are the steps that worked for me:

  1. Download my company's proxy auto-config file (PAC file). For IE: click the gear icon --> internet options --> Connections --> LAN Settings --> copy the http address into a new browser window to download the text file.
  2. Locate the line in the PAC file specifying the proxy (eg: "auth-proxy.xxxxxxx.com:9999")
  3. In a new R session, test these proxy settings by temporarily setting them with a command similar to the following, substituting your values from your PAC file:

Sys.setenv(http_proxy = "auth-proxy.xxxxxxx.com:9999")
Sys.setenv(https_proxy = "auth-proxy.xxxxxxx.com:9999")
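Step 2 above (finding the proxy line in the PAC file) could be sketched in base R as follows. This assumes the PAC file uses the usual "PROXY host:port" return strings; the function name extract_pac_proxies and the sample PAC text are hypothetical, so the pattern may need adjusting for a real file.

```r
# Hypothetical helper: pull every "PROXY host:port" entry out of PAC file text.
extract_pac_proxies <- function(pac_lines) {
  pac_text <- paste(pac_lines, collapse = "\n")
  # Match "PROXY", whitespace, then everything up to a ';', quote or space.
  pattern <- "PROXY[[:space:]]+[^;\"'[:space:]]+"
  matches <- regmatches(pac_text, gregexpr(pattern, pac_text))[[1]]
  unique(sub("^PROXY[[:space:]]+", "", matches))
}

# Made-up PAC content for illustration; a real file would be read with
# readLines() on the downloaded text file.
sample_pac <- c(
  "function FindProxyForURL(url, host) {",
  "  return \"PROXY auth-proxy.example.com:9999; DIRECT\";",
  "}"
)
proxies <- extract_pac_proxies(sample_pac)
```

The first element of proxies is then a candidate value for the Sys.setenv calls in step 3.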

  • Rerun your code in the same session to see whether the new settings solve the issue. This is the test I used:

    xml2::read_html(curl::curl('http://google.com', handle = curl::new_handle("useragent" = "Mozilla/5.0")))
    

  • A proxy set with Sys.setenv only lasts until the end of your current session. To make a more permanent change, you may consider adding the Sys.setenv calls to your R_PROFILE as explained here.
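One way to make the setting permanent is to append the two Sys.setenv calls to a profile file that R runs at startup. The sketch below assumes the user profile lives at ~/.Rprofile; the function name append_proxy_to_profile is hypothetical, and the demonstration writes to a temporary file rather than a real profile.

```r
# Hypothetical helper: append the proxy settings to an R profile file.
append_proxy_to_profile <- function(proxy, profile_path = "~/.Rprofile") {
  lines <- c(
    sprintf('Sys.setenv(http_proxy = "%s")', proxy),
    sprintf('Sys.setenv(https_proxy = "%s")', proxy)
  )
  # write() puts each element on its own line; append = TRUE keeps
  # whatever the profile already contains.
  write(lines, file = profile_path, append = TRUE)
  invisible(lines)
}

# Demonstration against a temporary file instead of the real profile.
demo_profile <- tempfile(fileext = ".Rprofile")
append_proxy_to_profile("auth-proxy.xxxxxxx.com:9999", demo_profile)
written <- readLines(demo_profile)
```

Calling the helper with the default path would modify the real profile, so it is worth inspecting the file first and substituting the value from your own PAC file.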
