Guzzle 7 - 403 Forbidden (works fine with cURL)

Question

UPDATE: It seems that the User-Agent isn't the only header some hosts require before they will serve HTML. I also had to add the Accept header; in the end, this solved the problem for me with many hosts:

    $response = $client->request('GET', 'http://acme.com', ['headers' => [
        'user-agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36',
        'accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    ]]);

I'm trying to use Guzzle to retrieve some websites, but I'm receiving a 403 Forbidden error even though the sites work fine in a browser. I suspect this is down to non-standard User-Agents being forbidden by the host. To get around this, I am trying to set the User-Agent in Guzzle to mimic a browser, but I can't find any method that actually works. I can browse to the website and also use wget and curl -L to download the HTML with no problems, so the issue seems to be with Guzzle.

I have tried:

    $client = new Client(['allow_redirects' => ['track_redirects' => true]]);
    $client->setUserAgent("Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.62 Safari/537.36");
    $response = $client->get($domain_name);

Weirdly, this ^ one results in an error that seems to say Guzzle is trying to browse to the User-Agent value: cURL error 6: Could not resolve host: Mozilla (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for Mozilla/5.0%20(Windows%20NT%206.2;%20WOW64)%20AppleWebKit/537.36%20(KHTML,%20like

    $domain_name = 'http://www.' . $domain_name;
    $client = new Client(['headers' => ['User-Agent' => 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36']]);
    $response = $client->get($domain_name);

^ Results in a "Client error: `GET http://www.xxx.co.uk` resulted in a `403 Forbidden`" error

    $domain_name = 'http://www.' . $domain_name;
    $client = new Client(['allow_redirects' => ['track_redirects' => true]]);
    $client->setServerParameter('user-agent', "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36");
    $response = $client->get($domain_name);

^ Results in an "Argument 3 passed to GuzzleHttp\Client::request() must be of the type array, string given" error

    $domain_name = 'http://www.' . $domain_name;
    $client = new Client(['allow_redirects' => ['track_redirects' => true]]);
    $client->setHeader("user-agent", "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36");
    $response = $client->get($domain_name);

^ Also results in an "Argument 3 passed to GuzzleHttp\Client::request() must be of the type array, string given" error

Any suggestions? I think I've gone down a rabbit hole here!

I'm wondering if something else is going on here because, as I understand it, Guzzle is just a wrapper for cURL, and cURL can fetch the same web page from the same IP with no problem.
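
One way to check that suspicion is to look at the headers Guzzle actually puts on the request and compare them with what `curl -v` prints for the same URL. The sketch below is only an illustration: it assumes Guzzle 7 is installed through Composer (the `vendor/autoload.php` path is an assumption), and it uses Guzzle's `MockHandler` so that nothing is sent over the network. Because of the mock, it shows only the headers Guzzle sets itself, not anything added at the transport layer.

```php
<?php
// Assumption: guzzlehttp/guzzle 7 was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Psr7\Response;

// The history middleware records every request that passes through the stack.
$history = [];

// MockHandler returns a canned response instead of contacting a server.
$stack = HandlerStack::create(new MockHandler([new Response(200)]));
$stack->push(Middleware::history($history));

$client = new Client(['handler' => $stack]);
$client->request('GET', 'http://acme.com');

// Print the headers of the recorded request, one per line.
foreach ($history[0]['request']->getHeaders() as $name => $values) {
    echo $name . ': ' . implode(', ', $values) . PHP_EOL;
}
```

If the printed list has no Accept header or carries a Guzzle-specific User-Agent, that difference from the curl command line is a likely candidate for what the host is rejecting.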

Recommended answer

UPDATE: It seems that the User-Agent isn't the only header some hosts require before they will serve HTML. I also had to add the Accept header; in the end, this solved the problem for me with many hosts:

    $response = $client->request('GET', 'http://acme.com', ['headers' => [
        'user-agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36',
        'accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    ]]);
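
For context on the failed attempts in the question: as far as I can tell, the Guzzle 6 and 7 clients have no `setUserAgent()`, `setHeader()`, or `setServerParameter()` methods. Unknown method calls appear to fall through to the client's magic `__call()`, which treats the method name as the HTTP verb and the first argument as the URI. That would explain both the "Could not resolve host: Mozilla" error and the "Argument 3 ... must be of the type array" error.

If the same headers are needed for every request, they can be set once as client defaults instead of per call. The following is a minimal sketch, not a definitive implementation: the function name `createBrowserLikeClient` is hypothetical, the Composer autoload path is an assumption, and the header values are simply the ones from the answer above. Setting `http_errors` to `false` is an optional choice here so that a 403 can be inspected as a normal response rather than being thrown as an exception.

```php
<?php
// Assumption: guzzlehttp/guzzle 7 was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

/**
 * Hypothetical helper: builds a client whose default headers mimic a browser.
 * Keys passed in $config take precedence over the defaults below.
 */
function createBrowserLikeClient(array $config = []): Client
{
    $defaults = [
        'headers' => [
            'User-Agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36',
            'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        ],
        // Same redirect tracking as in the question's attempts.
        'allow_redirects' => ['track_redirects' => true],
        // Return 4xx/5xx responses instead of throwing, so the status can be checked.
        'http_errors' => false,
    ];

    // Array union: entries in $config win over $defaults.
    return new Client($config + $defaults);
}
```

Usage would then look like `$response = createBrowserLikeClient()->request('GET', 'http://acme.com');` followed by a check of `$response->getStatusCode()`.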
