How to fix 403 response when using HttpURLConnection in Selenium, since the links open manually without any issue
Problem description
I am checking the active links on a website with Selenium WebDriver and Java. I collect the links into a list, and when I verify them I get a 403 Forbidden response for every link on the site. It is a public website that anyone can access, and the links work properly when clicked manually. I would like to know why the response is not 200 and what can be done in this situation.
This is for Selenium WebDriver with Java.
for (int j = 0; j < activelinks.size(); j++) {
    String href = activelinks.get(j).getAttribute("href");
    System.out.println("Active Link address and status >>> " + href);
    HttpURLConnection connection = (HttpURLConnection) new URL(href).openConnection();
    connection.connect();
    String response = connection.getResponseMessage();
    int responsecode = connection.getResponseCode();
    connection.disconnect();
    System.out.println(href + ">>" + response + " " + responsecode);
}
I expect the response code to be 200, but the actual output is 403.
Recommended answer
403 Forbidden
The HTTP 403 Forbidden client error status response code indicates that the server understood the request but refuses to authorize it.
This status is similar to 401, but in this case re-authenticating will make no difference. The access is permanently forbidden and tied to the application logic, such as insufficient rights to a resource.
I don't see any such issue in your code block. However, it is possible that the WebDriver-controlled browser client is being detected, so the subsequent requests are blocked. Detection can rely on numerous factors, such as:
- User agent
- Plugins
- Languages
- WebGL
- Browser features
- Missing image
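Of the factors listed above, the user agent is the one that also applies directly to the HttpURLConnection check in the question: that request is sent by the JVM rather than by the browser, and by default it identifies itself with a Java user agent string. The following is a minimal sketch, assuming the server filters on that header; the class name LinkStatusChecker, its methods, and the user agent value are illustrative and not part of the original code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class LinkStatusChecker {
    // Assumed browser-like value; any real browser's user agent string could be used instead.
    static final String BROWSER_USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";

    // Prepares a connection carrying the browser-like User-Agent header.
    // No network traffic happens until the response is requested.
    static HttpURLConnection open(String href) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(href).openConnection();
        connection.setRequestProperty("User-Agent", BROWSER_USER_AGENT);
        return connection;
    }

    // Sends the request and returns the HTTP status code.
    static int statusOf(String href) throws IOException {
        HttpURLConnection connection = open(href);
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }
}
```

Inside the loop from the question, LinkStatusChecker.statusOf(href) could stand in for the connect/getResponseCode/disconnect sequence. If the status is still 403, the server is probably checking something other than the user agent.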
You can find a couple of detailed discussions in:
- How does recaptcha 3 know I'm using selenium/chromedriver?
- Selenium and non-headless browser keeps asking for Captcha
A generic solution is to use a proxy or rotating proxies from the Free Proxy List.
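As a rough sketch of the proxy suggestion, the status check can be routed through an HTTP proxy by passing a java.net.Proxy to openConnection. The class name ProxiedLinkChecker and its methods are hypothetical, and the proxy host and port are placeholders to be replaced with a proxy you are actually allowed to use. This affects only the HttpURLConnection request, not the traffic of the WebDriver-controlled browser.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

class ProxiedLinkChecker {
    // Builds an HTTP proxy description; host and port are supplied by the caller.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Requests the link through the given proxy and returns the HTTP status code.
    static int statusVia(Proxy proxy, String href) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(href).openConnection(proxy);
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }
}
```

For rotating proxies, one option is to keep a list of Proxy objects built with httpProxy and pick a different one for each link.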
Outro
You can find a couple of relevant discussions in:
- Can a website detect when you are using selenium with chromedriver?
- Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection
- Failed to load resource: the server responded with a status of 429 (Too Many Requests) and 404 (Not Found) with ChromeDriver Chrome through Selenium