Accessing a webpage with Cloudflare protection


Question

First off, I apologize if my question doesn't provide enough context or anything of that sort; I'm typing this up on my phone right now.

I'm working on a project that requires me to automate tasks within a webpage. Step one is to access the page in the first place, but I've hit an obstacle that I've tried searching for and figuring out, to no avail.

The webpage I'm trying to reach has DDoS protection by Cloudflare, meaning that before you enter the page, your browser is checked for a couple of seconds and then let through.

I'm using the external library HtmlUnit, which provides everything I need. When accessing the page, I get a 503 error saying I cannot access it; I'm fairly sure this is the protection blocking it.
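One way to confirm that suspicion is to look at the status code together with the Server response header, which Cloudflare sets to "cloudflare". The following is only a minimal sketch: the class and method names are hypothetical, and treating 403 as a second blocked status is an assumption that goes beyond the 503 described above.

```java
import java.util.Locale;

/** Hypothetical helper: guesses whether a response is a Cloudflare challenge page. */
final class ChallengeCheck {

    /**
     * @param statusCode   HTTP status of the response
     * @param serverHeader value of the "Server" response header, may be null
     */
    static boolean looksLikeCloudflareChallenge(int statusCode, String serverHeader) {
        if (serverHeader == null) {
            return false;
        }
        boolean servedByCloudflare =
                serverHeader.toLowerCase(Locale.ROOT).contains("cloudflare");
        // Assumption: a challenge arrives as 503 (as in this question) or 403.
        boolean blockedStatus = statusCode == 503 || statusCode == 403;
        return servedByCloudflare && blockedStatus;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeCloudflareChallenge(503, "cloudflare"));
        System.out.println(looksLikeCloudflareChallenge(200, "cloudflare"));
        System.out.println(looksLikeCloudflareChallenge(503, "nginx"));
    }
}
```

With HtmlUnit, the two inputs would come from the WebResponse of the loaded page (its status code and its "Server" header value).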

Now my question is: how should I bypass it? There is a .jar I decompiled and looked at which goes to the same site as me, but the code is far too illegible for me to make out.

I would appreciate help with this task so much, thanks.

For reference, here is an example of a webpage that uses Cloudflare, for testing: www.osbot.org (this isn't the site, BTW).

If you need anything else, please let me know. Again, sorry for the text-only post; it's hard typing this up on my phone and I currently have no PC access.

Edit: I cannot whitelist my IP or get in contact with the site owner.

Recommended answer

I know this question is quite old, but there is no correct answer yet. Here is what works for me:

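// Imports for HtmlUnit 2.x; HtmlUnit 3.x moved these packages to org.htmlunit.
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.util.Cookie;

public class CloudflareExample {
    public static void main(String[] args) {
        WebClient client = new WebClient(BrowserVersion.CHROME);

        // JavaScript must stay enabled so the Cloudflare check can run in the client.
        client.getOptions().setCssEnabled(false);
        client.getOptions().setJavaScriptEnabled(true);
        // Do not throw on the 503 returned with the challenge page.
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        client.getOptions().setRedirectEnabled(true);
        client.getCache().setMaxSize(0);
        client.waitForBackgroundJavaScript(10000);
        client.setJavaScriptTimeout(10000);
        client.waitForBackgroundJavaScriptStartingBefore(10000);

        try {
            String url = "https://www.badlion.net/";

            // The first request is expected to return the challenge page.
            HtmlPage page = client.getPage(url);

            // Wait 7 seconds so the background JavaScript can complete the check.
            synchronized (page) {
                page.wait(7000);
            }

            // Print cookies for test purposes. Comment out in production.
            URL parsedUrl = new URL(url);
            for (Cookie c : client.getCookies(parsedUrl)) {
                System.out.println(c.getName() + "=" + c.getValue());
            }

            // Request the page again; this prints the content after bypassing Cloudflare.
            System.out.println(client.getPage(url).getWebResponse().getContentAsString());
        } catch (FailingHttpStatusCodeException e) {
            e.printStackTrace();
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}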

Just replace the URL in String url = "https://www.badlion.net/"; with the one you are attempting to access.
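The fixed 7-second wait may be longer or shorter than a given site needs. As a rough alternative, the sketch below re-requests until the status is no longer 503 or a time budget runs out. It is not part of the code above: the class name, method name, and parameters are hypothetical, and the request itself is passed in as a function so the example runs without HtmlUnit.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper: retries a request until it stops returning 503 or time runs out. */
final class ChallengeWait {

    /**
     * @param fetchStatus performs one request and returns its HTTP status code
     * @param timeoutMs   total time budget in milliseconds
     * @param pauseMs     pause between attempts in milliseconds
     * @return the last status code observed
     */
    static int pollUntilNot503(IntSupplier fetchStatus, long timeoutMs, long pauseMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        int status = fetchStatus.getAsInt();
        // Keep retrying while the challenge status persists and time remains.
        while (status == 503 && deadline - System.nanoTime() > 0) {
            try {
                Thread.sleep(pauseMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up early.
                Thread.currentThread().interrupt();
                return status;
            }
            status = fetchStatus.getAsInt();
        }
        return status;
    }

    public static void main(String[] args) {
        // Stand-in for a real request: answers 503 twice, then 200.
        int[] calls = {0};
        int status = pollUntilNot503(() -> calls[0]++ < 2 ? 503 : 200, 5_000, 10);
        System.out.println("final status " + status + " after " + calls[0] + " requests");
    }
}
```

With HtmlUnit, the function passed in could wrap client.getPage(url).getWebResponse().getStatusCode(), with the checked IOException handled inside the lambda.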
