php cURL log into jsp website and return HTML


Question

I'm trying to use cURL to log into a jsp/tomcat website (we'll call it https://unknown.com for privacy reasons) and return the HTML from a page. I've watched the Net panel in Firebug and the cookie panel in Firecookie to outline the manual steps below:

  1. Open web root - https://unknown.com
  2. Redirected to https://unknown.com/common/frames.jsp (cookie created: JSESSIONID)
  3. Fill out j_username and j_password
  4. Post "j_username=user&j_password=pass&submit=logon" to https://unknown.com/common/j_security_check
  5. Redirected to https://unknown.com/common/frames.jsp
  6. User selects the link on the home page that leads to the HTML to be returned.

So basically I don't have a lot of experience with cURL and I'm not having much luck. I really just need to understand the steps that cURL requires to log in to the site and go to the destination page.

Edit: here is my code:

//user login information
$username = "user";
$password = "pass";

$postData = "j_username=".$username."&j_password=".$password."&logon=submit";

$cookie_file = "/tmp/curl_cookies.txt";

//$fp = fopen($cookie_file, "w");
//fclose($fp);

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, 'https://unknown.com/common/j_security_check');
curl_setopt($ch, CURLOPT_POSTFIELDS,$postData);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3");
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
curl_setopt($ch, CURLOPT_REFERER, "https://unknown.com/common/Frames.jsp");
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

$data = curl_exec($ch);
curl_close($ch);

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, 'https://unknown.com/claritymatch/ClarityBatchViewer.jsp?id=123');
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3");
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

$data = curl_exec($ch);

curl_close($ch);
echo $data;

It doesn't work the first time I run the .php file, but the second time it brings up the destination HTML. How can I get it to bring it up the first time? Also, since I'm storing the JSESSIONID cookie in the file indicated above, won't I run into problems with that session ID not changing, or will it change as needed?

Recommended answer

Here are a few suggestions for your situation...

  • Re-use the same curl handle for simplicity
    This reduces the need to duplicate options for each request. Set the majority of your options at the beginning and do it only once. I refer mostly to cookie options, user-agent, follow-location etc.
    You can then set the URL and request method for each individual request you make.
    You can even gain additional performance by adding a Keep-Alive header to your request so if the remote server supports it, the same connection will be used to make multiple requests without having to reconnect each time.

  • Set CURLOPT_FOLLOWLOCATION to true and start from the beginning
    Try to follow exactly what you see the browser do. That is, request the web root; if the site redirects you to the security check URL, cURL will follow that redirect and capture any cookies set in the process. One cURL request can result in multiple HTTP requests if a redirect is sent. Then proceed to "fill out" the login form.

  • Use http_build_query() for your post data
    There is nothing wrong with the way you set up your post string, but the data must be URL-encoded. Using http_build_query() with an array is easier to manipulate and results in a URL-encoded string you can feed directly to cURL.
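
As a small illustration of the last suggestion, the sketch below builds the login body with http_build_query(). The field names come from the question's login form; the password value is made up here purely to show what encoding does to special characters.

```php
<?php
// Field names are from the question's form; the values are placeholders.
$fields = [
    'j_username' => 'user',
    'j_password' => 'p&ss word', // "&" and the space must be encoded
    'submit'     => 'logon',
];

// Each key and value is URL-encoded and the pairs are joined with "&".
// The separator is passed explicitly so the result does not depend on
// the arg_separator.output ini setting.
$postData = http_build_query($fields, '', '&');

echo $postData, PHP_EOL;
// prints: j_username=user&j_password=p%26ss+word&submit=logon
```

The resulting string can be passed as the value of CURLOPT_POSTFIELDS. Concatenating the raw password by hand would have split it at the "&" into a separate, bogus field.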

See also this answer, which I posted a couple of days ago for a person trying to do something similar. I also posted a few references to other answers that contain full samples of requesting multiple URLs using cURL; just looking at those answers should help you get an idea of how to do what you want. Especially see this answer, which was the first reference in the post I mentioned, as it shows how to log into Google by making several POST requests and finally a GET request.
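
Putting the three suggestions together for the site in the question might look roughly like the sketch below. It is not a tested implementation for that server: the function name fetchWithLogin() and its parameters are hypothetical, the paths are the ones listed in the question, and it assumes the site needs nothing beyond the session cookie and the three form fields. The point is the order of requests on a single handle: web root first so the server can issue JSESSIONID, then the login POST, then the target page.

```php
<?php
// Hypothetical helper: log in with one cURL handle, then fetch a page.
function fetchWithLogin(
    string $baseUrl,
    string $user,
    string $pass,
    string $targetPath,
    string $cookieFile
): string {
    $ch = curl_init();

    // Options shared by every request made on this handle.
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // enables cookie handling
        CURLOPT_COOKIEJAR      => $cookieFile, // saved when the handle is closed
    ]);

    // Run the current request and fail loudly on transport errors.
    $exec = static function () use ($ch): string {
        $body = curl_exec($ch);
        if ($body === false) {
            throw new RuntimeException('cURL error: ' . curl_error($ch));
        }
        return $body;
    };

    // 1. Request the web root; redirects are followed and cookies kept.
    curl_setopt($ch, CURLOPT_URL, $baseUrl . '/');
    $exec();

    // 2. Submit the login form with URL-encoded fields.
    curl_setopt_array($ch, [
        CURLOPT_URL        => $baseUrl . '/common/j_security_check',
        CURLOPT_POST       => true,
        CURLOPT_POSTFIELDS => http_build_query([
            'j_username' => $user,
            'j_password' => $pass,
            'submit'     => 'logon',
        ], '', '&'),
    ]);
    $exec();

    // 3. Switch back to GET and fetch the destination page.
    curl_setopt_array($ch, [
        CURLOPT_URL     => $baseUrl . $targetPath,
        CURLOPT_HTTPGET => true,
    ]);
    $html = $exec();

    curl_close($ch);
    return $html;
}

// Example call, using the placeholder values from the question:
// echo fetchWithLogin('https://unknown.com', 'user', 'pass',
//     '/claritymatch/ClarityBatchViewer.jsp?id=123', '/tmp/curl_cookies.txt');
```

Because all three requests share one handle, the session cookie set during the first request is already available for the login POST, without depending on a cookie file written by an earlier run.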
