cookiejar in PHP Curl


Question


In PHP Curl, when we need to store and read cookies for web scraping, it seems that many resources out there recommend handling the cookies with a file, using these options:

curl_setopt($ch, CURLOPT_COOKIEJAR, $CookieJarFilename);

curl_setopt($ch, CURLOPT_COOKIEFILE, $CookieJarFilename);

The bottom line is that they use a single file as the cookie jar (usually a .txt file).

But in a real scenario, our website is not accessed by just one computer. Most likely many computers access it at the same time, along with bots such as Googlebot, Yahoo Slurp, etc.

So, with a single .txt file, isn't it obvious that every visitor's cookies will overwrite the same text file and make a real mess of the cookies?

Or am I mistaken here?

What's the 'right' method for handling cookies?

Solution

If multiple people access your page and you need to perform curl requests with unique cookies for each of them, there are several things you can do to handle this scenario.

1) If your user is authenticated and has a $_SESSION started on your end, you can use session_id() as the cookie file's name.

2) If your user doesn't require a session (a Google bot, for example), you can build the cookie file name from a timestamp plus an extra random string. For example:

$cookieName = time()."_".substr(md5(microtime()),0,5).".txt";
// $cookieName would then be something like:
// `1388788940_91ab4.txt`

But in this case, you cannot reuse the cookie file if the user comes back 5 minutes later (unless you store the cookie file name in a cookie on the user's side).
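
The two options above could be combined roughly as follows. This is only a minimal sketch, not code from the answer: `cookieJarPath()`, `fetchWithJar()`, the `$jarDir` argument and the `cj_` prefix are hypothetical names. Hashing the session ID instead of using it directly, and using `random_bytes()` instead of a timestamp, are added assumptions.

```php
<?php
// Hypothetical helper: builds one cookie jar path per visitor.
// The "cj_" prefix and the directory layout are illustrative assumptions.
function cookieJarPath(string $jarDir, ?string $sessionId = null): string
{
    if ($sessionId !== null && $sessionId !== '') {
        // Option 1: derive the name from the session ID. Hashing keeps the
        // raw session ID out of the file system (an assumption, not required).
        $name = 'cj_' . hash('sha256', $sessionId);
    } else {
        // Option 2: no session, so use a random name. It cannot be found
        // again later unless the caller stores it somewhere.
        $name = 'cj_' . bin2hex(random_bytes(16));
    }
    return rtrim($jarDir, '/') . '/' . $name . '.txt';
}

// Hypothetical helper: performs a GET request using the given jar file.
// Returns the response body, or false on failure.
function fetchWithJar(string $url, string $jarFile)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $jarFile); // cookies are read from here
    curl_setopt($ch, CURLOPT_COOKIEJAR, $jarFile);  // cookies are saved here
    $body = curl_exec($ch);
    unset($ch); // releasing the handle lets curl write the jar file
    return $body;
}
```

On a page with a session, the call would be something like `fetchWithJar($url, cookieJarPath($jarDir, session_id()))`. For a visitor without a session, omit the second argument to `cookieJarPath()`.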

For either case, make sure you clean up these files periodically. Otherwise you'll end up with tons of cookie files in your directory.
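
A periodic cleanup could look something like the sketch below, for example run from a cron job. It assumes the cookie files live in a directory used for nothing else and that last-modified time is a good enough measure of age. `purgeOldCookieJars()`, the `*.txt` pattern and the age limit are hypothetical choices, not part of the answer.

```php
<?php
// Hypothetical cleanup helper: deletes cookie jar files older than
// $maxAgeSeconds and returns how many files were removed.
// Assumes $jarDir is dedicated to cookie jar files.
function purgeOldCookieJars(string $jarDir, int $maxAgeSeconds, string $pattern = '*.txt'): int
{
    $removed = 0;
    $cutoff = time() - $maxAgeSeconds;

    // glob() may return false on error, so fall back to an empty list.
    $files = glob(rtrim($jarDir, '/') . '/' . $pattern) ?: [];

    foreach ($files as $file) {
        $mtime = filemtime($file);
        // Only delete files whose last modification is older than the cutoff.
        if ($mtime !== false && $mtime < $cutoff && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}
```

For example, `purgeOldCookieJars($jarDir, 3600)` would remove jar files that have not been written to for more than an hour.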
