How to download a file through a custom POST request with CasperJS


Question

I am writing a crawler and need to download a file that is generated after a form is submitted via POST.

I have successfully used this.download(url, 'POST', Params) for regular forms. One of the sites has many fields that share the same name, which prevents me from using the regular download method.
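To illustrate why repeated field names are a problem for an object-style parameter but not for a serialized string, here is a minimal sketch in plain JavaScript. The field names (`item`, `lang`) and the helper `serializePairs` are hypothetical; they only stand in for what a form serializer such as jQuery's `.serialize()` would produce in the page.

```javascript
// Hypothetical form fields: the name "item" appears twice.
var pairs = [['item', 'a'], ['item', 'b'], ['lang', 'fr']];

// As an object, the second "item" silently overwrites the first.
var asObject = {};
pairs.forEach(function (p) {
    asObject[p[0]] = p[1];
});

// As a urlencoded string, every name/value pair survives.
function serializePairs(list) {
    return list.map(function (p) {
        return encodeURIComponent(p[0]) + '=' + encodeURIComponent(p[1]);
    }).join('&');
}
var asString = serializePairs(pairs);
```

The object ends up with only two keys, while the string still carries all three pairs, which is why a serialized string is the safer thing to post for such a form.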

After trying a lot of things, I tried $.ajax() and __utils__.sendAJAX() to process the form like this:

response = this.evaluate(function(){
  url=...
  params = $('form#theirForm').serialize();
  data = __utils__.sendAJAX(url, 'POST', params, false, {contentType: "application/x-www-form-urlencoded"});
  return __utils__.encode(data);
});
function decode_base64(s) { var e={},i,k,v=[],r='',w=String.fromCharCode; var n=[[65,91],[97,123],[48,58],[43,44],[47,48]]; for(z in n){for(i=n[z][0];i<n[z][1];i++){v.push(w(i));}} for(i=0;i<64;i++){e[v[i]]=i;} for(i=0;i<s.length;i+=72){ var b=0,c,x,l=0,o=s.substring(i,i+72); for(x=0;x<o.length;x++){ c=e[o.charAt(x)];b=(b<<6)+c;l+=6; while(l>=8){r+=w((b>>>(l-=8))%256);} } } return r; }
casper.then(function() {
    utils.dump(response);
    fs.write("test.zip",decode_base64(response),'w');
});

The code returns base64 data, which I decode and store in a test.zip file. But I just can't uncompress it; the archive is reported as corrupted. I dumped the data of a correct zip file:

PK^C^D^T^@^H^@^H^@<F4><89><96>F^@^@^@^@^@^@^@^@^@^@^@^@?^@^@^@fourniture denr<E9>es alimentaires - dietetique infantile\CCAP.pdf<AC><BC>^ET\K<D3><F7>;^D<B7><U+0B81>^@<C1><99>^Y^F'^D<B7><E0>^D^ON<90><E0><EE><EE><EE>^Dwww'^P<9C>^D^H<EE>^^܂<C3>%'<CF>9<E7><C9><F7><U+07B5><BE>7<F7>f^SVOzf

I compared it with the first line of my file:

PK^C^D^T^@^H^@^H^@)_^M^@^@^@^@^@^@^@^@^@^@^@^@^@b^@^@^@fourniture denr<FD>es alimentaires - dietetique infantile\Bordereau de prix dietetique infantile.xlsx<FD>zuT<FD>I<FD><FD><FD><FD>^^4hp^M^D^M^R^H<FD>.<FD><FD>}p<FD>3<FD>kpw<FD>@pw^M<FD><FD>^R4<FD>Gv<FD>~<FD>[<FD><FD><FD><FD><FD>

Does anyone have an idea of what could have gone wrong?

I have tried so many things (encoding tools, encoding settings, dumping from the Chrome console to get pure base64, etc.).

I don't understand why it would be related to Latin-1 or UTF-8 encoding, since the website asks me to select which encoding to use. I tried both.

Recommended answer

casper.download() happily accepts a serialized form string instead of an object, so you can still use it. You just have to serialize the form in the page context beforehand:

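var formData = casper.evaluate(function(){
  return $('form#theirForm').serialize();
});

var url; // set this to the URL the form posts to
// targetFile is the local path to save to; pass the serialized string as the data
casper.download(url, targetFile, 'POST', formData);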

The only problem might be that a different MIME type is used: "text/plain; charset=x-user-defined".
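For context, the "x-user-defined" charset is a common way to read binary data through an XMLHttpRequest's text response: bytes 0x00 to 0x7F map to themselves and bytes 0x80 to 0xFF map to code points U+F780 to U+F7FF, so masking each character code with 0xff recovers the original byte. The sketch below shows that masking step in isolation; the helper name `toBytes` and the simulated response string are assumptions for illustration, not part of CasperJS.

```javascript
// Hypothetical helper: recover raw byte values from a string that was
// decoded as "x-user-defined" text.
function toBytes(text) {
    var bytes = [];
    for (var i = 0; i < text.length; i++) {
        // Keep only the low 8 bits of each character code.
        bytes.push(text.charCodeAt(i) & 0xff);
    }
    return bytes;
}

// Simulated response text: "PK", 0x03, 0x04, then the byte 0xF4,
// which x-user-defined represents as U+F7F4.
var simulated = 'PK\u0003\u0004\uF7F4';
var bytes = toBytes(simulated);
```

If the response were decoded with a different charset instead, high bytes could be replaced or merged during decoding, and masking would no longer give back the original data.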

In that case, you will have to recreate the whole cascade of functions that casper.download() goes through:

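var fs = require('fs');
// create() returns a ClientUtils instance, which provides decode()
var cu = require('clientutils').create();

var url; // set this to the URL the form posts to
var response = casper.evaluate(function(url){
    var params = $('form#theirForm').serialize();
    // synchronous request (async = false); returns the raw response text
    var data = __utils__.sendAJAX(url, 'POST', params, false);
    // base64-encode so the binary data survives the trip out of the page context
    return __utils__.encode(data);
}, url);

// 'wb' writes in binary mode so the decoded bytes are not re-encoded
fs.write("test.zip", cu.decode(response), 'wb');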

__utils__.sendAJAX() uses "application/x-www-form-urlencoded" by default, so the content type does not need to be passed explicitly.
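As a quick sanity check before writing the file, one could verify that the decoded data starts with the zip local file header signature ("PK" followed by the bytes 0x03 and 0x04), which is what the dump of the correct archive in the question begins with. The helper name `hasZipSignature` is hypothetical, and the sketch assumes the decoded data is a binary string with one character per byte.

```javascript
// Hypothetical helper: true if the binary string begins with the
// zip local file header signature "PK\x03\x04".
function hasZipSignature(binaryString) {
    return binaryString.length >= 4 &&
        (binaryString.charCodeAt(0) & 0xff) === 0x50 && // 'P'
        (binaryString.charCodeAt(1) & 0xff) === 0x4b && // 'K'
        (binaryString.charCodeAt(2) & 0xff) === 0x03 &&
        (binaryString.charCodeAt(3) & 0xff) === 0x04;
}

var looksLikeZip = hasZipSignature('PK\u0003\u0004rest of archive');
var looksLikeHtml = hasZipSignature('<html>error page</html>');
```

A failed check usually means the server returned something other than the archive (for example an error page). A passing check does not prove the rest of the archive is intact, since corruption of high bytes later in the file would not be caught here.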
