How to download a CSV file using PhantomJS


Question

When I browse website A in a regular browser (Chrome) and click a link on it, Chrome immediately downloads a report as a CSV file.

When I inspect the server response headers, I see the following:

Cache-Control:private,max-age=31536000
Connection:Keep-Alive
Content-Disposition:attachment; filename="report.csv"
Content-Encoding:gzip
Content-Language:de-DE
Content-Type:text/csv; charset=UTF-8
Date:Wed, 22 Jul 2015 12:44:30 GMT
Expires:Thu, 21 Jul 2016 12:44:30 GMT
Keep-Alive:timeout=15, max=75
Pragma:cache
Server:Apache
Transfer-Encoding:chunked
Vary:Accept-Encoding

Now I want to download and parse this file using PhantomJS. I set the page's onResourceReceived listener to see whether PhantomJS receives/downloads the file.

clientRequests.phantomPage.onResourceReceived = function(response) {
    console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + JSON.stringify(response));
};

When I ask PhantomJS to fetch the file (via page.open('URL OF THE FILE')), the PhantomJS log shows that the file was downloaded. Here is an excerpt from the log:

"contentType": "text/csv; charset=UTF-8",
"headers": [
    {
        "name": "Date",
        "value": "Wed, 22 Jul 2015 12:57:41 GMT"
    },
    {
        "name": "Content-Disposition",
        "value": "attachment; filename=\"report.csv\""
    }
],
"status": 200,
"statusText": "OK"

PhantomJS received the file and its content, but how do I access the file data? When I print the current PhantomJS page object, I get the HTML of page A, which is not what I want. I want the CSV file, which I need to parse using JavaScript.
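
As a small aid for reading such log entries, the sketch below checks whether a received response is a file attachment. It relies on PhantomJS passing response.headers as an array of { name, value } objects; the helper names getHeader and isAttachment are hypothetical and not part of PhantomJS. It only inspects response metadata and does not read the file body.

```javascript
// Hypothetical helper: case-insensitive lookup of one response header.
// Expects response.headers to be an array of { name, value } objects.
function getHeader(response, name) {
    var wanted = name.toLowerCase();
    var headers = response.headers || [];
    for (var i = 0; i < headers.length; i++) {
        if (headers[i].name.toLowerCase() === wanted) {
            return headers[i].value;
        }
    }
    return null;
}

// Hypothetical helper: true when Content-Disposition starts with "attachment".
function isAttachment(response) {
    var disposition = getHeader(response, 'Content-Disposition');
    return disposition !== null && /^\s*attachment/i.test(disposition);
}

// Possible use in a PhantomJS script (assumes `page` is a webpage object):
//   page.onResourceReceived = function (response) {
//       if (response.stage === 'end' && isAttachment(response)) {
//           console.log('Attachment received from ' + response.url);
//       }
//   };
```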

Recommended answer

After days of investigation, I found that there are a few possible solutions:

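  • Inside your evaluate function, you can make an AJAX call to download and encode the file, then return that content to the Phantom script
  • You can use one of the custom Phantom libraries available on GitHub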
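
The first option could look roughly like the sketch below. This is not the answerer's code: fetchCsvInPage is a hypothetical helper name, and the sketch assumes that the CSV is served from the same origin as the page already loaded (or that PhantomJS was started with --web-security=false) and that the body is plain text, so no encoding step is needed. The function is meant to be passed to page.evaluate, which runs it inside the page, where XMLHttpRequest is available.

```javascript
// Runs inside the page context when passed to page.evaluate().
// Returns the response body as text, or null for any non-200 status.
function fetchCsvInPage(url) {
    var xhr = new XMLHttpRequest();
    // Third argument false = synchronous, so the body can simply be returned.
    xhr.open('GET', url, false);
    xhr.send(null);
    return xhr.status === 200 ? xhr.responseText : null;
}

// Possible use in a PhantomJS script, once page.open() of site A has succeeded
// (the URL placeholder is the one from the question):
//   var csvText = page.evaluate(fetchCsvInPage, 'URL OF THE FILE');
//   console.log(csvText);
```

Because the request is made from within the page, it is sent with that page's cookies, which matters when the report link requires a logged-in session.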

If you need to download a file using PhantomJS, my advice is to move away from plain PhantomJS and use CasperJS. CasperJS is built on top of PhantomJS, but it has a much better and more intuitive syntax and program flow.

Here is a good post explaining "Why CasperJS is better than PhantomJS". It includes a section about file download.

How to download a CSV file using CasperJS (this works even when the server sends the header Content-Disposition: attachment; filename='file.csv')

Here is a sample CSV file available for download: http://captaincoffee.com.au/dump/items.csv

To download this file using CasperJS, run the following code:

var casper = require('casper').create();

// Open the site first so later requests are made from its page context
casper.start("http://captaincoffee.com.au/dump/", function() {
    this.echo(this.getTitle());
});

// Fetch the CSV with a GET request and print its content as base64
casper.then(function(){
    var url = 'http://captaincoffee.com.au/dump/csv.csv';
    require('utils').dump(this.base64encode(url, 'get'));
});

casper.run();

The code above downloads the CSV file at http://captaincoffee.com.au/dump/csv.csv and prints the result as a base64 string. This way you don't even have to save the data to a file; you already have it as a base64 string.
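
Since the goal in the question is to parse the CSV with JavaScript, the base64 string still has to be decoded. The sketch below shows one way to do that. The names base64ToText and parseSimpleCsv are hypothetical, the code assumes that atob() is available in the script context and that the file is UTF-8 encoded (as the Content-Type header in the question suggests), and the parser is deliberately naive.

```javascript
// Decode a base64 string (such as the one returned by casper.base64encode)
// into text. atob() yields one character per byte; the escape/decodeURIComponent
// pair then reinterprets those bytes as UTF-8.
function base64ToText(b64) {
    return decodeURIComponent(escape(atob(b64)));
}

// Naive CSV parser: splits on line breaks and commas and skips empty lines.
// It does NOT handle quoted fields that contain commas or line breaks.
function parseSimpleCsv(text) {
    return text.split(/\r?\n/)
        .filter(function (line) { return line.length > 0; })
        .map(function (line) { return line.split(','); });
}

// Possible use inside a casper.then() step:
//   var rows = parseSimpleCsv(base64ToText(this.base64encode(url, 'get')));
//   this.echo('Parsed ' + rows.length + ' rows');
```

For real-world reports with quoted fields, a dedicated CSV parsing library is a safer choice than the splitting shown here.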

If you explicitly want to save the file to the file system, you can use the download function that CasperJS provides.
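
A minimal sketch of that approach is shown below. It wraps the call to the CasperJS download(url, target) method in a hypothetical helper, queueCsvDownload, so that the step can be queued on any casper instance. The target file name report.csv is an assumption chosen for illustration.

```javascript
// Hypothetical helper: queue a step that saves fileUrl to targetPath.
// Inside a then() callback, `this` refers to the casper instance.
function queueCsvDownload(casper, fileUrl, targetPath) {
    casper.then(function () {
        this.download(fileUrl, targetPath);
    });
}

// Possible use in a CasperJS script:
//   var casper = require('casper').create();
//   casper.start('http://captaincoffee.com.au/dump/');
//   queueCsvDownload(casper, 'http://captaincoffee.com.au/dump/csv.csv', 'report.csv');
//   casper.run();
```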
