How to use NodeJS / PhantomJS / CasperJS on Windows 7


Question

I need to scrape a website form on the fly, and the site uses AJAX and sessions. I did a lot of research and came across several possible solutions, one being Python::Mechanize. I don't know Python, and cURL alone for PHP (from my understanding) cannot handle AJAX or submit forms.

I found what I believe is a stack that can lead me to grace :). The problem is that I do not know how to use these packages at all.


  1. I downloaded and installed NodeJS, and I can call it from CMD. (great)
  2. I downloaded and installed PhantomJS. I am not sure how to set up the PATH so that it works from any directory, so I have to manually cd to its DIR in CMD to get it to load. How can I set this up in Windows 7? I am not sure where to point the path.
  3. I downloaded CasperJS and put it in the DIR.

With PhantomJS I was able to run a test file that echoes 'hello world' in the CMD prompt, and now I have no clue how to proceed. Ultimately I need this to run on the fly from my web server, so it needs to be integrated into my webpage. For now I would just like to run it from CMD and get it to go to a page, submit a form, scrape the results, and write them to a file.

Can someone please explain a workflow for how I can accomplish this?

CasperJS shows this form example. I would like to use it with my own variables, run the script, and save the result.

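// Create the CasperJS instance that the rest of the example relies on.
var casper = require('casper').create();

// Open the page, fill the form fields by name, and submit (third argument: true).
casper.start('http://some.tld/contact.form', function() {
    this.fill('form#contact-form', {
        'subject':    'I am watching you',
        'content':    'So be careful.',
        'civility':   'Mr',
        'name':       'Chuck Norris',
        'email':      'chuck@norris.com',
        'cc':         true,
        'attachment': '/Users/chuck/roundhousekick.doc'
    }, true);
});

// Abort with an error message unless the confirmation text is on the page.
casper.then(function() {
    this.evaluateOrDie(function() {
        return /message sent/.test(document.body.innerText);
    }, 'sending message failed');
});

// Run the queued steps, then report success and exit.
casper.run(function() {
    this.echo('message sent').exit();
});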


Recommended answer

After you install PhantomJS, do the following:


  1. From the Desktop, right-click My Computer and click Properties.
  2. Click the Advanced System Settings link in the left column.
  3. In the System Properties window, click the Environment Variables button.
  4. Find the PATH variable and click Edit.
  5. Add the PhantomJS path at the end of the variable value (don't forget the ; before it).

Now you can use phantomjs from CMD. Ex.: phantomjs c:\mywebsite\with\ajax\dopescript.js

After these steps, download CasperJS and put it in the PhantomJS folder.

Ex.: c:\phantomjs\casperjs

Repeat the previous PATH steps for CasperJS (with \bin at the end).

Ex.: c:\phantomjs\casperjs\bin
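
One way to confirm that both folders ended up on PATH is a small Node script, since NodeJS is already installed. This is only a sketch: the file name check-path.js, the helper names, and the folder c:\phantomjs (inferred from the example paths above) are assumptions, so substitute your own install locations. Open a new CMD window first, because windows that were already open keep the old PATH.

```javascript
// check-path.js: run with `node check-path.js` (hypothetical file name).

// Normalize a folder for comparison: Windows paths are case-insensitive,
// and a trailing backslash does not change which folder is meant.
function normalizeDir(dir) {
    return dir.trim().replace(/[\\\/]+$/, '').toLowerCase();
}

// Return true when `dir` is one of the ;-separated entries in `pathValue`.
function isOnPath(pathValue, dir) {
    var wanted = normalizeDir(dir);
    return pathValue.split(';').some(function (entry) {
        return entry.trim() !== '' && normalizeDir(entry) === wanted;
    });
}

// Assumed install folders; adjust them to match your machine.
var foldersToCheck = ['c:\\phantomjs', 'c:\\phantomjs\\casperjs\\bin'];
var currentPath = process.env.PATH || '';

foldersToCheck.forEach(function (dir) {
    var status = isOnPath(currentPath, dir) ? 'on PATH' : 'missing from PATH';
    console.log(dir + ': ' + status);
});
```

If a folder is reported as missing, recheck the ; separator and the spelling of the entry in the Environment Variables dialog.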

Try casperjs from CMD.

If it's not working, go to the batchbin directory in the casperjs folder and launch casperjs.bat.

Now try to call CasperJS from this folder. (Works for me.)

By now you should have PhantomJS + CasperJS.

About saving the results:

Put var fs = require('fs'); at the beginning of your script and call fs.write('result.html', myData); where myData is the data that you need to save.
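
Putting these pieces together with the form example from the question, a minimal sketch of a script that fills the form, waits for the AJAX result, and saves the page could look like the following. Several things here are assumptions for illustration and not part of the original answer: the file name scrape-form.js, the helper buildFormValues, the field names, and the '#results' selector, which stands in for whatever element the target site renders after submission. The typeof phantom guard only keeps the browser steps from running outside PhantomJS/CasperJS.

```javascript
// scrape-form.js: run with `casperjs scrape-form.js` (hypothetical file name).

// Hypothetical helper: builds the name -> value map passed to casper.fill().
// Replace the field names with the ones your form actually uses.
function buildFormValues(subject, content) {
    return {
        'subject': subject,
        'content': content
    };
}

// The global `phantom` object exists only under PhantomJS/CasperJS,
// so the browser steps below are skipped anywhere else.
if (typeof phantom !== 'undefined') {
    var fs = require('fs');                    // PhantomJS file system module
    var casper = require('casper').create();
    var myData = '';

    // Open the page, fill the form, and submit it (third argument: true).
    casper.start('http://some.tld/contact.form', function() {
        var values = buildFormValues('I am watching you', 'So be careful.');
        this.fill('form#contact-form', values, true);
    });

    // Wait for the AJAX response to render. '#results' is an assumed
    // selector; use one that appears on your page after submission.
    casper.waitForSelector('#results', function() {
        myData = this.getHTML();               // HTML of the whole page
    });

    // Run the queued steps, then write the captured HTML and exit.
    casper.run(function() {
        fs.write('result.html', myData, 'w');
        this.echo('saved result.html').exit();
    });
}
```

Calling this.getHTML('#results') instead would capture only the contents of that element, which may be easier to post-process than the whole page.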

Here is more information about FS: the PhantomJS File System documentation.
