CasperJS: how to click multiple links in a table while collecting data from the web / .click() doesn't work?


Problem description

I want to scrape some web data using CasperJS. The data is in a table, and each row contains a link leading to a page with more detail. The script has a loop that iterates over all table rows. I want Casper to click the link, collect the data on the sub-page, and go back one history step to process the next table row. The problem is that click() doesn't work and I don't know why. Is there any way to fix this? (Note: the link's href invokes a JavaScript function, viewContact.)

Here is the code:

var employee = {
    last_name: "",
    first_name: "",
    position: "",
    department: "",
    location: "",
    email: "",
    phone: "",
    twitter: ""
};

var employees = [];
var result_number = 50;
var start_url = 'https://www.jigsaw.com/SearchContact.xhtml?companyId=489781&orderby=0&order=0&opCode=paging&mode=0&estimatedCount=126&dead=false&rpage=1&rowsPerPage=200';

var casper = require('casper').create({
    javascriptEnabled: true
});

casper.start(start_url, function() {
    var js = this.evaluate(function() {
    return document;
});

     for (var i = 1; i <= result_number; i++)
     {        
        // j stands for three neighbour td columns containing: 
        // position, name+link, location

        employee.position = this.getHTML('#sortableTable tr:nth-child(' + i + ') td:nth-child(3) span');

        // click link and get other data
        this.click('#sortableTable tr:nth-child(' + i + ') td:nth-child(4) span a');
            employee.first_name = this.getHTML('#sortableTable tr:nth-child(' + i + ') td:nth-child(4) span a');

        //collect data
        this.waitForSelector('#firstname', function() {
            employee.first_name = this.getHTML('#firstname');
        });

        this.waitForSelector('#lastname', function() {
            employee.last_name = this.getHTML('#lastname');
        });
        this.waitForSelector('#state', function() {
            employee.department = this.getHTML('#state');
        });
        this.waitForSelector('#email', function() {
            employee.email = this.getHTML('#email');
        });
        this.waitForSelector('#phone', function() {
            employee.phone = this.getHTML('#phone');
        });

        //get back to previous page
        this.back();

        employee.location = this.getHTML('#sortableTable tr:nth-child(' + i + ') td:nth-child(5) span');

        this.echo('\n\n Employee number: ' + i + " :\n");
        this.echo('first name : ' + employee.first_name);
        this.echo('last name  : ' + employee.last_name);
        this.echo('position   : ' + employee.position);
        this.echo('department : ' + employee.department);
        this.echo('location   : ' + employee.location);
        this.echo('email      : ' + employee.email);
        this.echo('phone      : ' + employee.phone);

}

});

casper.run();


Recommended answer

I see two things here that need to be corrected. First, the for loop in your code doesn't appear to be inside any CasperJS step method.

That is:

for (var i = 1; i <= result_number; i++)

It should be inside a casper.then method.
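A minimal sketch of what that could look like, assuming (as far as I understand CasperJS) that then(), waitForSelector() and back() only queue steps for later, while click(), getHTML() and echo() run immediately. The selectors are taken from the question; scheduleRow and scheduleRows are hypothetical helper names, and whether the link produced by viewContact really navigates to a new page is also an assumption.

```javascript
// Sketch only: `casper` is expected to be a CasperJS instance, e.g. the
// result of require('casper').create(). scheduleRow and scheduleRows are
// hypothetical helpers, not part of CasperJS.
function scheduleRow(casper, row, employees) {
    var base = '#sortableTable tr:nth-child(' + row + ') ';
    var employee = {}; // a fresh object per row, so rows do not overwrite each other

    // Step 1: read the cells on the list page, then click the detail link.
    casper.then(function () {
        employee.position = casper.getHTML(base + 'td:nth-child(3) span');
        employee.location = casper.getHTML(base + 'td:nth-child(5) span');
        casper.click(base + 'td:nth-child(4) span a');
    });

    // Step 2: runs only once #firstname is present on the detail page.
    casper.waitForSelector('#firstname', function () {
        employee.first_name = casper.getHTML('#firstname');
        employee.last_name = casper.getHTML('#lastname');
        employee.department = casper.getHTML('#state'); // mapping as in the question
        employee.email = casper.getHTML('#email');
        employee.phone = casper.getHTML('#phone');
        employees.push(employee);
    });

    // Step 3: go back one history step, then wait for the table again
    // before the next row's steps run.
    casper.back();
    casper.waitForSelector('#sortableTable', function () {});
}

function scheduleRows(casper, rowCount, employees) {
    for (var row = 1; row <= rowCount; row++) {
        // Each call gets its own `row`, so the queued callbacks do not
        // all see the final value of the loop counter.
        scheduleRow(casper, row, employees);
    }
}
```

The idea would be to call scheduleRows(casper, result_number, employees) after casper.start(start_url) and before casper.run(), and to echo the collected employees in a final step rather than inside the loop.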

Secondly, and most importantly, the tr:nth-child(' + i + ') selector you'd like to interact with won't work this way. I don't know why, but it doesn't seem to be that straightforward. I tried to do the same thing. My solution was to first convert i to a string instead of a number, like so:

pageturn = pageturn + 1;
// Collect <td> contents on each page.
var pageturnString = pageturn.toString();
var linknum = 'a.SomeLinkClass:nth-child('+pageturnString+')';

In my case I'm using this to click through to the next page. Either way, you must encapsulate your interaction with that CSS selector in a this.then() call inside the first method, and then have a second this.then() call do the rest of the for loop.

Example:

casper.each(pagecount, function() {
    this.then(function() {
        pageturn = pageturn + 1;
        // Collect <td> contents on each page.
        var pageturnString = pageturn.toString();
        var linknum = 'a.SomeLinkClass:nth-child('+pageturnString+')';
    });

    this.then(function() {
        //Now run for loop here. 
    });
 });

If you don't build the CSS selector inside a this.then() call before it is used in the next step, it won't work. I don't know why, but that's how it is. In my code, pagecount could possibly be used in place of your for loop, but I'll leave that up to you.
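As a side note on the string conversion: in plain JavaScript the + operator already coerces a number when it is joined to a string, so both forms should yield the same selector text. The small check below illustrates that; rowLinkSelector is a hypothetical helper built from the selector in the question, not something CasperJS provides.

```javascript
// rowLinkSelector is a hypothetical helper, not part of CasperJS.
// It builds the selector for the detail link in a given table row.
function rowLinkSelector(row) {
    return '#sortableTable tr:nth-child(' + row + ') td:nth-child(4) span a';
}

var implicit = rowLinkSelector(3);              // number, coerced by +
var explicit = rowLinkSelector((3).toString()); // string, converted by hand

console.log(implicit);
console.log(implicit === explicit); // true
```

If that holds, the explicit toString() call is harmless but probably not what makes the difference; moving the selector use into its own this.then() step looks like the more likely reason the approach above works.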
