CasperJS: Iterating through URLs
Question
I'm pretty new to CasperJS, but isn't there a way to open a URL and execute CasperJS commands inside for loops? For example, this code doesn't work as I expected it to:
casper.then(function() {
    var counter = 2013;
    for (i = counter; i < 2014; i++) {
        var file_name = "./Draws/wimbledon_draw_" + counter + ".json";
        // getting some local json files
        var json = require(file_name);
        var first_round = json["1"];
        for (var key in first_round) {
            var name = first_round[key].player_1.replace(/\s+/g, '-');
            var normal_url = "http://www.atpworldtour.com/Tennis/Players/" + name;
            // the casper command below only executes AFTER the for loop is done
            casper.thenOpen(normal_url, function() {
                this.echo(normal_url);
            });
        }
    }
});
Instead of Casper calling thenOpen on each new URL per iteration, it only gets called AFTER the for loop has finished. thenOpen then gets called with the last value that normal_url was set to. Is there no Casper command that works on each iteration within the for loop?
Follow-up: how can I make thenOpen return a value to the current iteration of the for loop?
Say, for example, I needed a return value from that thenOpen (maybe if the HTTP status is 404 I need to evaluate another URL, so I want to return false). Is this possible?
Edit: here is the casper.thenOpen call from above:
var status;
// thenOpen() only executes after the console.log statement directly below
casper.thenOpen(normal_url, function() {
    status = this.status(false)['currentHTTPStatus'];
    if (status == 200) {
        return true;
    } else {
        return false;
    }
});
console.log(status); // This prints UNDEFINED the same number of times as iterations.
Recommended answer
As Fanch and Darren Cook stated, you could use an IIFE to fix the URL value inside of the thenOpen step.
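A minimal sketch of the IIFE approach follows. To keep it runnable outside of CasperJS, scheduleStep and runSteps are hypothetical stand-ins for casper.thenOpen and casper.run(): they only queue the callbacks and execute them after the loop. The player names are made-up sample data.

```javascript
// Hypothetical stand-ins for casper.thenOpen()/casper.run(): callbacks are
// queued during the loop and only executed afterwards.
var queue = [];
function scheduleStep(url, callback) {
    queue.push(callback);
}
function runSteps() {
    return queue.map(function (callback) {
        return callback();
    });
}

var names = ["Roger-Federer", "Andy-Murray"]; // made-up sample data
for (var k = 0; k < names.length; k++) {
    var normal_url = "http://www.atpworldtour.com/Tennis/Players/" + names[k];
    // The IIFE creates a new scope per iteration, so `url` keeps the value
    // that normal_url had when the step was scheduled.
    (function (url) {
        scheduleStep(url, function () {
            return url;
        });
    })(normal_url);
}

var echoed = runSteps();
console.log(echoed.join("\n"));
```

In the question's code, the same wrapper would go around the casper.thenOpen(normal_url, ...) call, with this.echo(url) inside the callback.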
An alternative would be to use getCurrentUrl to check the URL. So change the line
this.echo(normal_url);
to
this.echo(this.getCurrentUrl());
The problem is that normal_url inside the callback refers to the last value that was set rather than the value from the current iteration, because the callback is executed later. This does not affect the first argument of casper.thenOpen(normal_url, function(){...});, because the current value is passed to the function at the time of the call. You just see the wrong URL echoed, but the correct URL is actually opened.
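The difference between the argument and the callback can be reproduced without CasperJS. In the sketch below, fakeThenOpen is a hypothetical stand-in that records its URL argument immediately and queues the callback for later; the example.com URLs are placeholders.

```javascript
// Hypothetical stand-in for casper.thenOpen(): the URL argument is recorded
// right away, the callback is only stored and run after the loop.
var opened = [];
var callbacks = [];
function fakeThenOpen(url, callback) {
    opened.push(url);
    callbacks.push(callback);
}

var players = ["A", "B", "C"]; // placeholder data
for (var n = 0; n < players.length; n++) {
    var normal_url = "http://example.com/" + players[n];
    fakeThenOpen(normal_url, function () {
        // normal_url is looked up when the callback runs, not when it is created
        return normal_url;
    });
}

var seen = callbacks.map(function (cb) { return cb(); });
console.log(opened); // one distinct URL per iteration
console.log(seen);   // the last URL, repeated for every callback
```

Because var is function-scoped, all three callbacks share a single normal_url variable, which is why they all report the value from the final iteration.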
Regarding your updated question:
All then* and wait* functions in the CasperJS API are step functions. The function that you pass into them is scheduled and executed later (triggered by casper.run()). You shouldn't read variables that are set inside a step from outside of the steps. Just add further steps inside of the thenOpen call. They will be scheduled in the correct order. Also, you cannot return anything from thenOpen.
var somethingDone = false;
var status;
casper.thenOpen(normal_url, function() {
    status = this.status(false)['currentHTTPStatus'];
    if (status != 200) {
        this.thenOpen(alternativeURL, function(){
            // do something
            somethingDone = true;
        });
    }
});
casper.then(function(){
    console.log("status: " + status);
    if (somethingDone) {
        // something has been done
        somethingDone = false;
    }
});
In this example, this.thenOpen will be scheduled directly after casper.thenOpen, and (when the status was not 200) somethingDone will be true inside casper.then, because that step comes after it.
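The ordering can be illustrated with a toy step runner. This is a simplified, hypothetical model of the scheduling described here, not CasperJS code: StepRunner, its then method, and the hard-coded 404 status are all assumptions made for illustration.

```javascript
// Toy step runner (hypothetical, not CasperJS code): steps added while a
// step is running are placed directly after the running step.
function StepRunner() {
    this.steps = [];
    this.current = -1;  // index of the running step, -1 before run()
    this.inserted = 0;  // number of steps added by the running step so far
}
StepRunner.prototype.then = function (fn) {
    if (this.current < 0) {
        this.steps.push(fn);
    } else {
        this.inserted++;
        this.steps.splice(this.current + this.inserted, 0, fn);
    }
    return this;
};
StepRunner.prototype.run = function () {
    for (this.current = 0; this.current < this.steps.length; this.current++) {
        this.inserted = 0;
        this.steps[this.current].call(this);
    }
};

var log = [];
var somethingDone = false;
var status = 404; // assumed status, for illustration only
var runner = new StepRunner();
runner.then(function () {
    log.push("open normal_url");
    if (status != 200) {
        // nested step: runs before the step registered below
        this.then(function () {
            log.push("open alternativeURL");
            somethingDone = true;
        });
    }
});
runner.then(function () {
    log.push("final step, somethingDone=" + somethingDone);
});
runner.run();
console.log(log.join("\n"));
```

The nested step is registered last but runs second, so the final step already sees somethingDone set to true.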
There are some things that you need to fix:
- You don't use your counter i: you probably mean "./Draws/wimbledon_draw_" + i + ".json", not "./Draws/wimbledon_draw_" + counter + ".json".
- Interestingly, you can require a JSON file. I would still use fs.read to read the file and parse the JSON inside it (JSON.parse).
Regarding your question:
You didn't schedule any commands. Just add steps (then* or wait*) behind or inside of thenOpen.