Get data from webpage using Google Apps Script and Yahoo Query Language


Problem description


Using Google Apps Script, I've written the following function to extract a piece of information to a spreadsheet:

function myFunction(symbol, elemento) {

  var url   = "http://www.example.com/query?symbol=" + symbol;

  switch (elemento) {
    case 'one':
      var xpath='//*[@id="sectionTitle"]/h1';
      break;
    case 'two':
      var xpath='//*[@id="headerQ"]/div[1]/div/span[2]'
      break;
  }
  var query = "select * from html where url = '" + url + "' and xpath = '" + xpath + "'";

  var yql   = "https://query.yahooapis.com/v1/public/yql?format=json&q=" + encodeURIComponent(query);

  var response = UrlFetchApp.fetch(yql);
  var json = JSON.parse(response.getContentText());

  switch(elemento){
    case 'one':
      return json.query.results.h1;
      break;
    case 'two':
      return ponto(json.query.results.span.content);
      break;
  }
}

Now, this works OK when typing the function into a cell, but "sometimes" I get the error #ERROR! in a cell with the note:

TypeError: Can't read "h1"  property of null. (line 54).

Deleting that cell and typing the function again usually works.

Why is this function volatile (i.e., it does work, but only sometimes)?

Solution

You will need to do some error checking with your fetch. Any request over the internet may fail.

If no results are found the results object value will be null. I put in a quick backoff, but you may need to play with the numbers to suit your needs.

var response = UrlFetchApp.fetch(yql);
var json = JSON.parse(response.getContentText());
var backoff = 1;
while((json.query.results == null || response.getResponseCode() != 200)){
  Utilities.sleep((Math.pow(2,backoff)*1000) + (Math.round(Math.random() * 1000)));
  response = UrlFetchApp.fetch(yql);
  json = JSON.parse(response.getContentText());
  backoff++;
}
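
The loop above keeps retrying until results arrive, so a page that never yields a match would keep the function waiting until Apps Script stops it. Below is a minimal sketch of a bounded variant. The helper name fetchResultsWithRetry, the cap of four attempts, and the optional fetch/sleep overrides are assumptions for illustration and are not part of the original answer. It also passes muteHttpExceptions, since UrlFetchApp.fetch otherwise throws on a failing response code before getResponseCode() can be checked.

```javascript
// Hypothetical helper (not from the original answer): retry the YQL request a
// bounded number of times and return null if no results ever arrive.
function fetchResultsWithRetry(yqlUrl, options) {
  var opts = options || {};
  var maxAttempts = opts.maxAttempts || 4; // assumed cap; tune as needed

  // Defaults use the Apps Script services. They can be overridden so the
  // retry logic can be exercised outside Apps Script.
  var fetch = opts.fetch || function (url) {
    // muteHttpExceptions makes fetch return the response instead of throwing
    // when the status code indicates failure.
    return UrlFetchApp.fetch(url, { muteHttpExceptions: true });
  };
  var sleep = opts.sleep || function (ms) {
    Utilities.sleep(ms);
  };

  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    var response = fetch(yqlUrl);
    if (response.getResponseCode() === 200) {
      var json = JSON.parse(response.getContentText());
      if (json.query && json.query.results != null) {
        return json.query.results;
      }
    }
    if (attempt < maxAttempts) {
      // Exponential backoff with up to one second of random jitter.
      sleep(Math.pow(2, attempt) * 1000 + Math.round(Math.random() * 1000));
    }
  }
  return null; // every attempt failed or came back empty
}
```

Inside myFunction, the two fetch/parse lines could then become a single call to fetchResultsWithRetry(yql), followed by a check such as returning a placeholder string when the result is null, before reading results.h1 or results.span.content. With four attempts the waits add up to roughly 14 to 17 seconds, which is meant to stay under the execution time limit that Google Sheets applies to custom functions; fewer attempts may be safer if the target page is slow.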
