Getting an error trying to pull out text using Google Sheets and importxml()

Views: 68 Posted: 2021/5/12 21:02:40 Tags: xpath web-scraping google-sheets google-sheets-formula google-sheets-importxml

I have a column of links in Google Sheets. I want to use importxml to tell whether each page is producing an error message.

As an example, this works fine:

=importxml("https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_T", "//td/b")

That is, it finds each td element and pulls out the b elements inside it (which hold the postal codes in Canada).

But this code that looks for the error message does not work:

=importxml("https://www.awwwards.com/error1/", "//div/h1")

I want it to pull out the text "THE PAGE YOU WERE LOOKING FOR DOESN'T EXIST."

Instead, I'm getting a "Resource at URL not found" error. What could I be doing wrong? Thanks.

Solution

I did some quick trial and error with the default formulae:

=IMPORTXML("https://www.awwwards.com/error1/", "//*")

=IMPORTHTML("https://www.awwwards.com/error1/", "table", 1)

=IMPORTHTML("https://www.awwwards.com/error1/", "list", 1)

=IMPORTDATA("https://www.awwwards.com/error1/")

It seems that this website cannot be scraped in Google Sheets with any of the regular import formulae.
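
One possible workaround for the original goal, offered as an untested sketch rather than a confirmed fix: since the import itself fails on this page, the failure can serve as the signal. The formula below assumes the link sits in cell A2 (a hypothetical cell reference, not from the question). It should return the first matching heading if the page loads, or the fallback text if the import errors out.

=IFERROR(INDEX(IMPORTXML(A2, "//div/h1"), 1), "fetch failed or no match")

Note that the fallback also appears when the page loads but nothing matches //div/h1, so this cannot tell a broken link apart from a page with a different layout.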
