Scrape Instagram Web Hashtag Posts

Views: 179 Published: 2020/11/17 4:18:47 Tags: xpath google-apps-script web-scraping google-sheets instagram

Question

I'm trying to scrape the number of posts for a given hashtag (#castles) and populate a Google Sheets cell using ImportXML.

I tried copying the XPath from Chrome and pasting it into the ImportXML parameter in the cell, like this:

=ImportXML("https://www.instagram.com/explore/tags/castels/", "//*[@id="react-root"]/section/main/header/div[2]/div/div[2]/span/span")

I saw that the nested quotation marks were a problem, so I also tried:

=ImportXML("https://www.instagram.com/explore/tags/castels/", "//*[@id='react-root']/section/main/header/div[2]/div/div[2]/span/span")

Nevertheless, both return an error.

What am I doing wrong?

P.S. I am aware of the XPath for the meta description tag, "//meta[@name='description']/@content", but I would like to scrape the exact number of posts, not an abbreviated number.

Recommended answer

Try this:

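function hashCount() {
  // Full URL, including the scheme and host.
  var url = 'https://www.instagram.com/explore/tags/cats/';
  // muteHttpExceptions: return the response instead of throwing on HTTP errors.
  var response = UrlFetchApp.fetch(url, {muteHttpExceptions: true}).getContentText();
  // Capture group 2 holds the digits of the post count.
  var regex = /(edge_hashtag_to_media":{"count":)(\d+)(,"page_info":)/;
  var match = regex.exec(response);
  // exec() returns null if the page source does not contain the pattern.
  Logger.log(match ? match[2] : 'count not found');
}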
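
The regex step above can be tried in isolation. The following is only a sketch: extractHashtagCount is a hypothetical helper name, the sample string is made up for illustration, and it assumes the page source still embeds JSON of the form "edge_hashtag_to_media":{"count":N, which Instagram may have changed since this answer was written.

```javascript
// Hypothetical helper: pulls the post count out of a hashtag page's source.
// Assumes the source contains "edge_hashtag_to_media":{"count":<digits>.
function extractHashtagCount(html) {
  var match = /"edge_hashtag_to_media":\{"count":(\d+)/.exec(html);
  // exec() returns null when the pattern is absent.
  return match ? parseInt(match[1], 10) : null;
}

// Made-up sample of the embedded JSON, for illustration only.
var sample = '{"name":"cats","edge_hashtag_to_media":{"count":123456,"page_info":{}}}';
console.log(extractHashtagCount(sample));          // 123456
console.log(extractHashtagCount('<html></html>')); // null
```

Returning null instead of indexing straight into the match avoids a TypeError when the page does not contain the expected JSON.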

I've added muteHttpExceptions: true, which was not in my comment above. Hope this helps.
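
Because muteHttpExceptions: true makes UrlFetchApp.fetch return the response instead of throwing on an HTTP error, the status code has to be checked by hand. Here is a rough sketch of how that could look as a Sheets custom function that fills a cell, which is what the question was after. The name hashtagPostCount and the injectable fetcher parameter are assumptions made for illustration (the parameter only exists so the logic can be run outside Apps Script), and Instagram may serve a login page or a different page format, in which case no count will be found.

```javascript
// Hypothetical custom function. `fetcher` is optional and defaults to
// UrlFetchApp when running inside Apps Script.
function hashtagPostCount(tag, fetcher) {
  fetcher = fetcher || UrlFetchApp;
  var url = 'https://www.instagram.com/explore/tags/' + encodeURIComponent(tag) + '/';
  var response = fetcher.fetch(url, {muteHttpExceptions: true});
  // With muteHttpExceptions the call does not throw on 4xx/5xx,
  // so report the status code instead of parsing an error page.
  if (response.getResponseCode() !== 200) {
    return 'HTTP ' + response.getResponseCode();
  }
  // Assumed page format, same pattern as in the answer above.
  var match = /"edge_hashtag_to_media":\{"count":(\d+)/.exec(response.getContentText());
  return match ? Number(match[1]) : 'count not found';
}
```

In a sheet this would be called as =hashtagPostCount("castles"). The cell then shows the count, the HTTP status, or "count not found".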
