Scraping URLs from a web page with Node.js

Views: 52 Published: 2020/9/25 0:16:44 Tags: javascript, arrays, node.js, cheerio

Question

I'm trying to scrape all URLs from a website and put them into an array, but I have a problem with the array index. If I add an index such as 2, as in array[2], the command line prints "undefined". If I remove the index and print the whole array, it prints all the URLs, one per line. I want each URL to have its own index, like:

array[0] = first URL found

array[1] = second URL found

array[2] = third URL found, etc.

Can anyone point me in the right direction? Thank you.

var request = require('request');
var cheerio = require('cheerio');

var url = 'http://www.hobo-web.co.uk/';

request(url, function(err, resp, body){
  $ = cheerio.load(body);
  links = $('a'); //use your CSS selector here
  $(links).each(function(i, link){
    var array = $(link).attr('href');
    console.log(array[2]);
  });
});

Recommended answer

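You need to create the array first, as a variable that is accessible inside the .each loop, and then push each href value onto it. In the question's code, array is reassigned to a single href string on every iteration, so array[2] indexes into that string rather than into a list of URLs.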

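var request = require('request');
var cheerio = require('cheerio');

var url = 'http://www.hobo-web.co.uk/';

// Declare the array once, outside the loop, so every iteration appends to the same list.
var array = [];

request(url, function(err, resp, body){
  if (err) { return console.error(err); }
  var $ = cheerio.load(body);
  var links = $('a'); // use your CSS selector here
  links.each(function(i, link){
    var href = $(link).attr('href');
    array.push(href);
  });
  // The array is only filled once the response has arrived, so read it here.
  console.log(array[2]); // third URL found
});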
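
To see why indexing behaved differently in the two versions, here is a minimal sketch that needs neither request nor cheerio. The hrefs list is a hypothetical stand-in for the values that $(link).attr('href') might return; it is not taken from the site in the question.

```javascript
// Hypothetical href values, standing in for what $(link).attr('href') returns.
var hrefs = ['http://example.com/', '/a', '/contact'];

// Pattern from the question: each href is a string, so [2] picks its third
// character, or gives undefined when the string has fewer than three characters.
var perLink = hrefs.map(function (href) {
  return href[2];
});

// Pattern from the answer: one array declared up front, one push per link.
var collected = [];
hrefs.forEach(function (href) {
  collected.push(href);
});

console.log(perLink);      // one character (or undefined) per link
console.log(collected[2]); // the third URL as a whole string
```

With the array declared once, collected[0], collected[1] and collected[2] each hold a complete URL, which is the layout the question asks for.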
