Weird characters when using console.print cheerio + nodejs


Question

I'm new to Node.js and am writing my first script to scrape some data.

Does anyone know why I'm seeing weird characters with question marks inside them when I run this code?

var express = require('express');
var fs = require('fs');
var request = require('request');
var cheerio = require('cheerio');
var app = express();

var url = 'http://www.ebay.co.uk/csc/all-you-ever-want/m.html?LH_Complete=1&_ipg=50&_since=15&_sop=13&LH_FS=1&=&rt=nc&LH_ItemCondition=3';

request(url, function (error, response, html) {
  if (!error) {

    console.log(html);
    var $ = cheerio.load(html);

    $('.vip').each(function (i, element) {
      var link = $(this).text();
      console.log(link);
    });

  }
});

app.listen(process.env.PORT, process.env.IP)
console.log(process.env.PORT);
exports = module.exports = app;

This is the output I'm seeing:

Thanks!

Anthony

Recommended answer

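Hey, this is because of the character encoding of the page you're requesting. By default, request converts the response body to a string as UTF-8, so bytes from a page served in a different encoding come out as replacement characters (the question marks you're seeing). To deal with the encoding, you can ask request for the raw bytes with encoding: null and decode them yourself with the iconv-lite module (https://github.com/ashtuchkin/iconv-lite), like this: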

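var request = require('request');
var iconv = require('iconv-lite');

// Replace this with the encoding the page actually uses,
// or detect it from the charset in the Content-Type header.
var encoding = 'iso-8859-1';

// encoding: null makes request hand back the body as a raw Buffer
// instead of a string that has already been decoded as UTF-8.
request.get({url: .., headers: ..., encoding: null}, function (err, res, body) {
  if (err) { return console.error(err); }

  // Decode the raw bytes into a string using the chosen encoding.
  var body1 = iconv.decode(body, encoding);
  // Pass body1 (not body) to cheerio.load().
});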
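
The comment above mentions detecting the encoding from the charset header. A minimal sketch of that step might look like the following. The helper name charsetFromContentType is hypothetical (it is not part of request or iconv-lite), and the sketch assumes the server sends the charset in its Content-Type header; pages that only declare it in a meta tag would need separate handling.

```javascript
// Hypothetical helper, not part of request or iconv-lite: extract the charset
// parameter from a Content-Type header value, or return a fallback.
function charsetFromContentType(contentType, fallback) {
  // Matches charset=utf-8, charset="utf-8", with optional spaces around "=".
  var match = /charset\s*=\s*"?([^;"\s]+)/i.exec(contentType || '');
  return match ? match[1].toLowerCase() : fallback;
}

// Example header values (assumed for illustration):
var detected = charsetFromContentType('text/html; charset=ISO-8859-1', 'utf-8');
var missing = charsetFromContentType('text/html', 'utf-8');

console.log(detected);
console.log(missing);
```

Inside the request callback, the header value would come from res.headers['content-type'], and the result could be passed to iconv.decode in place of the hard-coded encoding.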

Have fun, this should work.
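
To see why a wrong encoding guess produces those characters, here is a small sketch using only Node's built-in Buffer. The sample bytes are an assumption chosen for illustration (a pound sign followed by a digit, encoded as ISO-8859-1); they are not taken from the page in the question.

```javascript
// 0xA3 is the pound sign in ISO-8859-1, but on its own it is not valid UTF-8.
var raw = Buffer.from([0xa3, 0x35]); // "£5" encoded as ISO-8859-1

// Wrong guess: the invalid byte becomes U+FFFD, the replacement character
// that many terminals draw as a question mark inside a diamond.
var asUtf8 = raw.toString('utf8');

// Right guess: every byte maps directly to the matching character.
var asLatin1 = raw.toString('latin1');

console.log(JSON.stringify(asUtf8));   // "\ufffd5" shown as the replacement character plus 5
console.log(JSON.stringify(asLatin1)); // "£5"
```

Buffer's built-in 'latin1' only covers ISO-8859-1; for other encodings a library such as iconv-lite is still needed.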
