Why does my Scrapy code return an empty array?


Question

I am building a web scraper for wunderground.com, but my code returns "[]" for inches_rain and humidity. Can anyone see why this is happening?

# -*- coding: utf-8 -*-
import scrapy
from scrapy.selector import Selector
import time

from wunderground_scraper.items import WundergroundScraperItem


class WundergroundComSpider(scrapy.Spider):
    name = "wunderground"
    allowed_domains = ["www.wunderground.com"]
    start_urls = (
        'http://www.wunderground.com/q/zmw:10001.5.99999',
    )

    def parse(self, response):
        info_set = Selector(response).xpath('//div[@id="current"]')
        list = []
        for i in info_set:
            item = WundergroundScraperItem()
            item['description'] = i.xpath('div/div/div/div/span/text()').extract()
            item['description'] = item['description'][0]
            item['humidity'] = i.xpath('div/table/tbody/tr/td/span/span/text()').extract()
            item['inches_rain'] = i.xpath('div/table/tbody/tr/td/span/span/text()').extract()
            list.append(item)
        return list

I know that the humidity and inches_rain items are set to the same XPath. That is intentional: once the matches are in an array, I assign each item the value it needs from that array.

Recommended answer

Let me suggest a more reliable and readable XPath. As an example, this one locates the "Humidity" value by anchoring on the "Humidity" column label:

"".join(i.xpath('.//td[dfn="Humidity"]/following-sibling::td//text()').extract()).strip()

This now outputs 45%.

FYI, your XPath had at least one problem: the tbody tag. Remove it from the XPath expression. Browsers add tbody to the DOM they display, but it is often missing from the raw HTML that Scrapy downloads.
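The effect of a tbody step can be reproduced with a small, dependency-free sketch. It uses the standard library's ElementTree as a stand-in for Scrapy's selector, and the RAW fragment is an assumed, simplified version of the downloaded source. The point is only that a path step naming an element absent from the markup matches nothing.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment standing in for the downloaded page source.
# Note that the table has no <tbody> element.
RAW = (
    '<div id="current"><div><table>'
    '<tr><td><span><span>45</span>%</span></td></tr>'
    '</table></div></div>'
)

root = ET.fromstring(RAW)

# Path with the tbody step: nothing matches, so the result is empty.
with_tbody = [e.text for e in root.findall('div/table/tbody/tr/td/span/span')]

# Same path without tbody: the inner span is found.
without_tbody = [e.text for e in root.findall('div/table/tr/td/span/span')]

print(with_tbody)
print(without_tbody)
```

When an XPath copied from browser developer tools returns nothing, comparing it against the raw response body (for example in scrapy shell) is a quick way to check for steps like this.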
