Empty .json file


Problem description


I have written this short spider code to extract titles from the Hacker News front page (http://news.ycombinator.com/).

import scrapy

class HackerItem(scrapy.Item):  # declaring the item
    hackertitle = scrapy.Field()


class HackerSpider(scrapy.Spider):
    name = 'hackernewscrawler'
    allowed_domains = ['news.ycombinator.com']  # website we chose
    start_urls = ['http://news.ycombinator.com/']

    def parse(self, response):
        sel = scrapy.Selector(response)  # selector to help us extract the titles
        item = HackerItem()  # the item declared above

        # xpath of the titles
        item['hackertitle'] = sel.xpath("//tr[@class='athing']/td[3]/a[@href]/text()").extract()

        # printing titles using a print statement
        print(item['hackertitle'])

However, when I run the code with scrapy scrawl hackernewscrawler -o hntitles.json -t json, I get an empty .json file that does not have any content in it.

Answer

You should change the print statement to yield:

import scrapy

class HackerItem(scrapy.Item):  # declaring the item
    hackertitle = scrapy.Field()


class HackerSpider(scrapy.Spider):
    name = 'hackernewscrawler'
    allowed_domains = ['news.ycombinator.com']  # website we chose
    start_urls = ['http://news.ycombinator.com/']

    def parse(self, response):
        sel = scrapy.Selector(response)  # selector to help us extract the titles
        item = HackerItem()  # the item declared above

        # xpath of the titles
        item['hackertitle'] = sel.xpath("//tr[@class='athing']/td[3]/a[@href]/text()").extract()

        # yield the item so the feed exporter can collect it
        yield item

Then run:

scrapy crawl hackernewscrawler -o hntitles.json -t json
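The reason this works: Scrapy treats parse as a generator and its feed exporter collects whatever the method yields, whereas print only writes to stdout, so the -o file stays empty. A minimal plain-Python sketch of the difference (hypothetical title strings, no Scrapy required):

```python
import json

# Hypothetical titles standing in for what the XPath would extract.
TITLES = ["Title one", "Title two"]

def parse_with_print():
    # Printing sends text to stdout; the caller gets nothing back.
    for t in TITLES:
        print({"hackertitle": t})

def parse_with_yield():
    # yield turns the function into a generator; a consumer
    # (like Scrapy's feed exporter) can collect each item.
    for t in TITLES:
        yield {"hackertitle": t}

# Nothing to export from the print version...
parse_with_print()

# ...but the yield version produces items a consumer can serialize,
# which is essentially what -o hntitles.json does with yielded items.
collected = list(parse_with_yield())
print(json.dumps(collected))
```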
