Python: Scrapy CSV exports incorrectly?

Views: 184 Published: 2017/10/5 15:50:36 Tags: python csv export scrapy

Problem description

I am simply trying to write to a CSV file. However, I have two separate for-statements, so the data from each for-statement is exported independently and the row order breaks. Suggestions?

def parse(self, response):
    hxs = HtmlXPathSelector(response)
    titles = hxs.select('//td[@class="title"]')
    subtext = hxs.select('//td[@class="subtext"]')
    items = []
    for title in titles:
        item = HackernewsItem()
        item["title"] = title.select("a/text()").extract()
        item["url"] = title.select("a/@href").extract()
        items.append(item)
    for score in subtext:
        item = HackernewsItem()
        item["score"] = score.select("span/text()").extract()
        items.append(item)
    return items

As the image below shows, the output of the second for-statement is written below the other rows instead of alongside them, the way the header columns are.

CSV image attached:

GitHub link to the full file: https://github.com/nchlswtsn/scrapy/blob/master/items.csv

Solution

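The order in the CSV file follows directly from the order in which you export the items: first you export all the titles, then all the subtext elements. Each item becomes its own row, so an item that holds only a score ends up on a separate row from its title.
I guess you are trying to scrape HN articles; here is my suggestion: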

def parse(self, response):
    hxs = HtmlXPathSelector(response)
    titles = hxs.select('//td[@class="title"]')
    items = []
    for title in titles:
        # Build one item per title so that title, url and score share a CSV row.
        item = HackernewsItem()
        item["title"] = title.select("a/text()").extract()
        item["url"] = title.select("a/@href").extract()
        # Look up the score relative to the title cell instead of in a second loop.
        item["score"] = title.select('../td[@class="subtext"]/span/text()').extract()
        items.append(item)
    return items

I didn't test it, but it will give you an idea.
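To make the difference concrete, here is a minimal sketch that uses only the standard library and no Scrapy at all. The sample data, the field list, and the helper names (`items_from_two_loops`, `items_from_one_loop`, `to_csv`) are assumptions made up for illustration. It pairs titles with scores using `zip` rather than the relative XPath lookup suggested above, so it shows the row layout only, not how to locate the score on the page.

```python
import csv
import io

FIELDS = ["title", "url", "score"]

# Hypothetical sample data standing in for what the XPath queries would extract.
TITLES = [
    ("First story", "http://example.com/1"),
    ("Second story", "http://example.com/2"),
]
SCORES = ["10 points", "25 points"]


def items_from_two_loops(titles, scores):
    """Mirror the question: one item per title, then one item per score."""
    items = []
    for title, url in titles:
        items.append({"title": title, "url": url})
    for score in scores:
        items.append({"score": score})
    return items


def items_from_one_loop(titles, scores):
    """Build one complete item per story, so each CSV row has all fields."""
    items = []
    for (title, url), score in zip(titles, scores):
        items.append({"title": title, "url": url, "score": score})
    return items


def to_csv(items):
    """Write items as CSV text; fields missing from an item become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(items_from_two_loops(TITLES, SCORES)))
    print(to_csv(items_from_one_loop(TITLES, SCORES)))
```

With two loops, the score rows come after all the title rows and have empty title and url cells. With one loop, every row carries all three fields.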
