Scrape data from a table with scrapy

Views: 50 Posted: 2021/7/16 22:02:48 Tags: python web-scraping scrapy

Question

I want to scrape data from a table with scrapy. The table HTML looks like this:

<table class="tablehd">

<tr class="colhead">
<td width="170">MON, NOV 11</td>
<td width="80">Item</td>
<td width="60" align="center"></td>
<td width="210">Item</td>
<td width="220">Item</td>
</tr>

<tr class="oddrow">
<td> Item </td>
<td> Item </td>
<td align="center"> Item </td>
<td></td>
<td> Item </td>
</tr>

<tr class="evenrow">
<td> Item </td>
<td> Item </td>
<td align="center"> Item </td>
<td></td>
<td> Item </td>
</tr>


</table>

The whole list of cell texts can be retrieved with:

items = hxs.select('//table[@class="tablehd"]//td//text()').extract()

How would you split the result into one item per row and then assign the cell data to td1 through td5?

Recommended answer

I'm not sure exactly what you want to see in your items, but here's an example that I hope is what you're after:

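# Legacy Scrapy (0.x) imports, matching the hxs.select API used in the question.
from scrapy.item import Item, Field
from scrapy.selector import HtmlXPathSelector
from scrapy.spider import BaseSpider


class MyItem(Item):
    value = Field()


class MySpider(BaseSpider):
    ...

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        # The <td> cells sit inside <tr> rows, so reach them with // rather
        # than /, which would only match direct children of the table.
        cells = hxs.select('//table[@class="tablehd"]//td')

        # Yield one item per cell, holding that cell's text nodes.
        for cell in cells:
            my_item = MyItem()
            my_item['value'] = cell.select('.//text()').extract()
            yield my_item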

Hope that helps.