How to scrape text included between various tags using scrapy

Views: 54 | Published: 2021/7/16 21:51:45 | Tags: python, scrapy

Problem description

I am trying to scrape the product description from this link. How do I scrape the whole text, including the text inside nested tags? Here is the hxs call I am using: hxs.select('//div[@class="overview"]/div/text()').extract(), and here is the original HTML:

These classic sneakers from
<b>Puma</b>
are best known for their neat and simple design. These basketball shoes are crafted by novel tooling that brings the sleek retro sneaker look. The pair is equipped with a
<b>leather and synthetic upper.</b>
A vulcanized non-slip rubber sole that is
<b>abrasion resistant ensures good traction.</b>

If I use the hxs call mentioned above, I get this:

hxs.select('//div[@class="overview"]/div/text()').extract()
Output: 
[u'These classic sneakers from ',
 u' are best known for their neat and simple design. These basketball shoes are crafted by novel tooling that brings the sleek retro sneaker look. The pair is equipped with a ',
 u' A vulcanized non-slip rubber sole that is ',
 u' sportswear, jeans and tees.',
 u' Gently brush away dust or dirt using a soft cleaning brush.',
 u'\r\nUse a leather conditioner/wax and a brush for added shine.',
 u'Avoid contact with liquids.\xa0']

What I want is this:

These classic sneakers from Puma are best known for their neat and simple design. These
 basketball shoes are crafted by novel tooling that brings the sleek retro sneaker look. The pair is equipped with a leather and synthetic upper.A vulcanized non-slip rubber sole 
that is abrasion resistant ensures good traction.

As you can see, the text between the tags is missing. How do I extract the whole text from the page?
