Scrapy does not find text in XPath or CSS

Question

I've been at this one for a few days, and no matter what I try, I cannot get Scrapy to extract the text that sits inside one element.

To spare you all the code, here are the important pieces. The setup does grab everything else off the page, just not this text.

from scrapy.selector import Selector
start_url = "https://www.tripadvisor.com/VacationRentalReview-g34416-d12428323-On_the_Beach_Wide_flat_beach_Sunsets_Gulf_view_Sharks_teeth_Shells_Fish-Manasota_Key_F.html"

#BASIC ITEM AND SPIDER YADA, SPARE YOU THE DETAILS

hxs = Selector(response)
response_css = response.css("body")

desc_data = hxs.xpath('//*[@id="DETAILS_TRUNC_TEXT"]//text()').extract()
desc_data2 = response_css.css('#DETAILS_TRUNC_TEXT::text').extract()

Both return empty lists. Yes, I found the XPath and CSS selectors via Chrome, but the rest of them work just fine, as I'm able to find other data on the site. Please help me find out why this isn't working.

Recommended answer

I tried your XPath and CSS selectors in the Scrapy shell and also got nothing.

Then I used the view(response) command, which opens the downloaded response in a browser, and found out that the site is dynamic.

[Screenshot: the page as opened by view(response)]

You can see that the details under Overview do not show up in the page Scrapy downloaded, which is why you still get nothing no matter what you try.

Solutions: Try Selenium (check the solution that SIM provided in the last answer) or Splash, both of which render the JavaScript before you extract anything.

Good luck. :)
