Scraping web content using xpath won't work

Views: 133 Published: 2020/5/4 8:38:06 Tags: python, xpath, web-scraping, amazon, lxml

Question

I'm using XPath to scrape a particular Amazon webpage, but it doesn't work. Can anyone give me some advice? Here's the link to that page: a link

I want to scrape this text: "Fun, credit card-sized prints". The code I'm using is here:

from lxml import html
import requests

url = 'http://www.amazon.co.uk/dp/B009CX5VN2'
page = requests.get(url)
tree = html.fromstring(page.text)
feature_bullets = tree.xpath('//*[@id="feature-bullets"]/ul/li[1]/span/text()')

But feature_bullets is always empty. I really need some help.

Recommended answer

The HTML that I download doesn't match your expectations. Here is the expression that works for me:

tree.xpath('//div[@id="technicalProductFeaturesATF"]/ul/li[1]/text()')

Full program:

from lxml import html
import requests
from pprint import pprint

url = 'http://www.amazon.co.uk/dp/B009CX5VN2'
page = requests.get(url)
tree = html.fromstring(page.text)
feature_bullets = tree.xpath('//div[@id="technicalProductFeaturesATF"]/ul/li/text()')

pprint(feature_bullets)

Result:

$ python foo.py 
['Fun, credit card-sized prints',
 'LCD film counter and shooting mode display',
 'Camera mounted mirror for self portraits',
 'Powered by CR2 Batteries, Built-in, Automatic electronic flash',
 'Fujifilm Instax Mini 25 + 30 Instax Mini Film']
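
Since the downloaded HTML may not match what you expect, it can help to check which ids the page really contains and which expression matches. The following is only a minimal sketch: SAMPLE_HTML is a made-up stand-in for page.text (the real Amazon markup may differ), and first_matching_xpath and element_ids are hypothetical helper names, not part of lxml.

```python
from lxml import html

# Hypothetical stand-in for page.text; the real Amazon markup may differ.
SAMPLE_HTML = """
<html><body>
  <div id="technicalProductFeaturesATF">
    <ul>
      <li>Fun, credit card-sized prints</li>
      <li>LCD film counter and shooting mode display</li>
    </ul>
  </div>
</body></html>
"""

# Candidate expressions: the one from the question, then the one from the answer.
CANDIDATES = [
    '//*[@id="feature-bullets"]/ul/li/span/text()',
    '//div[@id="technicalProductFeaturesATF"]/ul/li/text()',
]


def first_matching_xpath(html_text, candidates):
    """Return (expression, results) for the first candidate that matches anything."""
    tree = html.fromstring(html_text)
    for expression in candidates:
        # Drop whitespace-only text nodes so an "empty" match is not reported.
        results = [text.strip() for text in tree.xpath(expression) if text.strip()]
        if results:
            return expression, results
    return None, []


def element_ids(html_text):
    """List every id attribute present in the given HTML."""
    return html.fromstring(html_text).xpath('//@id')


expression, bullets = first_matching_xpath(SAMPLE_HTML, CANDIDATES)
print(expression)
print(bullets)
print(element_ids(SAMPLE_HTML))
```

With the real page, you would pass page.text instead of SAMPLE_HTML. If element_ids does not list the id your expression relies on, the expression cannot match, no matter what the browser's inspector shows.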
