Blank List returned when using XPath with Morningstar Key Ratios

Problem description

I am trying to pull a piece of data from the Morningstar Key Ratios page for any given stock using XPath. I have the full path, and it returns a result in the XPath Helper toolbar add-on for Google Chrome, but when I plug it into my code I get a blank list back.

How do I get the result that I want returned? Is this even possible? Am I using the wrong approach?

Any help is much appreciated!

Piece of data that I want returned:

AMD Key Ratios Example:

My Code:

from urllib.request import urlopen
import os.path
import sys
from lxml import html
import requests

page = requests.get('http://financials.morningstar.com/ratios/r.html?t=AMD&region=USA&culture=en_US')
tree = html.fromstring(page.content)
rev = tree.xpath('/html/body/div[1]/div[3]/div[2]/div[1]/div[1]/div[1]/table/tbody/tr[2]/td[1]')
print(rev)

Result of code:

[]

Desired result from XPath Helper:

Thanks, Not Euler

Solution

This is one of those pages that downloads much of its content in stages, after the initial HTML has arrived. If you look for the item you want in the text that requests alone returns, you will find that it is not there yet, as shown here.

>>> import requests
>>> url = 'http://financials.morningstar.com/ratios/r.html?t=AMD&region=USA&culture=en_US'
>>> page = requests.get(url).text
>>> '5,858' in page
False
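
To make the effect concrete, here is a minimal sketch that uses only the standard library. The two HTML strings are made-up stand-ins, not Morningstar's actual markup, and the names CellText and td_cells are hypothetical. The first string plays the role of the response that requests receives; the second plays the role of the document after the browser has filled in the table.

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the text content of every <td> element in an HTML string."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.cells[-1] += data


def td_cells(markup):
    """Return the stripped text of each <td> found in markup."""
    parser = CellText()
    parser.feed(markup)
    parser.close()
    return [cell.strip() for cell in parser.cells]


# Hypothetical stand-in for the initial response: an empty container plus a script.
INITIAL_HTML = (
    '<html><body><div id="financials"></div>'
    '<script src="load.js"></script></body></html>'
)

# Hypothetical stand-in for the document after the script has added the table.
RENDERED_HTML = (
    '<html><body><div id="financials"><table>'
    '<tr><th id="i0">Revenue</th><td>5,858</td></tr>'
    '</table></div></body></html>'
)

print(td_cells(INITIAL_HTML))   # []
print(td_cells(RENDERED_HTML))  # ['5,858']
```

Any parser that only sees the first string, lxml included, has no table cell to match, which is consistent with the empty list in the question.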

One strategy for processing these pages involves the selenium library. Here, selenium launches a copy of the Chrome browser, loads the URL, and then uses an XPath expression to locate the td element of interest. Finally, the number you want is available as the text property of that element.

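>>> from selenium import webdriver
>>> from selenium.webdriver.common.by import By
>>> driver = webdriver.Chrome()
>>> driver.get(url)
>>> # Selenium 4 removed find_element_by_xpath; the td is a sibling of the th, not its child
>>> td = driver.find_element(By.XPATH, './/th[@id="i0"]/following-sibling::td[1]')
>>> td
<selenium.webdriver.remote.webelement.WebElement (session="f436b07c27742abb36b262639245801f", element="0.12745670001529863-2")>
>>> td.text
'5,858'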
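
Because the content arrives in stages, the cell may not exist yet at the moment the lookup runs. A possible refinement, sketched here under assumptions, is to wait explicitly for the element before reading it. The names fetch_cell_text and to_number are hypothetical, the 15-second timeout is arbitrary, and the code assumes Selenium 4 with Chrome and a matching driver installed.

```python
def fetch_cell_text(url, xpath, timeout=15):
    """Load url in Chrome, wait until xpath matches, and return the element's text."""
    # Imported inside the function so to_number below works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Poll the page until the element is present or the timeout expires.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.XPATH, xpath))
        )
        return element.text
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


def to_number(text):
    """Convert a formatted figure such as '5,858' to a float."""
    return float(text.replace(",", ""))


# Example call (needs Chrome and a network connection, so it is left commented out).
# The XPath assumes the td cells are siblings of the th with id "i0".
# url = 'http://financials.morningstar.com/ratios/r.html?t=AMD&region=USA&culture=en_US'
# text = fetch_cell_text(url, './/th[@id="i0"]/following-sibling::td[1]')

print(to_number("5,858"))  # 5858.0
```

If the element never appears, WebDriverWait raises a TimeoutException rather than returning an empty result, which makes a missing cell easier to notice than a silent blank list.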
