Why is my scraping code not copying a table from a webpage?

Question

I am trying to copy a table from a webpage. I tried:

library(XML)
url <- "https://www.cmegroup.com//content/cmegroup/en/trading/fx/g10/euro-fx_quotes_settlements_futures.html"
table1 <- readHTMLTable(url,stringsAsFactors = FALSE)
table1

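But this did not work.

Recommended answer

The table is not in the page source, so readHTMLTable has nothing to parse: the page fills the table in afterwards from a separate request. An alternative is to read the data directly from the XMLHttpRequest (XHR) endpoint that the page calls: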

library(jsonlite)

tbl <- fromJSON("https://www.cmegroup.com/CmeWS/mvc/Settlements/Futures/Settlements/58/FUT?tradeDate=03/26/2020&strategy=DEFAULT&pageSize=500&_=1585333229793")
tbl <- tbl$settlements

PS: For other dates, change the date part of the URL (03/26/2020).
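
Changing the date by hand can be wrapped in a small helper. The sketch below is only an illustration: `settlements_url` is a hypothetical name, the default product id 58 is simply copied from the URL above, and dropping the trailing `_=` parameter (which looks like a cache-busting timestamp) is an assumption that may or may not hold for this endpoint.

```r
# Hypothetical helper: build the settlements URL for a given trade date.
# The URL layout is copied from the answer; the endpoint may change over time.
settlements_url <- function(trade_date, product_id = 58) {
  trade_date <- as.Date(trade_date)
  sprintf(
    paste0(
      "https://www.cmegroup.com/CmeWS/mvc/Settlements/Futures/Settlements/",
      "%d/FUT?tradeDate=%s&strategy=DEFAULT&pageSize=500"
    ),
    product_id,
    format(trade_date, "%m/%d/%Y")  # the endpoint expects MM/DD/YYYY
  )
}

settlements_url("2020-03-26")
```

The result can then be passed to fromJSON() in the same way as the hard-coded URL.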

Output

tbl

#     month    open     high      low     last  change  settle  volume openInterest
# 1  APR 20 1.08900 1.10640B  1.08870 1.10440A +.01750 1.10520     282        4,215
# 2  MAY 20 1.09165 1.10775B  1.09045 1.10620A +.01715 1.10675     651        2,627
# 3  JUN 20 1.09230  1.10960  1.09090  1.10685 +.01715 1.10800 205,562      548,213
# 4  JLY 20 1.10650 1.11015B 1.10605A 1.10830A +.01710 1.10905       2            2
# 5  SEP 20 1.09625  1.11265  1.09435 1.11020A +.01710 1.11120     939        3,646
# 6  DEC 20 1.10645 1.11480B 1.10315A 1.11310A +.01725 1.11390      48        2,047
# 7  MAR 21 1.11000 1.11620B 1.10850A 1.11620B +.01725 1.11695       3          240
# 8  JUN 21       - 1.11680B        - 1.11680B +.01740 1.11960       0          144
# 9  SEP 21       -        -        -        - +.01745 1.12210       0            1
# 10 DEC 21       -        -        -        - +.01755 1.12460       0            3
# 11 MAR 22       -        -        -        - +.01770 1.12715       0            0
# 12 JUN 22       -        -        -        - +.01785 1.12985       0            0
# 13 SEP 22       -        -        -        - +.01805 1.13280       0            0
# 14 DEC 22       -        -        -        - +.01830 1.13560       0            0
# 15 MAR 23       -        -        -        - +.01845 1.13815       0            0
# 16 JUN 23       -        -        -        - +.01865 1.14110       0            0
# 17 SEP 23       -        -        -        - +.01885 1.14385       0            0
# 18 DEC 23       -        -        -        - +.01905 1.14660       0            0
# 19 MAR 24       -        -        -        - +.01925 1.14935       0            0
# 20 JUN 24       -        -        -        - +.01945 1.15215       0            0
# 21 SEP 24       -        -        -        - +.01965 1.15490       0            0
# 22 DEC 24       -        -        -        - +.01985 1.15765       0            0
# 23 MAR 25       -        -        -        - +.02005 1.16040       0            0
# 24  Total                                                    207,487      561,138
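
As the output shows, every column arrives as text: volumes contain thousands separators, some prices carry an A or B suffix, and missing values appear as "-". If numbers are needed, one possible cleanup is sketched below. `to_number` is a hypothetical helper, and discarding the letter suffix is an assumption about what is acceptable for your analysis.

```r
# Hypothetical helper: turn the character columns into numeric values.
to_number <- function(x) {
  x <- gsub(",", "", x, fixed = TRUE)  # drop thousands separators
  x <- sub("[AB]$", "", x)             # drop a trailing A/B marker (assumption)
  x[x %in% c("-", "")] <- NA           # "-" and blank cells become NA
  as.numeric(x)
}

to_number(c("1.10640B", "205,562", "-", "+.01750"))
```

It could be applied column by column, for example `tbl$settle <- to_number(tbl$settle)`.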

Scraping the options page

url <- "https://www.cmegroup.com/CmeWS/mvc/Quotes/Option/8118/G/J0/ATM?_=1585348999038"
jsonlist <- fromJSON(url)

The put and call columns are in separate data frames:

df_put <- jsonlist$optionContractQuotes$put
df_call <- jsonlist$optionContractQuotes$call
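
The put and call sides come back as separate data frames because jsonlite turns an array of JSON objects into a data frame and keeps nested objects as nested data-frame columns. The toy example below only mimics that nesting: the field names `strikePrice` and `last` and all values are made up for illustration and are not taken from the real CME response.

```r
library(jsonlite)

# Made-up JSON that mimics the nesting only; not the real CME payload.
toy <- '{"optionContractQuotes": [
  {"strikePrice": "1.10", "put": {"last": "0.012"}, "call": {"last": "0.015"}},
  {"strikePrice": "1.11", "put": {"last": "0.018"}, "call": {"last": "0.010"}}
]}'

jsonlist <- fromJSON(toy)
df_put  <- jsonlist$optionContractQuotes$put   # nested data frame
df_call <- jsonlist$optionContractQuotes$call  # nested data frame

# Combine both sides into one flat table, one row per strike.
quotes <- data.frame(
  strike    = jsonlist$optionContractQuotes$strikePrice,
  put_last  = df_put$last,
  call_last = df_call$last
)
quotes
```

With the real response, inspecting `str(jsonlist)` first is a safer way to find the actual column names.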

Follow this link to find the appropriate XHR URL.
