Web-scraping futbin.com
Problem description
I am trying to collect a dataset of time series data for FIFA Ultimate Team players from futbin.com. I found a script on GitHub (https://github.com/darkyin87/futbin-scraper) that scrapes the current price of each player in a given list of players/IDs:
import requests

domain = 'https://www.futbin.com'
version = 19
page = 'playerPrices'
player_ids = {
    'Arturo Vidal': 181872,
    'Pierre-Emerick Aubameyang': 188567,
    'Robert Lewandowski': 188545,
    'Jerome Boateng': 183907,
    'Sergio Ramos': 155862,
    'Antoine Griezmann': 194765,
    'David Alaba': 197445,
    'Paulo Dybala': 211110,
    'Radja Nainggolan': 178518
}

def fetch_prices():
    # Map each player name to the current PS price.
    ret_val = {}
    for name, id in player_ids.items():
        url = "%s/%s/%s?player=%s" % (domain, version, page, id)
        response = requests.get(url)
        data = response.json()
        ret_val[name] = data[str(id)]['prices']['ps']['LCPrice']
    return ret_val

if __name__ == "__main__":
    prices = fetch_prices()
    print(prices)
But the information I am looking for is not the current price but the price history (specifically the PS price), which is shown as a graph at the bottom of the player page: https://www.futbin.com/19/player/143/Cristiano%20Ronaldo/
I tried a few things, but I seem to be unable to parse/extract this information. Could someone help me out or give me a hint? Thanks in advance.
It is hard to get the data by parsing the page. If you check your browser's network tools, you can see that the data used to draw the chart comes from a separate HTTP request, which you can call directly. Don't abuse it, of course.
import requests
from datetime import datetime, timezone

player_ids = {
    'Arturo Vidal': 181872,
    'Pierre-Emerick Aubameyang': 188567,
    'Robert Lewandowski': 188545,
    'Jerome Boateng': 183907,
    'Sergio Ramos': 155862,
    'Antoine Griezmann': 194765,
    'David Alaba': 197445,
    'Paulo Dybala': 211110,
    'Radja Nainggolan': 178518
}

for (name, id) in player_ids.items():
    r = requests.get('https://www.futbin.com/19/playerGraph?type=daily_graph&year=19&player={0}'.format(id))
    data = r.json()
    print(name)
    print("-" * 20)
    # Change 'ps' to 'xbox' or 'pc' to get other prices.
    for price in data['ps']:
        # Timestamps are in milliseconds, so divide by 1000 to get seconds.
        date = datetime.fromtimestamp(price[0] / 1000, tz=timezone.utc).strftime('%Y-%m-%d')
        print(date, price[1])
This will give you:
Arturo Vidal
--------------------
2018-09-21 8450
2018-09-22 9318
2018-09-23 10820
2018-09-24 13288
2018-09-25 13346
2018-09-26 17235
2018-09-27 19092
2018-09-28 15960
2018-09-29 14283
2018-09-30 14967
2018-10-01 15380
2018-10-02 15367
2018-10-03 13192
Pierre-Emerick Aubameyang
--------------------
2018-09-21 136000
2018-09-22 160673
2018-09-23 205474
2018-09-24 216344
2018-09-25 244750
2018-09-26 277007
2018-09-27 288659
2018-09-28 259007
2018-09-29 261799
2018-09-30 270771
2018-10-01 274245
2018-10-02 281057
2018-10-03 275606
Robert Lewandowski
--------------------
2018-09-21 73000
2018-09-22 79961
2018-09-23 94827
2018-09-24 117893
2018-09-25 125310
2018-09-26 144630
2018-09-27 159224
2018-09-28 135122
2018-09-29 132696
2018-09-30 137728
2018-10-01 143130
2018-10-02 150968
2018-10-03 144250
And the list goes on.
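Since the goal is a dataset rather than printed lines, the same response can be flattened into rows and written out as CSV. The following is only a sketch: it assumes the response has the shape used above (a dict keyed by platform, each holding `[timestamp_ms, price]` pairs), and the names `history_rows`, `write_csv` and `sample` are made up for illustration. The sample stands in for `r.json()` and reuses the first two values from the output above, so nothing here hits the network.

```python
import csv
import io
from datetime import datetime, timezone


def history_rows(name, data, platform="ps"):
    """Turn one playerGraph-style response into (player, date, price) rows."""
    rows = []
    for timestamp_ms, price in data.get(platform, []):
        # Timestamps are assumed to be in milliseconds, as in the code above.
        day = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
        rows.append((name, day.strftime("%Y-%m-%d"), price))
    return rows


def write_csv(rows, handle):
    """Write a header line and the rows to any file-like object."""
    writer = csv.writer(handle)
    writer.writerow(["player", "date", "price"])
    writer.writerows(rows)


# Hypothetical sample shaped like the response; in practice use data = r.json().
sample = {"ps": [[1537488000000, 8450], [1537574400000, 9318]]}

rows = history_rows("Arturo Vidal", sample)
buffer = io.StringIO()  # swap for open("prices.csv", "w", newline="") to save to disk
write_csv(rows, buffer)
print(buffer.getvalue())
```

To build the full dataset, you would call `history_rows` once per player inside the loop above and extend a single list before writing it out.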