Failure in scraping the flight data table from an airport website

Problem description

I have been trying to scrape the arrival and departure data for domestic flights from the website of New Delhi International Airport. I have tried almost everything, but I cannot extract the data: when I run the code, it returns nothing. Similar code worked on another airport's website. Here is the code I wrote.

import requests
from bs4 import BeautifulSoup

res = requests.get("https://m.newdelhiairport.in/live-flight-information-all.aspx?FLMode=A&FLType=D")
soup = BeautifulSoup(res.content, 'html5lib')
table = soup.find_all('tbody', {'class': 'arr_dep_table_body'})
print(table)

Here is the link to the website: https://m.newdelhiairport.in/live-flight-information-all.aspx?FLMode=A&FLType=D

A screenshot of the website

Solution

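As mentioned, you can request the alternative URL that the data is actually sourced from. The page loads its flight table from that URL through an XHR request, which is presumably why the original code finds nothing in the static HTML. You will need to add a User-Agent header to the request.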

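from io import StringIO

import pandas as pd
import requests

# Endpoint the page calls via XHR to load the flight table
url = 'https://m.newdelhiairport.in/get-all-Fids-FlightInfo.aspx?FltType=D&FltWay=A&FltNum=&FltFrom=&rn=0.992638793938065'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
res = requests.get(url, headers=headers)
# read_html returns a list of DataFrames, one per <table> in the response
tables = pd.read_html(StringIO(res.text))
print(tables)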


I pulled the URL from the browser's network tab: I opened the network tab, reloaded the page, and then inspected the XHR traffic.
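
Since the question asks for both arrivals and departures, the request above could be wrapped in a small helper. What follows is only a sketch under stated assumptions: `build_fids_url` and `fetch_flights` are hypothetical names, `FltWay=D` selecting departures is an untested guess based on `FltWay=A` being used for arrivals, and the `rn` parameter is left out on the assumption that it is just a cache-busting random number the server ignores.

```python
from io import StringIO
from urllib.parse import urlencode

# Endpoint taken from the answer above.
FIDS_ENDPOINT = "https://m.newdelhiairport.in/get-all-Fids-FlightInfo.aspx"

# Browser-like User-Agent copied from the answer above.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
    )
}


def build_fids_url(flt_way="A", flt_type="D"):
    """Build the XHR URL for the flight table.

    flt_way="A" and flt_type="D" are the values used in the answer.
    ASSUMPTION: flt_way="D" selects departures; this has not been verified.
    ASSUMPTION: the 'rn' query parameter can be omitted.
    """
    params = {"FltType": flt_type, "FltWay": flt_way, "FltNum": "", "FltFrom": ""}
    return FIDS_ENDPOINT + "?" + urlencode(params)


def fetch_flights(flt_way="A"):
    """Fetch the endpoint and return the first HTML table as a DataFrame."""
    # Imported here so build_fids_url can be used without third-party packages.
    import pandas as pd
    import requests

    res = requests.get(build_fids_url(flt_way), headers=HEADERS, timeout=30)
    res.raise_for_status()  # fail loudly on HTTP errors
    # read_html raises ValueError if the response contains no <table>
    tables = pd.read_html(StringIO(res.text))
    return tables[0]


if __name__ == "__main__":
    print(build_fids_url("A"))
    # arrivals = fetch_flights("A")    # needs network access
    # departures = fetch_flights("D")  # relies on the FltWay=D assumption
```

If the endpoint turns out to require `rn`, adding something like `"rn": random.random()` to `params` would mimic what the browser sends. Note that `pd.read_html` needs an HTML parser such as `lxml` (or `bs4` with `html5lib`) to be installed.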
