Beautiful Soup returns empty list

Problem description

I am new to web scraping, and I have been given the task of extracting data from the Kaggle page whose URL appears in the code below.

I chose the "comments" dataset. Below is my scraping code.

import requests
from bs4 import BeautifulSoup

url = 'https://www.kaggle.com/hacker-news/hacker-news'
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers)
print(response.status_code)  # check that the request succeeded

soup = BeautifulSoup(response.content, 'html.parser')
# Look for the table body using the class names seen in the browser's inspector
rows = soup.find_all('tbody', class_='TableBody-kSbjpE jGqIxa')
print(rows)

When I run the last command, it returns [].

So I am stuck here. I know we can get the data from a kernel, but for practice I would like to know where I am going wrong. Am I choosing the wrong class? I want to scrape the data and save it to a CSV file or to a NoSQL database, preferably Cassandra.

Solution

You are getting [] because the data you want to scrape comes from an API that is called after the web page has loaded, so the HTML you are downloading does not contain that class.
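The effect can be reproduced offline. The sketch below is only an illustration: STATIC_HTML is a made-up stand-in for what a server might send before any JavaScript runs, and the "site-content" id is an assumption, not taken from the real Kaggle page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the HTML a server sends before any JavaScript runs.
STATIC_HTML = """
<html>
  <body>
    <div id="site-content"></div>
    <script>
      // In a browser, a script like this would fetch the rows from an API
      // and insert the table into the empty container above.
    </script>
  </body>
</html>
"""

soup = BeautifulSoup(STATIC_HTML, "html.parser")

# The table body would only exist after the script has run, so nothing is found.
rows = soup.find_all("tbody", class_="TableBody-kSbjpE jGqIxa")
print(rows)

# The empty container that the script would fill is present, though.
container = soup.find("div", id="site-content")
print(container is not None)
```

requests only downloads the markup and never executes scripts, so Beautiful Soup sees the page in this pre-JavaScript state.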

You can open your browser's developer tools and check the Network tab, as shown in the screenshot. There you will find the request that returns the data you want, so you need to make your request to that URL instead.

You can retrieve the data from this URL; in the Preview tab you can see all of the data.
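As a rough sketch of that approach, the code below requests a JSON endpoint directly and writes the records to CSV, since that was the stated goal. API_URL is a placeholder, not the real endpoint, and the helper names fetch_records and records_to_csv are hypothetical. The code also assumes the response is a JSON list of flat objects, which may not match what the Network tab actually shows.

```python
import csv
import io

# Hypothetical endpoint: replace with the request URL copied from the Network tab.
API_URL = "https://example.com/api/comments.json"


def fetch_records(url):
    """Download JSON from `url`; assumes the body is a list of flat objects."""
    import requests  # imported lazily so the CSV helper works without it

    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.json()


def records_to_csv(records, out):
    """Write a list of dicts to the file-like object `out` as CSV."""
    if not records:
        return
    # Collect every key that appears, in first-seen order, for the header row.
    fieldnames = list(dict.fromkeys(key for record in records for key in record))
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)


# Offline demonstration with made-up records shaped like the assumed response.
sample = [
    {"id": 1, "by": "alice", "text": "first comment"},
    {"id": 2, "by": "bob", "text": "second comment"},
]
buffer = io.StringIO()
records_to_csv(sample, buffer)
print(buffer.getvalue())
```

To save real data, open a file with open("comments.csv", "w", newline="") and pass it to records_to_csv together with the result of fetch_records(API_URL). If the real response nests the rows under a key, extract that list first.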

Also, if you have a good knowledge of Python, you can use Scrapy to scrape the data:

https://doc.scrapy.org/en/latest/intro/overview.html
