BeautifulSoup Python unable to scrape data from a website


Question

I have been using Python and BeautifulSoup to scrape data, and so far it has worked. But I am stuck on the following website.

Target site: LyricsHindiSong

My goal is to scrape song lyrics from this website. But every time it gives either a blank result or a "NoneType object has no attribute" kind of error.

I have been struggling with this for the last 15 days and cannot figure out where the problem is or how to fix it.

Following is the code I am using.

import pymysql
import requests
from bs4 import Beautifulsoup

r=requests.get("https://www.lyricshindisong.in/2020/04/chnda-re-chnda-re-chhupe-rahana.html")
soup=Beautifulsoup(r.content,'html5lib')
pageTitle=soup.find('h1').text.strip()
targetContent=soup.find('div',{'style':'margin:25px; color:navy;font-size:18px;'})
print(pageTitle)
print(targetContent.text.strip())

It raises a "'NoneType' object has no attribute 'text'" error. If I check in the browser's inspect window, both elements are present. I cannot understand where the problem is. At least it should have printed the page title.

I hope my requirement is clear. Please guide me. Thanks.

Recommended answer

You misspelled the class name imported from the bs4 library (it is BeautifulSoup, not Beautifulsoup) and used the find method instead of find_all.

Full code:

import requests
from bs4 import BeautifulSoup


url = "https://www.lyricshindisong.in/2020/04/chnda-re-chnda-re-chhupe-rahana.html"
response = requests.get(url)

soup = BeautifulSoup(response.content,'html5lib')

title = soup.find('h1').text.strip()
content = soup.find_all('div',{'style':'margin:25px; color:navy;font-size:18px;'})

print(title)

for line in content:
    print(line.text.strip())

Result:

python answer.py
Chnda Re Chnda Re Chhupe Rahana
चंदा रे, चंदा रे, छुपे रहनासोये मेरी मैना, लेके मेरी निंदिया रे
फूल चमेली धीरे महको, झोका ना लगा जाये नाजुक डाली कजरावाली सपने में मुस्काये लेके मेरी निंदिया रे
हाथ कहीं है, पाँव कहीं है, लागे प्यारी प्यारी ममता गाए, पवन झुलाये, झूले राजकुमारी लेके मेरी निंदिया रे  
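
The "NoneType" error from the question appears whenever find matches nothing and the code then reads .text on the returned None. One way to guard against that is sketched below. This is only an illustration under stated assumptions: SAMPLE_HTML is a made-up stand-in for the downloaded page, extract_lyrics is a hypothetical helper name, and the built-in html.parser is used instead of html5lib so that no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the downloaded page (not the real site).
SAMPLE_HTML = """
<html><body>
<h1> Sample Title </h1>
<div style="margin:25px; color:navy;font-size:18px;">first stanza</div>
<div style="margin:25px; color:navy;font-size:18px;">second stanza</div>
</body></html>
"""

# Inline style value used in the question and the answer to locate the lyrics blocks.
LYRICS_STYLE = "margin:25px; color:navy;font-size:18px;"


def extract_lyrics(html):
    """Return (title, stanzas); title is None when the page has no <h1>."""
    soup = BeautifulSoup(html, "html.parser")

    # find returns None when nothing matches, so check before reading text.
    heading = soup.find("h1")
    title = heading.get_text(strip=True) if heading is not None else None

    # find_all returns an empty list when nothing matches, so the loop is safe.
    blocks = soup.find_all("div", {"style": LYRICS_STYLE})
    stanzas = [block.get_text(strip=True) for block in blocks]
    return title, stanzas


if __name__ == "__main__":
    title, stanzas = extract_lyrics(SAMPLE_HTML)
    print(title)
    for stanza in stanzas:
        print(stanza)
```

With the real page, the same helper could be called as extract_lyrics(response.content) after requests.get(url). An empty stanza list then signals that the style attribute did not match, instead of raising an exception.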
