BeautifulSoup does not return all data

Question

I'm trying to parse today's moon phase data using Python's BeautifulSoup library.

from bs4 import BeautifulSoup
import urllib2

moon_url = "http://www.moongiant.com/phase/today/"


try:
    rqest =  urllib2.urlopen(moon_url)
    moon_Soup = BeautifulSoup(rqest, 'lxml')
    moon_angle = 0
    moon_illumination = 0
    main_data = moon_Soup.find('div', {'id' : 'moonDetails'})
    print main_data

except urllib2.URLError:
    print "Error"

But instead of this output:

<div id="moonDetails">        
      Phase: <span>Waxing Crescent</span><br>Illumination: <span>36%
</span><br>Moon Age: <span>6.00 days</span><br>Moon Angle: <span>0.55</span><br>Moon Distance: <span>364,</span>434.78 km<br>Sun Angle: <span>0.53</span><br>Sun Distance: <span>149,</span>571,918.47 km<br>
</div>

I only get this:

<div id="moonDetails">
</div>

Any ideas?

Recommended answer

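Actually, after RaminNietzsche's comment, I used the dryscrape library. dryscrape loads the page in a headless WebKit browser, so the HTML it returns includes content that JavaScript adds after the initial load, which a plain urllib2 request does not see.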

from bs4 import BeautifulSoup
import dryscrape

moon_url = "http://www.moongiant.com/phase/today/"

# Render the page in a headless session, then pass the resulting HTML
# to BeautifulSoup (the separate urllib2 request is no longer needed).
session = dryscrape.Session()
session.visit(moon_url)
response = session.body()
soup = BeautifulSoup(response, 'lxml')

# findAll returns a list of every matching div
moon_data = soup.findAll('div', {'id': 'moonDetails'})
print moon_data

As a result, the output is now:

<div id="moonDetails">        
      Phase: <span>Waxing Crescent</span><br>Illumination: <span>36%
</span><br>Moon Age: <span>6.00 days</span><br>Moon Angle: <span>0.55</span><br>Moon Distance: <span>364,</span>434.78 km<br>Sun Angle: <span>0.53</span><br>Sun Distance: <span>149,</span>571,918.47 km<br>
</div>
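
The difference between the two outputs can be reproduced offline. The sketch below is only an illustration under assumptions: the `STATIC_HTML` string is a made-up stand-in for what the server presumably sends (an empty `moonDetails` div plus a script that fills it in), not the real moongiant.com markup. It uses the built-in `html.parser` backend and is written for Python 3, unlike the Python 2 snippets above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the served page: the div is empty in the raw HTML
# and a script is expected to fill it in once a browser runs it.
STATIC_HTML = """
<html><body>
<div id="moonDetails">
</div>
<script>
document.getElementById("moonDetails").textContent = "Phase: Waxing Crescent";
</script>
</body></html>
"""

soup = BeautifulSoup(STATIC_HTML, "html.parser")
details = soup.find("div", {"id": "moonDetails"})

# BeautifulSoup only parses the markup it is given; it never executes the
# script, so the div is found but contains nothing except whitespace.
is_empty = details.get_text(strip=True) == ""

print(details)
print(is_empty)
```

If the real page behaves like this stand-in, any tool that only downloads the HTML will see the empty div, and the data appears only after something that executes JavaScript (dryscrape here) has rendered the page.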

Thanks, everyone, for the answers!
