Scraping a website with Python: why do I get only the tags and not their contents?


Problem description

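I want to scrape movie box-office data from http://www.cbooo.cn/movieweek. Specifically, I want the figures at the very bottom of the page: [Box office dates: 2016-11-14 to 2016-11-20; weekly box office: 57271万; weekly screenings: 1463995; weekly admissions: 1781万] (万 = 10,000). My code is: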

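from bs4 import BeautifulSoup
import urllib.request

# Read the page URL from the user and download the raw HTML
url = input("Enter the URL: ")
html = urllib.request.urlopen(url).read()

# Parse the HTML and select the summary block at the bottom of the page
soup = BeautifulSoup(html, "html.parser")
blocks = soup.select("#content > div.alldate")

for block in blocks:
    print(block.get_text())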

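The output is:
票房日期:
单月票房:万
单月场次:万场
单月人次:万

(That is, only the labels "Box office dates", "Monthly box office", "Monthly screenings" and "Monthly admissions" with their units, and no values.)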

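The key figures are missing. What is going on? Those values are exactly what I need, and nothing I try gets them. Any help would be much appreciated.

Thanks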

Solution

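The data you need is generated dynamically via AJAX, so it is not present in the HTML source that urllib downloads. One option is a tool that can execute JavaScript, such as selenium + PhantomJS, but that approach is relatively slow.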
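
To see why only the labels come back, here is a minimal sketch using just the standard library. The markup in STATIC_HTML is hypothetical (the real page's HTML may well differ); it only mimics a page whose value slots are empty until JavaScript fills them in. TextCollector and visible_text are illustrative names, not part of any library.

```python
from html.parser import HTMLParser

# Hypothetical markup: what a server might send before any JavaScript runs.
# The real page may differ; the point is that labels exist but values are empty.
STATIC_HTML = """
<div id="content">
  <div class="alldate">
    <span>票房日期:<em id="sdate"></em></span>
    <span>单周票房:<em id="boxoffice"></em>万</span>
  </div>
</div>
"""


class TextCollector(HTMLParser):
    """Collects every non-empty text node in the document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def visible_text(html):
    """Return the list of text fragments found in the static HTML."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    # Only the labels and the unit are printed; the <em> slots hold no text.
    print(visible_text(STATIC_HTML))
```

Any parser that works on the downloaded HTML alone, BeautifulSoup included, sees the same thing: the numbers are simply not in the document yet.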

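For this particular site, though, capturing the network traffic in the browser shows that the AJAX request goes to http://www.cbooo.cn/BoxOffice/getWeekInfoData?sdate=2016-11-14

So you can send that request directly: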

urllib.request.urlopen("http://www.cbooo.cn/BoxOffice/getWeekInfoData?sdate=2016-11-14").read()

There is no need for PhantomJS here. The returned JSON string contains the data you need: it is in data2, at the end.

{

"data1": [
    {
        "MovieRank": "1",
        "MovieID": "640103",
        "MovieName": "我不是潘金莲",
        "WeekAmount": "20531",
        "SumWeekAmount": "20553",
        "People": "644",
        "MovieDay": "3",
        "AvgPrice": "32",
        "AvgPeople": "27",
        "Amount_Up": "0",
        "Screen_Up": "0",
        "People_Up": "0",
        "DefaultImage": "http://www.cbooo.cn/moviepic/229639.jpg",
        "Rank_Up": "0",
        "WomIndex": "0.00"
    },
    {
        "MovieRank": "2",
        "MovieID": "325408",
        "MovieName": "奇异博士",
        "WeekAmount": "13324",
        "SumWeekAmount": "70321",
        "People": "380",
        "MovieDay": "17",
        "AvgPrice": "35",
        "AvgPeople": "13",
        "Amount_Up": "-51",
        "Screen_Up": "-40",
        "People_Up": "-51",
        "DefaultImage": "http://www.cbooo.cn/moviepic/108737.jpg",
        "Rank_Up": "-1",
        "WomIndex": "8.32"
    },
    {
        "MovieRank": "3",
        "MovieID": "625158",
        "MovieName": "比利·林恩的中场战事",
        "WeekAmount": "5474",
        "SumWeekAmount": "13561",
        "People": "122",
        "MovieDay": "10",
        "AvgPrice": "45",
        "AvgPeople": "7",
        "Amount_Up": "-32",
        "Screen_Up": "-1",
        "People_Up": "-42",
        "DefaultImage": "http://www.cbooo.cn/moviepic/217130.jpg",
        "Rank_Up": "-1",
        "WomIndex": "8.20"
    },
    {
        "MovieRank": "4",
        "MovieID": "656548",
        "MovieName": "深海浩劫",
        "WeekAmount": "5441",
        "SumWeekAmount": "5441",
        "People": "195",
        "MovieDay": "6",
        "AvgPrice": "28",
        "AvgPeople": "12",
        "Amount_Up": "0",
        "Screen_Up": "0",
        "People_Up": "0",
        "DefaultImage": "http://www.cbooo.cn/moviepic/216485.jpg",
        "Rank_Up": "0",
        "WomIndex": "0.00"
    },
    {
        "MovieRank": "5",
        "MovieID": "653289",
        "MovieName": "航海王之黄金城",
        "WeekAmount": "3201",
        "SumWeekAmount": "10185",
        "People": "116",
        "MovieDay": "10",
        "AvgPrice": "27",
        "AvgPeople": "7",
        "Amount_Up": "-54",
        "Screen_Up": "14",
        "People_Up": "-55",
        "DefaultImage": "http://www.cbooo.cn/moviepic/232344.jpg",
        "Rank_Up": "-2",
        "WomIndex": "8.70"
    },
    {
        "MovieRank": "6",
        "MovieID": "627541",
        "MovieName": "外公芳龄38",
        "WeekAmount": "2129",
        "SumWeekAmount": "5635",
        "People": "82",
        "MovieDay": "10",
        "AvgPrice": "26",
        "AvgPeople": "7",
        "Amount_Up": "-39",
        "Screen_Up": "31",
        "People_Up": "-39",
        "DefaultImage": "http://www.cbooo.cn/moviepic/227040.jpg",
        "Rank_Up": "-2",
        "WomIndex": "8.03"
    },
    {
        "MovieRank": "7",
        "MovieID": "626571",
        "MovieName": "勇士之门",
        "WeekAmount": "1715",
        "SumWeekAmount": "1715",
        "People": "56",
        "MovieDay": "3",
        "AvgPrice": "31",
        "AvgPeople": "6",
        "Amount_Up": "0",
        "Screen_Up": "0",
        "People_Up": "0",
        "DefaultImage": "http://www.cbooo.cn/moviepic/210856.jpg",
        "Rank_Up": "0",
        "WomIndex": "0.00"
    },
    {
        "MovieRank": "8",
        "MovieID": "633157",
        "MovieName": "阿拉丁与神灯",
        "WeekAmount": "1338",
        "SumWeekAmount": "1338",
        "People": "53",
        "MovieDay": "3",
        "AvgPrice": "25",
        "AvgPeople": "9",
        "Amount_Up": "0",
        "Screen_Up": "0",
        "People_Up": "0",
        "DefaultImage": "http://www.cbooo.cn/moviepic/231914.jpg",
        "Rank_Up": "0",
        "WomIndex": "0.00"
    },
    {
        "MovieRank": "9",
        "MovieID": "628324",
        "MovieName": "驴得水",
        "WeekAmount": "818",
        "SumWeekAmount": "17104",
        "People": "26",
        "MovieDay": "24",
        "AvgPrice": "31",
        "AvgPeople": "9",
        "Amount_Up": "-72",
        "Screen_Up": "-68",
        "People_Up": "-72",
        "DefaultImage": "http://www.cbooo.cn/moviepic/236741.jpg",
        "Rank_Up": "-4",
        "WomIndex": "8.16"
    },
    {
        "MovieRank": "10",
        "MovieID": "627597",
        "MovieName": "夏有乔木 雅望天堂",
        "WeekAmount": "437",
        "SumWeekAmount": "15631",
        "People": "11",
        "MovieDay": "108",
        "AvgPrice": "40",
        "AvgPeople": "110",
        "Amount_Up": "0",
        "Screen_Up": "0",
        "People_Up": "0",
        "DefaultImage": "http://www.cbooo.cn/moviepic/216992.jpg",
        "Rank_Up": "0",
        "WomIndex": ""
    }
],
"data2": [
    {
        "sDate": "2016-11-14至2016-11-20",
        "BoxOffice": "57271",
        "ShoCount": "1463995",
        "AudienceCount": "1781"
    }
] }
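
The data2 entry in the response above can be read with the standard json module. The following is a rough sketch rather than a definitive implementation: the helper names (build_url, parse_week_summary, fetch_week_summary) and the output keys are made up for illustration, the response is assumed to be UTF-8 encoded, and the endpoint may no longer respond the way it did when this answer was written.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the answer; its continued availability is an assumption.
BASE_URL = "http://www.cbooo.cn/BoxOffice/getWeekInfoData"


def build_url(start_date):
    """Return the request URL for the week starting on start_date (YYYY-MM-DD)."""
    return BASE_URL + "?" + urllib.parse.urlencode({"sdate": start_date})


def parse_week_summary(raw_json):
    """Extract the weekly summary (the single entry in "data2") from JSON text."""
    payload = json.loads(raw_json)
    summary = payload["data2"][0]
    return {
        "dates": summary["sDate"],
        "box_office": int(summary["BoxOffice"]),      # in units of 万 (10,000)
        "screenings": int(summary["ShoCount"]),
        "admissions": int(summary["AudienceCount"]),  # in units of 万 (10,000)
    }


def fetch_week_summary(start_date, timeout=10):
    """Download and parse the summary; assumes a UTF-8 encoded response."""
    with urllib.request.urlopen(build_url(start_date), timeout=timeout) as resp:
        return parse_week_summary(resp.read().decode("utf-8"))
```

Calling fetch_week_summary("2016-11-14") would then return a small dict with the dates, box office, screenings and admissions, provided the server still answers in the format shown above. Keeping the parsing separate from the download makes it easy to check against a saved response.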
