python - Using Scrapy: how do I request a new URL and have a specified function called back?

Problem description

A question about using Scrapy under Python 3

import re
import scrapy
from bs4 import BeautifulSoup
from scrapy.http import Request
from ..items import ZhibobaItem
import json
import lxml.html
import requests


class Myspider(scrapy.Spider):
    name = 'zhiboba'
    allowed_domains = ['zhibo8.cc']
    json_url = 'https://bifen4pc.qiumibao.com/json/list.htm?85591'
    bash_url = 'https://www.zhibo8.cc/'

    def start_requests(self):
        yield Request(self.bash_url, self.parse_index)

    def parse_index(self, response):
        print("enter the parse_index")
        print(self.bash_url)
        divs = BeautifulSoup(response.text, 'lxml').find_all(label=re.compile("足球"))
        item = ZhibobaItem()
        for single_div in divs:
            item['label'] = single_div.get('label')
            item['sdate'] = single_div.get('data-time')
            item['linkurl'] = self.bash_url + single_div.find('a')['href']
            home_team = single_div.get_text().split()[2]
            item['home_team'] = home_team
            visit_team = single_div.get_text().split()[4]
            item['visit_team'] = visit_team
            print("quit the parse_index")
            print(self.json_url)
            yield Request(self.json_url, callback=self.get_score, meta={'home_team': home_team,
                                                                        'visit_team': visit_team
                                                                        })
    def get_score(self, response):
        print("enter the get_score")
        # Scrapy has already downloaded the body; Response has no get() method
        wbdata = response.text
        data = json.loads(wbdata)
        news = data['list']
        print(wbdata)
        print("quit the get_score")

When I run the code above, the request to json_url is never made and its callback get_score is never called. What is wrong?

Solution

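Try changing allowed_domains, for example to allowed_domains = [] (or add 'qiumibao.com' to the list). json_url points at bifen4pc.qiumibao.com, which is outside the only allowed domain, zhibo8.cc, so Scrapy's offsite filtering drops the request before it is sent and get_score never runs.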
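
To see why the request is dropped, here is a minimal sketch of the kind of host check involved. It uses only the standard library and is an approximation for illustration, not Scrapy's actual implementation; is_offsite is a hypothetical helper name, and the two URLs are the ones from the question.

```python
from urllib.parse import urlparse


def is_offsite(url, allowed_domains):
    """Rough stand-in for an offsite check (hypothetical helper, not Scrapy's API)."""
    # An empty allowed_domains list disables the restriction entirely.
    if not allowed_domains:
        return False
    host = (urlparse(url).hostname or "").lower()
    # A host is allowed if it equals an allowed domain or is a subdomain of one.
    return not any(host == d or host.endswith("." + d) for d in allowed_domains)


json_url = "https://bifen4pc.qiumibao.com/json/list.htm?85591"
index_url = "https://www.zhibo8.cc/"

print(is_offsite(index_url, ["zhibo8.cc"]))                 # False: www.zhibo8.cc is a subdomain
print(is_offsite(json_url, ["zhibo8.cc"]))                  # True: this is the filtered request
print(is_offsite(json_url, []))                             # False: no restriction at all
print(is_offsite(json_url, ["zhibo8.cc", "qiumibao.com"]))  # False: domain explicitly allowed
```

In the spider itself, the narrower fix would be allowed_domains = ['zhibo8.cc', 'qiumibao.com']. Note also that every iteration of the loop requests the same json_url, so Scrapy's duplicate filter would let only the first of those requests through unless dont_filter=True is passed to Request.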
