How to dynamically set Scrapy rules?
Question
I have a class that runs some code before `__init__`:
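from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor  # old (pre-1.0) Scrapy import paths

class NoFollowSpider(CrawlSpider):
    rules = (
        Rule(SgmlLinkExtractor(allow=("",)), callback="parse_items", follow=True),
    )

    def __init__(self, moreparams=None, *args, **kwargs):
        super(NoFollowSpider, self).__init__(*args, **kwargs)
        self.moreparams = moreparams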
I run this Scrapy code with the following command:
> scrapy runspider my_spider.py -a moreparams="more parameters" -o output.txt
Now I want the static variable named rules to be configurable from the command line:
> scrapy runspider my_spider.py -a crawl_pages=True -a moreparams="more parameters" -o output.txt
and I change `__init__` to:
    def __init__(self, crawl_pages=False, moreparams=None, *args, **kwargs):
        if crawl_pages is True:
            self.rules = (
                Rule(SgmlLinkExtractor(allow=("",)), callback="parse_items", follow=True),
            )
        self.moreparams = moreparams
However, if I set rules inside `__init__`, Scrapy no longer takes it into account: the spider runs, but it only crawls the given start_urls and not the whole domain. It seems that rules must be a static class variable.
So, how can I dynamically set a static variable?
Recommended answer
Here is how I solved the problem, with great help from @Not_a_Golfer and @nramirezuy. I simply combined a bit of what each of them suggested:
class NoFollowSpider(CrawlSpider):
    def __init__(self, crawl_pages=False, moreparams=None, *args, **kwargs):
        super(NoFollowSpider, self).__init__(*args, **kwargs)
        # Set the class member from here
        if crawl_pages is True:
            NoFollowSpider.rules = (
                Rule(SgmlLinkExtractor(allow=("",)), callback="parse_items", follow=True),
            )
            # Then recompile the rules
            super(NoFollowSpider, self)._compile_rules()
        # Keep going as before
        self.moreparams = moreparams
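Why the recompile step matters can be illustrated without Scrapy. The sketch below uses a hypothetical stand-in base class, `BaseSpider`, which is not Scrapy's code; it only assumes, as the "recompile" comment in the answer suggests, that the base class snapshots `rules` once during `__init__`. The rule value `"follow-all"` is a placeholder string, not a real `Rule`.

```python
class BaseSpider:
    """Hypothetical stand-in for CrawlSpider: snapshots `rules` once, in __init__."""
    rules = ()

    def __init__(self):
        self._compile_rules()

    def _compile_rules(self):
        # Copy whatever `rules` resolves to right now into an internal list.
        self._rules = list(self.rules)

    def active_rules(self):
        return self._rules


class LateAssignment(BaseSpider):
    def __init__(self, crawl_pages=False):
        super().__init__()                # snapshot taken here, still empty
        if crawl_pages:
            self.rules = ("follow-all",)  # too late: the snapshot is unchanged


class Recompiled(BaseSpider):
    def __init__(self, crawl_pages=False):
        super().__init__()
        if crawl_pages:
            self.rules = ("follow-all",)
            self._compile_rules()         # refresh the snapshot, as in the answer


class SetBeforeInit(BaseSpider):
    def __init__(self, crawl_pages=False):
        if crawl_pages:
            self.rules = ("follow-all",)  # in place before the snapshot is taken
        super().__init__()
```

In this model, the third variant sets the attribute on the instance before the base initializer runs, so no explicit recompile is needed and the class itself stays unchanged for other instances. Whether that ordering is safe in a given Scrapy version is worth checking against its `CrawlSpider.__init__`.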
Thanks everyone for your help!
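One caveat when driving this from the command line: values passed with `-a` reach `__init__` as strings, so a check such as `crawl_pages is True` compares a string with the boolean `True` and is false even when `True` was passed on the command line. A minimal sketch of one way to handle this, using a hypothetical helper named `parse_flag` (not part of Scrapy) and an assumed set of accepted spellings:

```python
def parse_flag(value, default=False):
    """Hypothetical helper: interpret a spider argument as a boolean."""
    if value is None:
        return default
    if isinstance(value, bool):
        # Already a boolean, e.g. when the spider is created from Python code.
        return value
    # Command-line arguments arrive as text; accept a few common spellings.
    return str(value).strip().lower() in ("1", "true", "yes", "on")
```

Inside `__init__`, the condition could then read `if parse_flag(crawl_pages):` instead of `if crawl_pages is True:`.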