FormRequest that renders JS content in scrapy shell
Problem description
I'm trying to scrape content from this page with the following form data:
I need County set to Prince George's and DateOfFilingFrom set to 01-01-2000, so I do the following:
% scrapy shell
In [1]: from scrapy.http import FormRequest
In [2]: request = FormRequest(url='https://registers.maryland.gov/RowNetWeb/Estates/frmEstateSearch2.aspx', formdata={'DateOfFilingFrom': '01-01-2000', 'County:': "Prince George's"})
In [3]: response
In [4]:
But it's not working (response is None). In addition, the next page is loaded dynamically, and I need to know how to access each of the links it shows. As far as I know this might be done using Splash; however, I'm not sure how to combine a SplashRequest with a FormRequest and do it all from within scrapy shell for testing purposes. I need to know what I'm doing wrong and how to render the next page (the one that results from the FormRequest).
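The request you're sending is missing a couple of fields, which is probably why you don't get a response back. The fields you fill in also don't correspond to the field names the form expects: the county, for example, is submitted as cboCountyId with a numeric value, not as 'County:'. Note too that constructing a FormRequest does not send it; in scrapy shell you have to pass the request to fetch() before response is populated. A good way to deal with the missing fields is Scrapy's FormRequest.from_response, which pre-populates fields based on the form in the page you have already fetched.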
For this website the following worked for me (using scrapy shell):
>>> url = "https://registers.maryland.gov/RowNetWeb/Estates/frmEstateSearch2.aspx"
>>> fetch(url)
>>> from scrapy import FormRequest
>>> req = FormRequest.from_response(
... response,
... formxpath="//form[@id='form1']", # specify the form on the current page
... formdata={
... 'cboCountyId': '16', # the county you select is converted to a number
... 'DateOfFilingFrom': '01-01-2001',
... 'cboPartyType': 'Decedent',
... 'cmdSearch': 'Search'
... },
... clickdata={'type': 'submit'},
... )
>>> fetch(req)