Getting form "action" from BeautifulSoup result


Question

I'm writing a Python parser for a website to automate some work, but I'm not very familiar with Python's "re" (regex) module and can't get it to work.

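import urllib2
import BeautifulSoup

# tl2 (the URL) and ua (the User-Agent string) must be defined beforehand.
req = urllib2.Request(tl2)
req.add_unredirected_header('User-Agent', ua)
try:
    # urlopen() is the call that raises URLError, so it belongs inside the try.
    response = urllib2.urlopen(req)
    html = response.read()
except urllib2.URLError, e:
    print "Error while reading data. Are you connected to the interwebz?!", e
    raise  # html is undefined past this point, so do not continue

soup = BeautifulSoup.BeautifulSoup(html)
form = soup.find('form', id='form_product_page')
pret = form.prettify()

print pret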

Result:

<form id="form_product_page" name="form_1362737440" action="/download/791055/164084/" method="get">
<input id="nojssubmit" type="submit" value="Download" />
</form>

That code works and is just what I need to start. Now I'm wondering how I should extract the "action" attribute from the "form" tag. That is the only thing I need from the BeautifulSoup response.

I've tried using form = soup.find('form', id='form_product_page').parent.get('action') but the result was None. What I want to extract is, for example, "/download/791055/164084/". This value is different for every URL.


Variables (example):
tl2 = http://example.com
ua = Mozilla Firefox/14.04

Recommended answer

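You can do it in one step. soup.find() already returns the form tag itself, so .parent moves up to the enclosing element, which has no "action" attribute, and .get('action') returns None there. Call .get('action') directly on the result of find():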

action = soup.find('form', id='form_product_page').get('action')
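
To make the difference concrete, here is a minimal sketch, assuming Python 3 with the beautifulsoup4 package installed (imported as bs4) rather than the older BeautifulSoup 3 import used in the question. The wrapping div, the helper name get_form_action, and the use of urljoin to build a full URL are illustrative assumptions, not part of the original question or answer.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Form markup copied from the question, wrapped in a hypothetical <div>
# so that the form has a visible parent element.
HTML = """
<div class="wrapper">
<form id="form_product_page" name="form_1362737440" action="/download/791055/164084/" method="get">
<input id="nojssubmit" type="submit" value="Download" />
</form>
</div>
"""


def get_form_action(html, form_id):
    """Return the action attribute of the form with the given id, or None."""
    soup = BeautifulSoup(html, "html.parser")
    form = soup.find("form", id=form_id)
    if form is None:
        # find() returns None when nothing matches, so guard before .get().
        return None
    return form.get("action")


# Reading the attribute directly from the tag returned by find().
action = get_form_action(HTML, "form_product_page")
print(action)

# Going through .parent lands on the enclosing <div>, which has no action.
soup = BeautifulSoup(HTML, "html.parser")
parent = soup.find("form", id="form_product_page").parent
print(parent.name, parent.get("action"))

# The action is a relative path; urljoin resolves it against the page URL
# (http://example.com is the example value of tl2 from the question).
print(urljoin("http://example.com", action))
```

The guard on None matters if some pages lack the form: without it, calling .get() on a missing match raises AttributeError instead of returning None.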
