Issue with finding the parent of a particular tag in HTML using Python
Question
I am trying to fetch the parent element of a particular tag using the code below:
# -*- coding: cp1252 -*-
import csv
import urllib2
import sys
import time
from bs4 import BeautifulSoup
from itertools import islice
page1= urllib2.urlopen('http://www.sfr.fr/mobile/telephones?vue=000029&tgp=toutes-les-offres&typesmartphone=se-android&typesmartphone=se-apple&typesmartphone=se-bada&typesmartphone=se-rim-blackberry&typesmartphone=se-windows&p=0').read()
soup1 = BeautifulSoup(page1)
price_parent = soup1.findParents('div')
print price_parent
Problem: when I run this code, the output is an empty list, []. If I use findParent instead of findParents, it returns None.
My actual problem is similar to this one: BeautifulSoup - findAll not within certain tag (http://stackoverflow.com/questions/10777250/beautifulsoup-findall-not-within-certain-tag)
To solve my actual problem I need to get the parents of elements, but as mentioned above I only get None.
Please help me solve this issue, and pardon my ignorance as I am new to programming.
Recommended answer
.findParents() does not do what you think it does. It finds the parents of the current element that match the search. You are calling it on the object for the whole page, which is already the top-level element and therefore has no parents.
If you have a structure like this:
<html>
<body>
<div class="foo">
<span id="bar">Some text</span>
</div>
</body>
</html>
where soup is a BeautifulSoup variable for the whole structure, you can find the span with:
spanelement = soup.find('span', id='bar')
and then calling .findParent('div') on spanelement will return a result, namely the <div class="foo"> element.
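Put together, the steps above can be sketched as a small self-contained script. This is only an illustration, written for Python 3 with the bs4 package installed: the html string is the example structure from this answer, "html.parser" is just one possible parser choice, and find_parent is the current spelling of the findParent method used above.

```python
from bs4 import BeautifulSoup

# The example structure from the answer, inlined as a string (an assumption
# made so the sketch runs without fetching a real page).
html = """
<html>
<body>
<div class="foo">
<span id="bar">Some text</span>
</div>
</body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# First locate an element that actually sits inside the tree...
spanelement = soup.find("span", id="bar")

# ...then search upward from it for the nearest enclosing <div>.
parent_div = spanelement.find_parent("div")

print(parent_div.name)      # tag name of the matched parent
print(parent_div["class"])  # class is multi-valued, so bs4 returns a list
```

The search starts from spanelement rather than from soup, which is the only change needed compared with the code in the question.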
So, calling .findParents() on a top-level element will always return an empty result, because there are no parents. Call it on something that does have a parent element instead.
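The difference can be checked side by side. The following is a minimal sketch, again assuming Python 3 and bs4, that reuses the example markup from this answer instead of the page in the question; find_parents and find_parent are the current names for findParents and findParent.

```python
from bs4 import BeautifulSoup

# Same example markup as in the answer, on one line (illustrative only).
html = '<html><body><div class="foo"><span id="bar">Some text</span></div></body></html>'
soup = BeautifulSoup(html, "html.parser")

# Called on the top-level object: there is nothing above it to search.
top_level_parents = soup.find_parents("div")  # empty result
top_level_parent = soup.find_parent("div")    # None

# Called on a nested element: the enclosing <div> is found.
span_parents = soup.find("span", id="bar").find_parents("div")

print(len(top_level_parents), top_level_parent)
print([tag.name for tag in span_parents])
```

This reproduces both symptoms from the question (the empty list and the None value) and shows that they disappear once the search starts from a nested element.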