Issue with finding the parent of a particular tag in HTML using Python

Problem description

I am trying to fetch the parent element of a particular tag using the code below:

# -*- coding: cp1252 -*-
import csv
import urllib2
import sys
import time
from bs4 import BeautifulSoup
from itertools import islice
page1= urllib2.urlopen('http://www.sfr.fr/mobile/telephones?vue=000029&tgp=toutes-les-offres&typesmartphone=se-android&typesmartphone=se-apple&typesmartphone=se-bada&typesmartphone=se-rim-blackberry&typesmartphone=se-windows&p=0').read()
soup1 = BeautifulSoup(page1)
price_parent = soup1.findParents('div')
print price_parent

Problem: running this code prints an empty list, []. If I use findParent instead of findParents, it returns None.

My actual problem is similar to this question: BeautifulSoup - findAll not within certain tag (http://stackoverflow.com/questions/10777250/beautifulsoup-findall-not-within-certain-tag)

To solve my actual problem I need to get the parents of certain elements, but as mentioned above I only get None.

Please help me solve this issue, and pardon my ignorance, as I am new to programming.

Recommended answer

.findParents() does not do what you think it does. It finds the parents of the current element that match the search. You are calling it on soup1, the BeautifulSoup object for the whole page, which is already the top-level element and therefore has no parents.

If you have a structure like this:

<html>
    <body>
        <div class="foo">
            <span id="bar">Some text</span>
        </div>
    </body>
</html>

where soup is a BeautifulSoup variable for the whole structure, you can find the span with:

spanelement = soup.find('span', id='bar')

and then calling .findParent('div') will return a result, namely the <div class="foo"> element.

So, calling .findParents() on a top-level element will always return an empty result, because it has no parents. Call it on an element that does have a parent instead.
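To make the difference concrete, here is a minimal sketch that runs both calls against the small structure shown earlier. It assumes Python 3 with the bs4 package installed and uses the built-in "html.parser" backend; the question itself uses Python 2, so treat this as an illustration, not a drop-in fix. find_parent and find_parents are the bs4 snake_case names for findParent and findParents, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

html = """
<html>
    <body>
        <div class="foo">
            <span id="bar">Some text</span>
        </div>
    </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# Called on the soup object itself: it is the root of the tree,
# so there is nothing above it to match.
top_parents = soup.find_parents("div")   # empty result
top_parent = soup.find_parent("div")     # None

# Called on an element nested inside the tree: the search walks upward.
span = soup.find("span", id="bar")
div = span.find_parent("div")            # the enclosing <div class="foo">
ancestors = [tag.name for tag in span.find_parents()]  # nearest ancestor first

print(top_parents)
print(top_parent)
print(div["class"])
print(ancestors)
```

For the page in the question, the same pattern would apply: first locate the price element with find or find_all, then call find_parent on that element and not on soup1.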
