Beautiful Soup: Accessing <li> elements from <ul> with no id

Views: 18 Published: 2021/12/23 20:57:04 Tags: python html-parsing web-scraping beautifulsoup


Question

Here is the existing code:

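```
# Python 3 version of the original Python 2 (urllib2) snippet
import urllib.request
from bs4 import BeautifulSoup

hdr = {'User-Agent': 'Mozilla/5.0'}
site = "http://en.wikipedia.org/wiki/" + "january" + "_" + "1"
req = urllib.request.Request(site, headers=hdr)
page = urllib.request.urlopen(req)
soup = BeautifulSoup(page, 'html.parser')  # name the parser explicitly

print(soup)
```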

This all works fine and I get the entire HTML page, but I want specific data, and I don't know how to access it with Beautiful Soup without an id to use. The <ul> tag does not have an id, and neither do the <li> tags. Plus, I can't just ask for every <li> tag, because there are other lists on the page. Is there a specific way to select a given list? (I can't just use a fix for this one page, because I plan on iterating through all the dates and getting the birthdays from every page, and I can't guarantee that every page has exactly the same layout as this one.)

Recommended answer

Find the Births section:
```
section = soup.find('span', id='Births').parent
```
And then find the next unordered list:
```
births = section.find_next('ul').find_all('li')
```
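
To see the two steps working together, here is a minimal, self-contained sketch that runs them against a small inline HTML string instead of a live page. The `SAMPLE_HTML` string and the `extract_births` helper are hypothetical names introduced for illustration; the sample assumes the heading is wrapped in a `<span id="Births">`, as the answer does, and real pages may use different markup. The `None` checks are an addition for pages where the section is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down stand-in for a date page; the real markup may differ.
SAMPLE_HTML = """
<h2><span id="Events">Events</span></h2>
<ul><li>An event</li></ul>
<h2><span id="Births">Births</span></h2>
<ul>
  <li>1900 - First person</li>
  <li>1950 - Second person</li>
</ul>
<h2><span id="Deaths">Deaths</span></h2>
<ul><li>2000 - Someone else</li></ul>
"""


def extract_births(html):
    """Return the text of each <li> in the first <ul> after the Births heading."""
    soup = BeautifulSoup(html, "html.parser")
    anchor = soup.find("span", id="Births")
    if anchor is None:
        return []  # no element with id="Births" on this page
    # Step 1: move up to the heading; step 2: take the next <ul> in document order.
    births_list = anchor.parent.find_next("ul")
    if births_list is None:
        return []
    return [li.get_text(strip=True) for li in births_list.find_all("li")]


births = extract_births(SAMPLE_HTML)
print(births)
```

Because the lookup starts from the section heading rather than from a fixed position in the page, the same helper could be called on the HTML of each date page in turn. An empty result is a signal that a page is laid out differently and needs a closer look.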