Beautiful Soup: Accessing <li> elements from <ul> with no id

Views: 72 Published: 2020/9/20 6:20:14 Tags: python html-parsing web-scraping beautifulsoup


Problem description

This is the existing code:

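import urllib.request
from bs4 import BeautifulSoup

# Send a browser-like User-Agent header with the request
hdr = {'User-Agent': 'Mozilla/5.0'}
site = "http://en.wikipedia.org/wiki/" + "january" + "_" + "1"
req = urllib.request.Request(site, headers=hdr)
page = urllib.request.urlopen(req)
# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(page, "html.parser")

print(soup)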

This all works fine and I get the entire HTML page, but I want specific data, and I don't know how to access it with Beautiful Soup without an id to use. The <ul> tag does not have an id and neither do the <li> tags. Plus, I can't just ask for every <li> tag because there are other lists on the page. Is there a specific way to select a given list? (I can't just use a fix for this one page, because I plan on iterating through all the dates and getting the birthdays from every page, and I can't guarantee that every page has exactly the same layout as this one.)
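One possible way to approach this, offered as a sketch and not taken from the original thread: when the list itself has no id, a nearby element that does have one (such as a section heading) can serve as an anchor, and the first <ul> that follows it can be read from there. The example below assumes the page contains a heading with id "Births"; that id, the SAMPLE_HTML snippet, and the helper name list_items_after_heading are all hypothetical stand-ins, and the real Wikipedia markup may differ from page to page.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for a date page. The real markup may
# differ, so treat the ids used here as assumptions.
SAMPLE_HTML = """
<h2 id="Events">Events</h2>
<ul><li>1801 - Some event</li></ul>
<h2 id="Births">Births</h2>
<ul>
  <li>1879 - Person A</li>
  <li>1919 - Person B</li>
</ul>
<h2 id="Deaths">Deaths</h2>
<ul><li>1953 - Person C</li></ul>
"""


def list_items_after_heading(soup, heading_id):
    """Return the text of every <li> in the first <ul> after the element with the given id."""
    anchor = soup.find(id=heading_id)
    if anchor is None:
        # The page has no such heading, so there is nothing to return
        return []
    # find_next walks forward in document order from the anchor
    ul = anchor.find_next("ul")
    if ul is None:
        return []
    return [li.get_text(strip=True) for li in ul.find_all("li")]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
births = list_items_after_heading(soup, "Births")
print(births)
```

Returning an empty list when the heading is missing lets a loop over many date pages continue even if one page is laid out differently. If a section is split into several lists, only the first one after the heading is returned, so that case would need extra handling.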
