Can't scrape all ul tags from a table


Question

I'm trying to scrape all of the proxy IPs from this site: https://proxy-list.org/english/index.php, but I can only get one IP at most. Here is my code:

from helium import *
from bs4 import BeautifulSoup
url = 'https://proxy-list.org/english/index.php'
browser = start_chrome(url, headless=True)
soup = BeautifulSoup(browser.page_source, 'html.parser')
proxies = soup.find_all('div', {'class':'table'})
for ips in proxies:
    print(ips.find('li', {'class':'proxy'}).text)

I tried using ips.find_all instead, but it didn't work.

Recommended answer

from bs4 import BeautifulSoup
import requests

url = 'https://proxy-list.org/english/index.php'

# Fetch the page and parse the response body
pagecontent = requests.get(url)
soup = BeautifulSoup(pagecontent.text, 'html.parser')

# find_all returns every match, so loop over the results at both levels
maintable = soup.find_all('div', {'class': 'table'})
for div_element in maintable:
    rows = div_element.find_all('li', class_='proxy')
    for ip in rows:
        print(ip.text)
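
The difference between the two snippets comes down to find versus find_all: find returns only the first matching tag, while find_all returns a list-like ResultSet of every match, so .text has to be read from each element rather than from the result as a whole. That may be why swapping find for find_all on its own did not work in the question. A minimal sketch of the difference follows; the HTML string and the addresses in it are made up for illustration and are not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption, not copied from the real site).
SAMPLE_HTML = """
<div class="table">
  <ul><li class="proxy">10.0.0.1:8080</li><li class="country">A</li></ul>
  <ul><li class="proxy">10.0.0.2:3128</li><li class="country">B</li></ul>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("div", class_="table")

# find() stops at the first matching tag, so only one proxy comes back.
first_only = table.find("li", class_="proxy").text

# find_all() returns every match; read .text from each element in turn.
all_proxies = [li.text for li in table.find_all("li", class_="proxy")]

print(first_only)
print(all_proxies)
```

Because the outer div appears once, a single find per div yields a single proxy no matter how many ul rows it contains; iterating over find_all at the li level is what collects the rest.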
