How to extract an onClick URL using BeautifulSoup

Views: 441 | Published: 2020/9/20 7:23:45 | Tags: python, html, beautifulsoup

Question

Below is the HTML I need to extract the URL from:

<div class="one_block" style="display:block;" onClick="location.href=\'/games/box.html?&game_type=01&game_id=13&game_date=2020-04-19&pbyear=2020\';" style="cursor:pointer;">
<!-- \xe5\xb0\x8d\xe6\x88\xb0\xe7\x90\x83\xe9\x9a\x8a\xe5\x8f\x8a\xe5\xa0\xb4\xe5\x9c\xb0 start -->
<table width="100%" border="0" cellspacing="0" cellpadding="0" class="schedule_team">
<tr>

How can I get the location.href value?

What I tried:

soup.findAll("div", {"onClick": "location.href"})

This returns nothing (an empty result).

Desired Output:

/games/box.html?&game_type=01&game_id=13&game_date=2020-04-19&pbyear=2020

PS: The page contains many location.href values.

Recommended answer

How about using the .select() method, which uses the SoupSieve package to run a CSS selector, and then reading the onclick attribute from the matched element?

from bs4 import BeautifulSoup

html = '<div class="one_block" style="display:block;" onClick="location.href=\'/games/box.html?&game_type=01&game_id=13&game_date=2020-04-19&pbyear=2020\';" style="cursor:pointer;">' \
        '<!-- \xe5\xb0\x8d\xe6\x88\xb0\xe7\x90\x83\xe9\x9a\x8a\xe5\x8f\x8a\xe5\xa0\xb4\xe5\x9c\xb0 start -->' \
        '<table width="100%" border="0" cellspacing="0" cellpadding="0" class="schedule_team"><tr>'

soup = BeautifulSoup(html, features="lxml")
element = soup.select('div.one_block')[0]
print(element.get('onclick'))

Use split to extract the URL from the attribute value:

print(element.get('onclick').split("'")[1])

/games/box.html?&game_type=01&game_id=13&game_date=2020-04-19&pbyear=2020
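Since the question mentions that the page has many location.href handlers, here is a minimal sketch of collecting all of them at once. It also shows one likely reason the original attempt matched nothing: the HTML parsers lowercase attribute names, so the key has to be onclick, and a plain string in the attribute filter is compared against the whole attribute value rather than treated as a substring. Passing a compiled regular expression makes find_all search inside the value instead. The sample markup, the helper name extract_onclick_urls, and the HREF_PATTERN regex are illustrative assumptions, not part of the original answer.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample page with several clickable blocks (not from the original post).
SAMPLE_HTML = """
<div class="one_block" onClick="location.href='/games/box.html?&game_type=01&game_id=13&game_date=2020-04-19&pbyear=2020';"></div>
<div class="one_block" onClick="location.href='/games/box.html?&game_type=01&game_id=14&game_date=2020-04-19&pbyear=2020';"></div>
<div class="other">no handler here</div>
"""

# Captures whatever sits between the quotes that follow "location.href=".
HREF_PATTERN = re.compile(r"""location\.href\s*=\s*['"]([^'"]+)['"]""")


def extract_onclick_urls(html):
    """Return the location.href target of every div whose onclick sets one."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    # The attribute key is lowercase because the parser lowercases attribute names.
    # A compiled regex is searched within the attribute value; a plain string
    # would have to equal the entire value to match.
    for div in soup.find_all("div", attrs={"onclick": re.compile(r"location\.href")}):
        match = HREF_PATTERN.search(div["onclick"])
        if match:
            urls.append(match.group(1))
    return urls


urls = extract_onclick_urls(SAMPLE_HTML)
for url in urls:
    print(url)
```

A CSS attribute selector such as soup.select('div[onclick*="location.href"]') should select the same elements if you prefer to stay with .select(). The regex is used here instead of split("'") only so that double-quoted URLs are handled as well.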
