Scraping data from the tag names in Python

Views: 122 | Published: 2019/5/27 15:44:57 | Tags: javascript python html selenium beautifulsoup


Question

Hi, I am trying to scrape user data from a website. I need the user IDs (UIDs), which are not in the page text but inside the tags themselves, as part of the id attribute of a div. I am using Python with Selenium and Beautiful Soup.

Example:

<div id="UID_**60CE07D6DF5C02A987ED7B076F4154F3**-SRC_328619641" class="memberOverlayLink" onmouseover="ta.trackEventOnPage('Reviews','show_reviewer_info_window','user_name_photo'); ta.call('ta.overlays.Factory.memberOverlayWOffset', event, this, 's3 dg rgba_gry update2012', 0, (new Element(this)).getElement('.avatar')&amp;&amp;(new Element(this)).getElement('.avatar').getStyle('border-radius')=='100%'?-10:0);">

I want to extract the UID from the div tag. I have looked through the documentation and several web pages, but I can't find a solution. If anyone can tell me whether this is possible, I would be very grateful.

Recommended answer

Assuming the id attribute value is always in the format UID_ followed by one or more alphanumeric characters, followed by -SRC_, followed by one or more digits:

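import re
from bs4 import BeautifulSoup

# html must hold the page source as a string
# (with Selenium, assuming a WebDriver named driver: html = driver.page_source)
soup = BeautifulSoup(html, "html.parser")

# Group 1 captures the UID between "UID_" and "-SRC_"
pattern = re.compile(r"UID_(\w+)-SRC_\d+")
div_id = soup.find("div", id=pattern)["id"]  # first div whose id matches

uid = pattern.match(div_id).group(1)
print(uid)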

Here we use BeautifulSoup to search for a div whose id attribute value matches a specific regular expression. The expression contains a capturing group, (\w+), which extracts the UID value.
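A page of reviews will usually contain more than one such div, so the same idea can be extended with find_all. The following is only a minimal sketch under stated assumptions: the function name extract_uids, the sample markup, and the second id value are hypothetical, and the asterisks around the UID in the question's example are assumed to be emphasis markers, so they are left out here.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup: two reviewer divs plus one unrelated div.
html = """
<div id="UID_60CE07D6DF5C02A987ED7B076F4154F3-SRC_328619641" class="memberOverlayLink"></div>
<div id="UID_0123456789ABCDEF0123456789ABCDEF-SRC_328619642" class="memberOverlayLink"></div>
<div id="footer"></div>
"""

# Same pattern as in the answer; group 1 is the UID.
UID_PATTERN = re.compile(r"UID_(\w+)-SRC_\d+")


def extract_uids(markup):
    """Return the UID part of every div whose id matches UID_PATTERN."""
    soup = BeautifulSoup(markup, "html.parser")
    uids = []
    # A compiled regex passed as the id filter keeps only matching divs.
    for div in soup.find_all("div", id=UID_PATTERN):
        match = UID_PATTERN.match(div["id"])
        if match:  # skip ids where the pattern does not start at position 0
            uids.append(match.group(1))
    return uids


print(extract_uids(html))
```

Returning an empty list when nothing matches avoids the TypeError that indexing the result of find would raise if no div were found.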
