beautifulsoup and selenium: clicking on an svg path to get to the next page and get data from that page


Question


I'm working on a project where a website has a table filled with data, and the table is 7 pages long. It is the table on this website: https://nonfungible.com/market/history . You get to the next page through an svg path, and I have to get data from all 7 pages. I don't know how to click on this svg path; the svg doesn't have an aria-label or a class to locate it by. Please let me know if you know how to click on it.

(A screenshot of the page's source code was attached here; image omitted.)

I have tried many different things including:

    driver.find_element_by_xpath('//div[@id="icon-chevron-right"]/*[name()="svg"]/*[name()="path"]').click()

This is the error that I am getting:

    raise exception_class(message, screen, stacktrace)
    selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//div[@id="icon-chevron-right"]/*[name()="svg"]/*[name()="path"]"}
    (Session info: chrome=92.0.4515.107)

Thank you for your help.

Solution

This is a slightly different approach than driving the GUI - but have a look at the below. Doing it this way exposes much more data than the front end shows. (If you do want to click through the pages with Selenium, there is a sketch of that at the end of this answer.)

It looks like the /market/history page comes down with the JSON data already embedded (it's not a separate call I can identify in dev tools). However - if you:

  1. Get the page with the python requests library
  2. Parse the html and find the json data object in the script tag with @id="__NEXT_DATA__"
  3. Get the right part of the json which has the table data
  4. Filter the object to get rid of a few bits and pieces (keep only the entries where name is not None)

from lxml import html
import requests
import json

url = "https://nonfungible.com/market/history"

#get the page and parse
response = requests.get(url)
page = html.fromstring(response.content)

#get the data and convert to json
datastring = page.xpath('//script[@id="__NEXT_DATA__"]/text()')
data = json.loads(datastring[0])
#print(json.dumps(data, indent=4)) #this prints everything

#Get the relevant part of the json (it has lots of other cr*p in there - it was an effort to find this)
tabledata = data['props']['pageProps']['currentTotals']
# this filters out some of the unneeded data
AllItems = list(filter(lambda x: x['name'] is not None, tabledata))

#print out each item - which relates to a row in the table
for item in AllItems:
    print(item['name'])
    print(item['totals']['alltime']['usd'])
    print(json.dumps(item, indent=4))
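
Since the question mentions BeautifulSoup: the same __NEXT_DATA__ lookup works there too. A minimal sketch of an equivalent to the lxml parsing above, assuming bs4 is installed:

from bs4 import BeautifulSoup
import requests
import json

url = 'https://nonfungible.com/market/history'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# Next.js pages embed their state in a <script id="__NEXT_DATA__"> tag
script = soup.find('script', id='__NEXT_DATA__')
data = json.loads(script.string)
tabledata = data['props']['pageProps']['currentTotals']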

What you need to do from here is extract what you want from the json (there is a short CSV sketch after the sample output below).

I've started you off... The first 2 prints in the loop output this:

meebits

157251919.08

Which match the items on the website (comparison screenshot omitted).

The last print outputs everything that item has. This will let you see the structure and help you get your data out. It looks like this:

{
    "name": "meebits",
    "totals": {
        "alltime": {
            "count": 15622,
            "traders": 5023,        
            "usd": 157251919.08,    
            "average": 10066.06,    
            "transfer_count": 29826,
            "transfer_unique_assets": 19981,
            "asset_unique_owners": 4812,
            "asset_usd": 95541331.68,
            "asset_average": 10566.39
        },
        "oneday": {
            "count": 0,
            "traders": 0,
            "usd": 0,
            "average": 0,
            "transfer_count": 0,
            "transfer_unique_assets": 0,
            "asset_unique_owners": 0,
            "asset_usd": 0,
            "asset_average": 0
        },
        "twodayago": {
            "count": 0,
            "traders": 0,
            "usd": 0,
            "average": 0
        },
        "sevenday": {
            "count": 144,
            "traders": 165,
            "usd": 703913.21,
            "average": 4888.29,
            "transfer_count": 265,
            "transfer_unique_assets": 204,
            "asset_unique_owners": 125,
            "asset_usd": 611620.92,
            "asset_average": 5412.57
        },
        "thirtyday": {
            "count": 1663,
            "traders": 1167,
            "usd": 12662841.8,
            "average": 7614.46,
            "transfer_count": 2551,
            "transfer_unique_assets": 1704,
            "asset_unique_owners": 781,
            "asset_usd": 9908945.2,
            "asset_average": 9107.49
        }
    }
}
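
From there, pulling rows out into a CSV is just a matter of walking those keys. A minimal sketch reusing AllItems from the code above - the chosen columns are only an example and the filename is made up:

import csv

# flatten a few of the keys shown above into one row per collection
# .get() guards against windows that are missing for some collections
with open('market_history.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['name', 'alltime_usd', 'alltime_traders', 'sevenday_usd'])
    for item in AllItems:
        totals = item['totals']
        writer.writerow([
            item['name'],
            totals.get('alltime', {}).get('usd', 0),
            totals.get('alltime', {}).get('traders', 0),
            totals.get('sevenday', {}).get('usd', 0),
        ])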

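For completeness - if you did want to page through the table in the GUI: the <path> itself usually isn't clickable, and on a React page the element may not exist yet when Selenium first looks for it, which would explain the NoSuchElementException. The usual fix is to wait for the chevron's container and click that instead of the path. A minimal sketch, assuming the id sits on the div as in the question's locator:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get('https://nonfungible.com/market/history')

# wait for the chevron's container to be clickable, then click it
# rather than the <path> inside the svg
next_button = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.XPATH, '//div[@id="icon-chevron-right"]'))
)
next_button.click()

# if a plain .click() gets intercepted by an overlay, a JS click often works:
# driver.execute_script('arguments[0].click();', next_button)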
