Pandas and Matplotlib - Need vaccination percentage by country and bar plot for preferred vaccine in a specific country using a dropdown

Problem description

This is the dataset.

      location  date        vaccine             total_vaccinations
0     Austria   2021-01-08  Johnson&Johnson     0
1     Austria   2021-01-08  Moderna             0
2     Austria   2021-01-08  Oxford/AstraZeneca  0
3     Austria   2021-01-08  Pfizer/BioNTech     30938
4     Austria   2021-01-15  Johnson&Johnson     0
...   ...       ...         ...                 ...
8633  Uruguay   2021-07-05  Pfizer/BioNTech     1024793
8634  Uruguay   2021-07-05  Sinovac             3045997
8635  Uruguay   2021-07-06  Oxford/AstraZeneca  43245
8636  Uruguay   2021-07-06  Pfizer/BioNTech     1038942
8637  Uruguay   2021-07-06  Sinovac             3079853
8638 rows × 4 columns

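I am working in a Jupyter notebook.

  1. I need the vaccination percentage by country
  2. I need a bar plot of the preferred vaccine in a specific country, selected with a dropdown (interactive plotting widget)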

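Recommended answer

  • You can get COVID data that includes population figures from OWID
  • This appears to be where you sourced the data by manufacturer
  • The manufacturer data can be merged with the overall COVID data, so every recorded attribute is available
  • Plotly is used, so hiding/showing traces is interactive
  • NB: not many countries publish data by manufacturer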
import requests, io
import pandas as pd

# get data by manufacturer
dfm = pd.read_csv(io.StringIO(
    requests.get("https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/vaccinations-by-manufacturer.csv").text))

# get all COVID data
dfall = pd.read_csv(io.StringIO(
    requests.get("https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv").text))

# join the two datasets and pivot the manufacturer data into one column per vaccine.
# NB not all countries publish this data...
dfv = (
    dfall.set_index(["location", "date"])
    .join(
        dfm.set_index(["location", "date", "vaccine"])
        .unstack("vaccine")
        .droplevel(0, axis=1),  # drop the "total_vaccinations" column level
        how="inner",
    )
    .reset_index()
)

# filter to latest data only
dfplot = (
    dfv.sort_values(["iso_code", "date"])
    .groupby("iso_code", as_index=False)
    .last()
    .sort_values("people_fully_vaccinated_per_hundred", ascending=False)
)

import plotly.express as px
import plotly.graph_objects as go

# use plotly so it's interactive. rebase doses given per vaccine by population
fig = px.bar(
    dfplot.assign(
        **{c: dfplot[c] / dfplot["population"] for c in dfm["vaccine"].unique()}
    ),
    x="location",
    y=dfm["vaccine"].unique(),
)
# add a line of people fully vaccinated
fig.add_trace(
    go.Scatter(
        x=dfplot["location"],
        y=dfplot["people_fully_vaccinated_per_hundred"] / 100,
        name="Fully vaccinated",
        mode="lines",
        line={"color": "purple", "width": 4},
    )
)
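
The reshape-and-join step above can be hard to follow on the full OWID files, so here is a minimal sketch on toy data. The frames `dfm_toy` and `dfall_toy`, their values, and the country names "A" and "B" are made up for illustration; only the column names mirror the datasets used in the answer.

```python
import pandas as pd

# Toy long-format manufacturer data (values are invented for illustration).
dfm_toy = pd.DataFrame(
    {
        "location": ["A", "A", "B", "B"],
        "date": ["2021-07-01"] * 4,
        "vaccine": ["Moderna", "Pfizer/BioNTech", "Moderna", "Pfizer/BioNTech"],
        "total_vaccinations": [100, 300, 50, 150],
    }
)

# Toy country-level data with a population column (also invented).
dfall_toy = pd.DataFrame(
    {
        "location": ["A", "B"],
        "date": ["2021-07-01"] * 2,
        "population": [1000, 400],
    }
)

# Long -> wide: one column per vaccine, then drop the "total_vaccinations" level.
wide = (
    dfm_toy.set_index(["location", "date", "vaccine"])
    .unstack("vaccine")
    .droplevel(0, axis=1)
)

# Inner join keeps only (location, date) pairs present in both frames.
joined = (
    dfall_toy.set_index(["location", "date"]).join(wide, how="inner").reset_index()
)

# Rebase each vaccine column to doses per head of population.
vaccines = list(dfm_toy["vaccine"].unique())
per_capita = joined.assign(**{v: joined[v] / joined["population"] for v in vaccines})
print(per_capita)
```

Note that the result is doses administered per person, not the share of people vaccinated, since one person can receive more than one dose.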

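Update

  • The original requirement stated that the percentage of people vaccinated was required. This has been dropped following the comments
  • The requirement has effectively been restated as an interactive dashboard, so Dash is used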
from jupyter_dash import JupyterDash
import dash_core_components as dcc
import dash_html_components as html
import dash_table
import dash_bootstrap_components as dbc
from dash.dependencies import Input, Output, State
import requests, io
import pandas as pd
import plotly.express as px

# get data by manufacturer
dfm = pd.read_csv(io.StringIO(
    requests.get("https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/vaccinations-by-manufacturer.csv").text))


def buildTab(col="location"):
    dfc = pd.DataFrame({col: dfm[col].unique()})
    return dash_table.DataTable(
        id=col,
        columns=[{"name": c, "id": c} for c in dfc.columns],
        data=dfc.to_dict("records"),
        row_selectable="multi",
        style_header={"fontWeight": "bold"},
        style_as_list_view=True,
        css=[{"selector": ".dash-spreadsheet tr", "rule": "height: 5px;"}],
    )

# Build App
app = JupyterDash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])
app.layout = html.Div(
    [
        dbc.Row(
            [
                dbc.Col(
                    buildTab(col="location"),
                    width=3,
                    style={"height": "20vh", "overflow-y": "auto"},
                ),
                dbc.Col(
                    buildTab(col="vaccine"),
                    width=3,
                    style={"height": "20vh", "overflow-y": "auto"},
                ),
            ],
        ),
        html.Div(id="graphs"),
    ],
    style={
        "font-family": "Arial",
        "font-size": "0.9em",
    },
)

@app.callback(
    Output(component_id="graphs", component_property="children"),
    Input("location", "selected_rows"),
    Input("vaccine", "selected_rows"),
    State("location", "data"),
    State("vaccine", "data"),
)
def updateGraphs(selected_location, selected_vaccine, location, vaccine):
    global dfm
    if selected_location and selected_vaccine:
        d = dfm.merge(
            pd.DataFrame(location).iloc[selected_location], on="location", how="inner"
        ).merge(pd.DataFrame(vaccine).iloc[selected_vaccine], on="vaccine", how="inner")
        return dcc.Graph(
            figure=px.bar(
                d.sort_values(["location", "vaccine", "date"])
                .groupby(["location", "vaccine"], as_index=False)
                .last(),
                x="location",
                y="total_vaccinations",
                color="vaccine",
            )
        )
    else:
        return None

# Run app and display result inline in the notebook
app.run_server(mode="inline")
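
The data preparation inside the callback can be checked without running Dash. The sketch below is not the answer's code: `latest_totals` is a hypothetical helper name, and it filters with `isin` rather than the two `merge` calls used above, which should select the same rows. Apart from the first Pfizer/BioNTech figure, which is taken from the question's dataset, the toy values are invented.

```python
import pandas as pd


def latest_totals(df, locations, vaccines):
    """Return the most recent row per (location, vaccine) for the selection."""
    selected = df[df["location"].isin(locations) & df["vaccine"].isin(vaccines)]
    return (
        selected.sort_values(["location", "vaccine", "date"])
        .groupby(["location", "vaccine"], as_index=False)
        .last()
    )


# Toy data in the same shape as the manufacturer dataset.
toy = pd.DataFrame(
    {
        "location": ["Austria", "Austria", "Austria", "Austria", "Uruguay"],
        "date": ["2021-01-08", "2021-01-15", "2021-01-08", "2021-01-15", "2021-07-06"],
        "vaccine": ["Pfizer/BioNTech", "Pfizer/BioNTech", "Moderna", "Moderna", "Sinovac"],
        "total_vaccinations": [30938, 40000, 0, 10, 3079853],
    }
)

result = latest_totals(toy, ["Austria"], ["Moderna", "Pfizer/BioNTech"])
print(result)
```

The returned frame has one row per selected country and vaccine, which is the shape `px.bar` expects with `x="location"`, `y="total_vaccinations"` and `color="vaccine"`.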
