xlabel and ylabel values are not sorted in matplotlib scatterplot

Views: 66 Published: 2021/6/1 19:16:39 Tags: python matplotlib scatter-plot


Problem description

I have done tedious amounts of searching on the internet and it seems that I have not been able to figure out how to ask the right question to get the answer for what I want to do.

I am trying to create a scatterplot with P/E ratio on the y-axis and Dividend Yield on the x-axis. I put the data into a CSV file and then imported each column into Python as individual lists.

My scatterplot turns out as shown below. I am confused about why the values on the x- and y-axes are not sorted numerically. I think I have to convert the elements of each list to floats and then sort them in some way before creating the scatterplot.

The other option I can think of is being able to sort the values in the process of creating the scatterplot.

Neither of these has worked out and I have reached a dead end. Any help or a pointer in the right direction would be much appreciated, as I can describe my problem but don't seem to be able to ask the right questions in my search.

import csv
import matplotlib.pyplot as plt

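# lists that collect one CSV column each
symbol, index, dividend, pe = [], [], [], []

with open('xlv_xlu_combined_td.csv', 'r', newline='') as f:
    etf_data = csv.reader(f)
    # csv.reader yields plain lists and has no iterrows(); iterate over it directly
    for row in etf_data:
        symbol.append(row[0])
        index.append(row[1])
        dividend.append(row[2])
        pe.append(row[3])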

symbol.pop(0)
index.pop(0)
dividend.pop(0)
pe.pop(0)

indexes = [i.split('%', 1)[0] for i in index]
dividend_yield = [d.split('%', 1)[0] for d in dividend]
pe_ratio = [p.split('X', 1)[0] for p in pe]

x = dividend_yield[:5]
y = pe_ratio[:5]

plt.scatter(x, y, label='Healthcare P/E & Dividend', alpha=0.5)
plt.xlabel('Dividend yield')
plt.ylabel('Pe ratio')
plt.legend()
plt.show()

xlv_xlu_combined_td.csv

symbol,index,dividend,pe
JNJ,10.11%,2.81%,263.00X
UNH,7.27%,1.40%,21.93X
PFE,6.48%,3.62%,10.19X
MRK,4.96%,3.06%,104.92X
ABBV,4.43%,4.01%,23.86X
AMGN,3.86%,2.72%,60.93X
MDT,3.50%,2.27%,38.10X
ABT,3.26%,1.78%,231.74X
GILD,2.95%,2.93%,28.69X
BMY,2.72%,2.81%,97.81X
TMO,2.55%,0.32%,36.98X
LLY,2.49%,2.53%,81.83X

Solution

The issue is that the values are of string type, so matplotlib treats them as categories and places them along the axes in the order given in the list, not in numeric order.
The symbols must be removed from the end of each value, and the values must then be converted to a numeric type.
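
As a minimal sketch of that conversion, the helper below strips a trailing `%` or `X` and returns a float. The name `to_number` is hypothetical (it is not part of the original code), and the sketch assumes the only non-numeric characters are those two suffixes.

```python
def to_number(text):
    """Strip a trailing '%' or 'X' (assumed suffixes) and convert to float."""
    return float(text.strip().rstrip('%X'))


# A few raw values in the same form as the CSV columns
raw_pe = ['263.00X', '21.93X', '10.19X']
raw_dividend = ['2.81%', '1.40%', '3.62%']

pe_values = [to_number(p) for p in raw_pe]
dividend_values = [to_number(d) for d in raw_dividend]

# Strings compare character by character, floats compare by magnitude
print(sorted(raw_pe))     # lexicographic order of the strings
print(sorted(pe_values))  # numeric order of the floats
```

Once the lists hold floats, matplotlib places the points on a continuous numeric axis instead of on categorical tick positions.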

Add-on to existing code using `csv` module

Given the existing code, it would be easy to map() the values in the lists to a float type.

indexes = [i.split('%', 1)[0] for i in index]
dividend_yield = [d.split('%', 1)[0] for d in dividend]
pe_ratio = [p.split('X', 1)[0] for p in pe]

# add mapping values to floats after removing the symbols from the values
indexes = list(map(float, indexes))
dividend_yield = list(map(float, dividend_yield))
pe_ratio = list(map(float, pe_ratio))

# plot
x = dividend_yield[:5]
y = pe_ratio[:5]

plt.scatter(x, y, label='Healthcare P/E & Dividend', alpha=0.5)
plt.xlabel('Dividend yield')
plt.ylabel('Pe ratio')
plt.legend(bbox_to_anchor=(1, 1), loc='upper left')
plt.show()
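
For comparison, here is a hedged sketch of the same idea that reads and converts the values in one pass with `csv.DictReader`. It is an alternative, not the code from the answer: `io.StringIO` stands in for the real file so the example runs on its own, and the sample rows are the first five data rows of `xlv_xlu_combined_td.csv`.

```python
import csv
import io

# Stand-in for open('xlv_xlu_combined_td.csv'); rows copied from the CSV above
sample = io.StringIO(
    "symbol,index,dividend,pe\n"
    "JNJ,10.11%,2.81%,263.00X\n"
    "UNH,7.27%,1.40%,21.93X\n"
    "PFE,6.48%,3.62%,10.19X\n"
    "MRK,4.96%,3.06%,104.92X\n"
    "ABBV,4.43%,4.01%,23.86X\n"
)

dividend_yield = []
pe_ratio = []

# DictReader consumes the header row as keys, so no pop(0) is needed
for row in csv.DictReader(sample):
    dividend_yield.append(float(row['dividend'].rstrip('%')))
    pe_ratio.append(float(row['pe'].rstrip('X')))

x = dividend_yield[:5]
y = pe_ratio[:5]
print(x)
print(y)
```

The resulting `x` and `y` lists can be passed to `plt.scatter(x, y, ...)` exactly as in the plotting code above.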

Using `pandas`

Remove the symbol from the end of the strings in the columns with col.str[:-1]
Convert the columns to float type with .astype(float)
Using pandas v1.2.4 and matplotlib v3.3.4
This option reduces the required code from 23 lines to 4 lines.

import pandas as pd

# read the file
df = pd.read_csv('xlv_xlu_combined_td.csv')

# remove the symbols from the end of the number and set the columns to float type
# assign by column label so the columns are replaced and end up with float dtype
cols = df.columns[1:]
df[cols] = df[cols].apply(lambda col: col.str[:-1]).astype(float)

# plot the first five rows of the two columns
ax = df.iloc[:5, 2:].plot(x='dividend', y='pe', kind='scatter', alpha=0.5,
                          xlabel='Dividend yield', ylabel='Pe ratio',
                          label='Healthcare P/E & Dividend')
ax.legend(bbox_to_anchor=(1, 1), loc='upper left')

Plot output of both implementations

Note the numbers are now ordered correctly.
