Converting string to date in numpy unpack
Problem description
I'm learning how to extract data from links and then graph it.
For this tutorial, I'm using the Yahoo dataset of a stock.
The code is as follows:
import matplotlib.pyplot as plt
import numpy as np
import urllib.request  # urlopen lives in the urllib.request submodule
import matplotlib.dates as mdates
import datetime

def bytespdate2num(fmt, encoding='utf-8'):
    # Wrap the string-based date converter so it also accepts bytes.
    strconverter = mdates.strpdate2num(fmt)
    def bytesconverter(b):
        s = b.decode(encoding)
        return strconverter(s)
    return bytesconverter

def graph_data(stock):
    stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
    source_code = urllib.request.urlopen(stock_price_url).read().decode()
    stock_data = []
    split_source = source_code.split('\n')
    print(len(split_source))
    # Keep only the lines that have the expected seven comma-separated fields.
    for line in split_source:
        split_line = line.split(',')
        if len(split_line) == 7:
            stock_data.append(line)
    date, openn, closep, highp, lowp, openp, volume = np.loadtxt(
        stock_data, delimiter=',', unpack=True,
        converters={0: bytespdate2num('%Y-%m-%d')})
    plt.plot_date(date, closep)
    plt.xlabel('x')
    plt.ylabel('y')
    plt.title('Graph')
    plt.show()

graph_data('TSLA')
The whole code is pretty easy to understand except the part that converts the string data into a date format using the bytespdate2num function.
Is there an easier way to convert the strings read from a URL into dates during the numpy extraction, or is there another method I can use?
Thanks
Recommended answer
With a guess as to the csv format, I can use numpy's 'native' datetime dtype:
In [183]: txt = ['2020-10-23 1 2.3']*3
In [184]: txt
Out[184]: ['2020-10-23 1 2.3', '2020-10-23 1 2.3', '2020-10-23 1 2.3']
If I let genfromtxt do its own dtype conversions:
In [187]: np.genfromtxt(txt, dtype=None, encoding=None)
Out[187]:
array([('2020-10-23', 1, 2.3), ('2020-10-23', 1, 2.3),
('2020-10-23', 1, 2.3)],
dtype=[('f0', '<U10'), ('f1', '<i8'), ('f2', '<f8')])
The date column is rendered as a string.
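If the dates have already been read as strings like this, they can still be converted afterwards. The following is a minimal sketch, assuming the same sample rows and the default field name f0 shown in the output above; it is one possible follow-up step, not part of the original answer.

```python
import numpy as np

# Same whitespace-delimited sample rows as in the session above.
txt = ['2020-10-23 1 2.3'] * 3

# dtype=None lets genfromtxt infer the types; the date column comes back as text.
data = np.genfromtxt(txt, dtype=None, encoding=None)

# Convert the text field to numpy's day-resolution datetime type.
dates = data['f0'].astype('datetime64[D]')
print(dates.dtype)
```

This keeps the parsing step simple and moves the date handling into a single astype call on one column.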
If I specify a datetime64 format:
In [188]: np.array('2020-10-23', dtype='datetime64[D]')
Out[188]: array('2020-10-23', dtype='datetime64[D]')
In [189]: np.genfromtxt(txt, dtype=['datetime64[D]',int,float], encoding=None)
Out[189]:
array([('2020-10-23', 1, 2.3), ('2020-10-23', 1, 2.3),
('2020-10-23', 1, 2.3)],
dtype=[('f0', '<M8[D]'), ('f1', '<i8'), ('f2', '<f8')])
This date appears to work in plt:
In [190]: plt.plot_date(_['f0'], _['f1'])
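The field lookups in that call can also be used to split the structured array into one array per column, which is roughly what unpack=True does in the question. Here is a minimal sketch, assuming a structured array with the same dtype as the output above. The array is built by hand as a stand-in for the genfromtxt result, and the names date, count and value are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for the structured array returned by genfromtxt above.
arr = np.array(
    [('2020-10-23', 1, 2.3)] * 3,
    dtype=[('f0', 'datetime64[D]'), ('f1', 'i8'), ('f2', 'f8')],
)

# Pull each field out as its own 1-D array, in column order.
date, count, value = (arr[name] for name in arr.dtype.names)
print(date.dtype, count.dtype, value.dtype)
```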
I used genfromtxt because I'm more familiar with its ability to handle dtypes.
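The same idea might carry over to the comma-separated, seven-column data in the question. The sketch below is only an illustration under stated assumptions: the sample rows are invented, the column order is a guess at the feed's layout, and the field names are hypothetical. It has not been checked against the real URL.

```python
import numpy as np

# Invented sample rows; the real feed's header and column order may differ.
lines = [
    'Date,Open,High,Low,Close,Adj Close,Volume',
    '2020-10-21,10.0,11.0,9.5,10.5,10.5,1000',
    '2020-10-22,10.5,11.5,10.0,11.0,11.0,1200',
    '2020-10-23,11.0,12.0,10.5,11.5,11.5,900',
]

# Illustrative field names; only the date field uses the datetime64 dtype.
columns = ['date', 'open', 'high', 'low', 'close', 'adj_close', 'volume']
dtype = [(name, 'datetime64[D]' if name == 'date' else 'f8') for name in columns]

# skip_header=1 drops the header row; every data row becomes one record.
stock = np.genfromtxt(lines, delimiter=',', skip_header=1,
                      dtype=dtype, encoding=None)

date, closep = stock['date'], stock['close']
print(date, closep)
```

With this approach no custom converter function is needed. The resulting date and closep arrays can be passed to plt.plot(date, closep), since recent Matplotlib versions accept datetime64 values directly.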