Converting string to date in numpy unpack


Question

I'm learning how to extract data from links and then graph it.

For this tutorial, I am using the Yahoo dataset for a stock.

The code is as follows:


import matplotlib.pyplot as plt
import numpy as np
import urllib.request  # the request submodule must be imported explicitly
import matplotlib.dates as mdates
import datetime

def bytespdate2num(fmt, encoding='utf-8'):
    strconverter = mdates.strpdate2num(fmt)
    def bytesconverter(b):
        s = b.decode(encoding)
        return strconverter(s)
    return bytesconverter


def graph_data(stock):
    stock_price_url = 'https://pythonprogramming.net/yahoo_finance_replacement'
    source_code = urllib.request.urlopen(stock_price_url).read().decode()

    stock_data = []
    split_source=source_code.split('\n')

    print(len(split_source))

    for line in split_source:
        split_line=line.split(',')
        if (len(split_line)==7):
            stock_data.append(line)


    date,openn,closep,highp,lowp,openp,volume=np.loadtxt(stock_data,delimiter=',',unpack=True,converters={0:bytespdate2num('%Y-%m-%d')})

    plt.plot_date(date,closep)
    plt.xlabel('x')
    plt.ylabel('y')
    plt.title('Graph')
    plt.show()

graph_data('TSLA')

The whole code is pretty easy to understand except the part that converts the string data into date format using the bytespdate2num function.

Is there an easier way to convert the strings read from a URL into date format during the numpy extraction, or is there another method I can use?

Thanks.

Recommended answer

Guessing at the csv format, I can use the numpy 'native' datetime dtype. A sample input:

In [183]: txt = ['2020-10-23 1 2.3']*3                                                                               
In [184]: txt                                                                                                        
Out[184]: ['2020-10-23 1 2.3', '2020-10-23 1 2.3', '2020-10-23 1 2.3']

If I let genfromtxt do its own dtype conversions:

In [187]: np.genfromtxt(txt, dtype=None, encoding=None)                                                              
Out[187]: 
array([('2020-10-23', 1, 2.3), ('2020-10-23', 1, 2.3),
       ('2020-10-23', 1, 2.3)],
      dtype=[('f0', '<U10'), ('f1', '<i8'), ('f2', '<f8')])

The date column is rendered as a string.
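A string column like this can also be converted after loading. The following is a minimal sketch, assuming ISO-formatted dates (YYYY-MM-DD) and made-up sample rows; the names `raw` and `dates` are only for illustration:

```python
import numpy as np

# Made-up sample rows in the same whitespace-delimited layout as above
txt = ['2020-10-21 1 2.3', '2020-10-22 2 2.4', '2020-10-23 3 2.5']

# Let genfromtxt infer the dtypes; the first field comes back as a string
raw = np.genfromtxt(txt, dtype=None, encoding=None)

# Convert the string field to numpy's native day-resolution datetime type
dates = raw['f0'].astype('datetime64[D]')

print(dates.dtype)
print(dates[1] - dates[0])
```

Once the column is datetime64, date arithmetic such as differences between rows works directly on the array.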

If I specify the datetime64 format:

In [188]: np.array('2020-10-23', dtype='datetime64[D]')                                                              
Out[188]: array('2020-10-23', dtype='datetime64[D]')

In [189]: np.genfromtxt(txt, dtype=['datetime64[D]',int,float], encoding=None)                                       
Out[189]: 
array([('2020-10-23', 1, 2.3), ('2020-10-23', 1, 2.3),
       ('2020-10-23', 1, 2.3)],
      dtype=[('f0', '<M8[D]'), ('f1', '<i8'), ('f2', '<f8')])
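The same idea can be sketched for comma-delimited rows with seven columns, as in the question. The sample rows and the field names (`date`, `col1` through `col6`) are assumptions for illustration; the real column order of the dataset is not given here:

```python
import numpy as np

# Made-up rows: a date followed by six numeric columns (layout is assumed)
rows = [
    '2020-10-21,10.0,10.5,9.8,10.2,10.2,1000',
    '2020-10-22,10.2,10.9,10.1,10.7,10.7,1200',
    '2020-10-23,10.7,11.0,10.4,10.6,10.6,900',
]

# Hypothetical field names; only the first field is a date
dtype = [('date', 'datetime64[D]')] + [(f'col{i}', float) for i in range(1, 7)]

data = np.genfromtxt(rows, delimiter=',', dtype=dtype, encoding=None)

print(data['date'])
print(data['col1'])
```

The result is a structured array, so columns are accessed by field name rather than unpacked positionally.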

This date appears to work in plt:

In [190]: plt.plot_date(_['f0'], _['f1'])       
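As a self-contained variant of that call, the sketch below plots a datetime64 array with the ordinary `plot` method, which also accepts datetime64 values in reasonably recent matplotlib versions. The sample values and the use of the non-interactive Agg backend are assumptions made so that it runs without a display:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample data: three consecutive days and one value per day
dates = np.array(['2020-10-21', '2020-10-22', '2020-10-23'], dtype='datetime64[D]')
values = np.array([1.0, 2.0, 3.0])

fig, ax = plt.subplots()
# Markers only, similar in spirit to what plot_date draws
(line,) = ax.plot(dates, values, marker='o', linestyle='none')
fig.autofmt_xdate()  # tilt the date tick labels so they do not overlap
```

In an interactive session, `plt.show()` would display the figure.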

I used genfromtxt because I'm more familiar with its ability to handle dtypes.
