What is the fastest way to prepare data for RNN with numpy?

Views: 105 | Published: 2020/5/4 6:19:57 | Tags: python, performance, numpy, machine-learning, lstm

Problem description

I currently have a NumPy array of shape (1631160, 78) as the input to a neural network. I would like to try an LSTM, which requires a 3D structure as input data. I'm currently using the following code to generate that 3D structure, but it is extremely slow (ETA > 1 day). Is there a better way to do this with numpy?

My current code to generate the data:

import datetime
import sys
import time

import numpy as np


def transform_for_rnn(input_x, input_y, window_size):
    output_x = None
    start_t = time.time()
    # Stop at len(input_x) - window_size so every window is full-length and
    # the number of windows matches len(input_y[window_size:]).
    n_windows = len(input_x) - window_size
    for i in range(n_windows):
        if i > 100 and i % 100 == 0:
            eta = datetime.timedelta(seconds=(time.time() - start_t) / i * (n_windows - i))
            sys.stdout.write('\rTransform Data: %d/%d\tETA:%s' % (i, n_windows, eta))
            sys.stdout.flush()
        window = np.array([input_x[i:i + window_size, :]])
        if output_x is None:
            output_x = window
        else:
            # Slow step: concatenate copies the whole accumulated array each time.
            output_x = np.concatenate((output_x, window))

    print()
    output_y = input_y[window_size:]
    assert len(output_x) == len(output_y)
    return output_x, output_y

Recommended answer

Here is a sample run, where out holds the sliding windows of input_x -

In [83]: input_x
Out[83]: 
array([[ 0.73089384,  0.98555845,  0.59818726],
       [ 0.08763718,  0.30853945,  0.77390923],
       [ 0.88835985,  0.90506367,  0.06204614],
       [ 0.21791334,  0.77523643,  0.47313278],
       [ 0.93324799,  0.61507976,  0.40587073],
       [ 0.49462016,  0.00400835,  0.66401908]])

In [84]: window_size = 4

In [85]: out
Out[85]: 
array([[[ 0.73089384,  0.98555845,  0.59818726],
        [ 0.08763718,  0.30853945,  0.77390923],
        [ 0.88835985,  0.90506367,  0.06204614],
        [ 0.21791334,  0.77523643,  0.47313278]],

       [[ 0.08763718,  0.30853945,  0.77390923],
        [ 0.88835985,  0.90506367,  0.06204614],
        [ 0.21791334,  0.77523643,  0.47313278],
        [ 0.93324799,  0.61507976,  0.40587073]],

       [[ 0.88835985,  0.90506367,  0.06204614],
        [ 0.21791334,  0.77523643,  0.47313278],
        [ 0.93324799,  0.61507976,  0.40587073],
        [ 0.49462016,  0.00400835,  0.66401908]]])
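
The sample run does not show how `out` was built. As a minimal sketch, assuming it is a strided view over `input_x`, it could be created with `np.lib.stride_tricks.as_strided`. The helper name `sliding_windows` is hypothetical and is not taken from the answer:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def sliding_windows(input_x, window_size):
    """Return a (nrows, window_size, n_features) view of a 2D array.

    Hypothetical helper: window i covers rows i .. i + window_size - 1.
    No data is copied; the result aliases input_x.
    """
    n_samples, n_features = input_x.shape
    if not 1 <= window_size <= n_samples:
        raise ValueError("window_size must be between 1 and len(input_x)")
    row_stride, col_stride = input_x.strides
    nrows = n_samples - window_size + 1  # number of full-length windows
    return as_strided(
        input_x,
        shape=(nrows, window_size, n_features),
        # Moving to the next window and moving to the next row inside a
        # window both advance by one row of the original array.
        strides=(row_stride, row_stride, col_stride),
    )


input_x = np.arange(18, dtype=float).reshape(6, 3)
out = sliding_windows(input_x, window_size=4)
print(out.shape)  # (3, 4, 3)
```

`as_strided` does no bounds checking, so a wrong `shape` or `strides` can read memory outside the array; that is why the sketch validates `window_size` before building the view.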

This creates a view into the input array, so no data is copied and the approach is memory-efficient. In most cases, this should also translate into performance benefits for further operations on it. Let's verify that it is indeed a view -

In [86]: np.may_share_memory(out,input_x)
Out[86]: True   # Doesn't guarantee, but is sufficient in most cases
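
Since `np.may_share_memory` only checks whether the memory bounds overlap, a stricter check is available through `np.shares_memory`, which tests for actual shared elements. A small sketch, assuming `out` is a strided view built with `as_strided` (the construction here is an assumption, used only to make the example self-contained):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

input_x = np.random.rand(6, 3)
window_size = 4
row_stride, col_stride = input_x.strides
nrows = input_x.shape[0] - window_size + 1

# Assumed construction of the windowed view (not shown in the sample run).
out = as_strided(
    input_x,
    shape=(nrows, window_size, input_x.shape[1]),
    strides=(row_stride, row_stride, col_stride),
)

view_shares = np.shares_memory(out, input_x)         # exact overlap test
copy_shares = np.shares_memory(out.copy(), input_x)  # a copy owns its data
print(view_shares, copy_shares)  # True False
```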

Another sure-shot way to verify is to set some values in the output and check the input -

In [87]: out[0] = 0

In [88]: input_x
Out[88]: 
array([[ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.93324799,  0.61507976,  0.40587073],
       [ 0.49462016,  0.00400835,  0.66401908]])
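
The experiment above also shows a hazard: writing through the view silently changes `input_x`, and overlapping windows share the same rows. As a hedged sketch, assuming NumPy 1.20 or later, `sliding_window_view` returns a read-only view by default and can stand in for the loop in the question. The pairing of window `i` with label `input_y[i + window_size]` is an assumption read off the question's `assert`, not something the answer states:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def transform_for_rnn(input_x, input_y, window_size):
    # Shape (N - w + 1, n_features, w): the window axis is appended last.
    windows = sliding_window_view(input_x, window_size, axis=0)
    # Reorder to (N - w + 1, w, n_features), the layout used in the sample run.
    windows = windows.transpose(0, 2, 1)
    # Assumption: window i is paired with the label that follows it,
    # so the last window has no label and is dropped.
    output_x = windows[:-1]
    output_y = input_y[window_size:]
    assert len(output_x) == len(output_y)
    return output_x, output_y


input_x = np.arange(18, dtype=float).reshape(6, 3)
input_y = np.arange(6)
output_x, output_y = transform_for_rnn(input_x, input_y, window_size=4)
print(output_x.shape, output_y)  # (2, 4, 3) [4 5]
```

Because the result is read-only, an assignment such as `output_x[0] = 0` raises a `ValueError` instead of overwriting the input. If a framework needs a writable, contiguous array, calling `.copy()` on a batch of windows materializes only that batch.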
