List of NumPy Arrays to a Single NumPy Array Without Copying Data


Question

I'm reading in video data using Python OpenCV and want to store K frames. Currently, I have a loop that does the following (pseudocode):

frame_list = range(K)  # frame indices 0 .. K-1
frame_buffer = list(map(ReadFrameNumber, frame_list))

I now have a list, frame_buffer, of K frames, each of which is an NxMx3 numpy array. That works, but now I want to restructure the data so I can use scikit-learn to try out some models. To do this, I need a numpy array shaped either as ((N*M*3) x K) or as (K x (N*M*3)). I can build it, but the data is being copied, which makes the function VERY slow. My slow method uses a combination of numpy.ravel, numpy.asarray, and numpy.transpose. I essentially just want a new view of the data.

Here is what I am doing now, and it is NOT working (it takes way too long):

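def RearrangeData(data):
    # Flatten each NxMx3 frame to 1-D (ravel returns a view when it can)
    b = list(map(np.ravel, data))
    # Stacking the list into one array and casting to float32 copies every frame
    b = np.asarray(b, dtype=np.float32)
    return b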
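
To see where the copy comes from, here is a minimal sketch that uses small synthetic frames in place of real video (the sizes and the fill values are assumptions for illustration only). It suggests that the ravel step is cheap, while np.asarray has to gather K separately allocated frames into one new block of memory, so the result cannot be a view of the list:

```python
import numpy as np

# Stand-ins for K decoded frames (hypothetical sizes; real frames come from cv2)
K, N, M = 4, 6, 8
frames = [np.full((N, M, 3), i, dtype=np.uint8) for i in range(K)]

# ravel on a contiguous frame returns a view, so this step does not copy
flat = [np.ravel(f) for f in frames]
ravel_is_view = all(np.shares_memory(a, f) for a, f in zip(flat, frames))

# asarray must gather K separate allocations into one new contiguous block
stacked = np.asarray(flat, dtype=np.float32)
stack_is_view = any(np.shares_memory(stacked, f) for f in frames)

print(ravel_is_view, stack_is_view, stacked.shape)
```

A list of independently allocated arrays has no single underlying buffer, so at least one copy per frame is unavoidable at this stage.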

UPDATE: This is how I am reading frames from OpenCV:

import numpy as np
import cv2

K = 10
num_frames = K
cap = cv2.VideoCapture(filename)  # filename: path to the video, defined elsewhere

def PopulateBuffer(num):
    # Seek to frame `num`, then decode it
    cap.set(cv2.CAP_PROP_POS_FRAMES, num)
    ret, frame = cap.read()
    if not ret:
        print("Fatal Error: Could not read/decode frame %d" % num)
        exit(-1)
    return frame

frame_nums = list(range(num_frames))
frame_buffer = list(map(PopulateBuffer, frame_nums))

Recommended Answer

So I believe I figured it out.

  • First mistake: using a list to copy in frames. I ended up preallocating a numpy array:

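 # `frame` is a frame that was already read; only its shape is used here
 # Preallocate frame buffer (np.zeros defaults to float64 unless dtype is given)
 frame_buffer = np.zeros((num_frames,) + frame.shape)
 # Copy frames
 for i in range(num_frames):
     frame_buffer[i, :, :, :] = PopulateBuffer(i)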
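
As a rough illustration of the preallocation idea, the sketch below fills a buffer frame by frame. The read_frame helper is a hypothetical stand-in for PopulateBuffer that returns synthetic uint8 frames, and the frame size is made up. It also matches the buffer dtype to the frame dtype, which is one possible choice and not necessarily what the original code needs:

```python
import numpy as np

def read_frame(i, shape=(6, 8, 3)):
    # Hypothetical stand-in for PopulateBuffer: a uint8 frame filled with i
    return np.full(shape, i, dtype=np.uint8)

num_frames = 5
first = read_frame(0)

# np.empty skips zero-filling; matching the frame dtype avoids a
# uint8 -> float64 conversion on every assignment
frame_buffer = np.empty((num_frames,) + first.shape, dtype=first.dtype)
frame_buffer[0] = first
for i in range(1, num_frames):
    frame_buffer[i] = read_frame(i)  # one copy per frame, straight into the buffer

print(frame_buffer.shape, frame_buffer.dtype, frame_buffer.nbytes)
```

If the downstream model expects floating-point input, passing dtype=np.float32 at allocation time does the conversion during the same per-frame copy instead of in a separate pass.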

  • Second mistake: I didn't realize that numpy.reshape() returns a new view whenever the array's memory layout allows it (it only copies when it has to). So once my initial array was set up correctly, it was as simple as doing the following.

    buf_s = frame_buffer.shape
    K = buf_s[0] # Number of frames
    M = buf_s[1] # Number of rows
    N = buf_s[2] # Number of cols
    chan = buf_s[3] # Color channel
    
    # If this copies data, I'm screwed.
    # (%time is an IPython/Jupyter magic; remove it in a plain Python script)
    %time scikit_buffer = frame_buffer.reshape([K, M*N*chan])
    

  • I'm positive it's not copying data because the reshape command runs on the order of microseconds:

    CPU times: user 17 µs, sys: 1 µs, total: 18 µs
    Wall time: 21.9 µs

    And now I can analyze my frames in scikit-learn! Cool!
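
Timing is indirect evidence; a more direct check is np.shares_memory. The sketch below uses a small synthetic buffer (the sizes are assumptions) to suggest that the reshape of a contiguous buffer and its transpose are both views, and it includes a counter-example where reshape has to copy because the array is no longer contiguous:

```python
import numpy as np

K, M, N, chan = 5, 6, 8, 3
frame_buffer = np.zeros((K, M, N, chan), dtype=np.float32)

# K x (M*N*chan): a view, because the buffer is C-contiguous
samples = frame_buffer.reshape(K, M * N * chan)
# (M*N*chan) x K: transposing a view is still a view
features_first = samples.T

# Counter-example: reshaping a transposed (non-contiguous) array forces a copy
copied = frame_buffer.transpose(1, 2, 3, 0).reshape(K, M * N * chan)

print(np.shares_memory(samples, frame_buffer))
print(np.shares_memory(features_first, frame_buffer))
print(np.shares_memory(copied, frame_buffer))
```

Note that some scikit-learn estimators may still make their own internal copy if they require a particular dtype or memory order, so a view on the input does not guarantee a copy-free pipeline.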
