Error: memory limit exceeded when 17%-21% is used, according to psutil


Problem description

I am deploying a Cloud Function that does some intensive computation, using the following requirements:

requirements.txt 

google-cloud-storage
google-cloud-datastore 
numpy==1.16.2
pandas==0.24.2
scikit-image==0.16.1
psutil
memory-profiler==0.55.0
scikit-learn==0.20.3
opencv-python==4.0.0.21

I have set the following arguments for deployment:

[--memory: "2147483648", --runtime: "python37", --timeout: "540", --trigger-http: "True", --verbosity: "debug"]

As the function iterates over frames, memory usage increases, but when it reaches 18%-21%, the function stops with:

"Error: memory limit exceeded. Function invocation was interrupted."

Using psutil to trace the code, at the beginning of the invocation I get this output (from the function's logs):

"svmem(总数= 2147483648,可用= 1882365952,百分比= 12.3,已使用= 152969216,免费= 1882365952,有效= 221151232,无效= 43954176,缓冲区= 0,缓存= 112148480,共享= 24240128,平板= 0)"

"svmem(total=2147483648, available=1882365952, percent=12.3, used=152969216, free=1882365952, active=221151232, inactive=43954176, buffers=0, cached=112148480, shared=24240128, slab=0)"

As far as I understand, this should mean that only 12.3% is in use at the beginning. That makes sense: the code package itself (containing some binaries) plus the raw video chunks together use about 100 MB, and I assume that all the installs from the requirements above may use an extra 160 MB.

After about 15 iterations, this is psutil's trace:

svmem(total=2147483648, available=1684045824, percent=21.6, used=351272960, free=1684045824, active=419463168, inactive=43962368, buffers=0, cached=112164864, shared=24240128, slab=0)

Then the function is aborted.
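As a sanity check on those traces: psutil derives the `percent` field from `total` and `available` (not from `used`), rounded to one decimal place. Plugging in the numbers copied from the two logs above reproduces the reported 12.3% and 21.6%:

```python
# Reproduce psutil's `percent` field from the svmem traces above.
# psutil computes it as (total - available) / total * 100, rounded to one decimal.
def percent_used(total, available):
    return round((total - available) / total * 100, 1)

# First trace: reported percent=12.3
start = percent_used(total=2147483648, available=1882365952)

# Trace after ~15 iterations: reported percent=21.6
later = percent_used(total=2147483648, available=1684045824)

print(start, later)
```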

This is the function where the code stops:

    def capture_to_array(self, capture):
        """
        Function to convert OpenCV video capture to a list of
        numpy arrays for faster processing and analysis
        """

        # List of numpy arrays
        frame_list = []
        frame_list_hd = []
        i = 0
        pixels = 0
        # Iterate through each frame in the video
        while capture.isOpened():

            # Read the frame from the capture
            ret_frame, frame = capture.read()

            # If read successful, then append the retrieved numpy array to a python list
            if ret_frame:
                i += 1
                # Count the number of pixels; frame.shape is (height, width, channels)
                height = frame.shape[0]
                width = frame.shape[1]
                pixels += height * width

                # Add the frame to the list if it belong to the random sampling list
                if i in self.random_sampler:
                    # Change color space to have only luminance
                    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)[:, :, 2]
                    # Resize the frame 
                    # After the luminance slice, frame.shape is (height, width)
                    if frame.shape[1] != 1920:
                        frame_hd = cv2.resize(frame, (1920, 1080), interpolation=cv2.INTER_LINEAR)
                    else:
                        frame_hd = frame

                    frame_list_hd.append(frame_hd)

                    frame = cv2.resize(frame, (480, 270), interpolation=cv2.INTER_LINEAR)
                    frame_list.append(frame)
                    print('Frame size: {}, HD frame size: {}'.format(sys.getsizeof(frame), sys.getsizeof(frame_hd)), i)
                    print('Frame list size: {}, HD size: {}'.format(sys.getsizeof(frame_list), sys.getsizeof(frame_list_hd)), i)
                    print(psutil.virtual_memory())
            # Break the loop when frames cannot be taken from original
            else:
                break

        # Clean up memory
        capture.release()

        return np.array(frame_list), np.array(frame_list_hd), pixels, height, width
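For scale, the arrays this function accumulates are easy to size by hand: after the HSV conversion only the luminance plane is kept (uint8, one byte per pixel), so each stored array costs roughly height × width bytes. Note also that `sys.getsizeof`, as used in the prints above, can underreport a NumPy view's real footprint, since a sliced view keeps the whole parent buffer alive. A rough estimate:

```python
# Back-of-the-envelope memory cost of the frame lists above.
# Each stored frame is a single uint8 luminance plane: height * width bytes.
hd_frame_bytes = 1080 * 1920          # one HD frame, ~2.07 MB
sd_frame_bytes = 270 * 480            # one downscaled frame, ~0.13 MB

# With roughly 15 sampled frames in each list:
frames = 15
total_mb = frames * (hd_frame_bytes + sd_frame_bytes) / 1024 ** 2

print(hd_frame_bytes, sd_frame_bytes, round(total_mb, 1))
```

Thirty-odd MB for the lists themselves is modest on a 2 GB instance, which suggests the pressure comes from somewhere else, as the answer below confirms.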

Answer

OK, got it solved. After this function, the frame lists it creates are consumed by the following function:

    def compute(self, frame_list, frame_list_hd, path, dimensions, pixels):
        """
        Function to compare lists of numpy arrays extracting their corresponding metrics.
        It basically takes the global original list of frames and the input frame_list
        of numpy arrrays to extract the metrics defined in the constructor.
        frame_pos establishes the index of the frames to be compared.
        It is optimized by means of the ThreadPoolExecutor of Python's concurrent package
        for better parallel performance.
        """

        # Dictionary of metrics
        rendition_metrics = {}
        # Position of the frame
        frame_pos = 0
        # List of frames to be processed
        frames_to_process = []

        # Iterate frame by frame and fill a list with their values
        # to be passed to the ThreadPoolExecutor. Stop when maximum
        # number of frames has been reached.

        frames_to_process = range(len(frame_list)-1)
        print('computing')
        # Execute computations in parallel using as many processors as possible
        # future_list is a dictionary storing all computed values from each thread
        with ThreadPoolExecutor(max_workers=3) as executor:
            # Compare the original asset against its renditions
            future_list = {executor.submit(self.compare_renditions_instant,
                                           i,
                                           frame_list,
                                           frame_list_hd,
                                           dimensions,
                                           pixels,
                                           path): i for i in frames_to_process}

        # Once all frames in frame_list have been iterated, we can retrieve their values
        for future in future_list:
            # Values are retrieved in a dict, as a result of the executor's process
            result_rendition_metrics, frame_pos = future.result()
            # The computed values at a given frame

            rendition_metrics[frame_pos] = result_rendition_metrics

        # Return the metrics for the currently processed rendition
        return rendition_metrics

The problem was that `ThreadPoolExecutor()` was originally called with no arguments, so it used the default number of workers (5 times the number of available CPUs, which here is 2). That pushed more frames into memory at once than the instance could hold, saturating the system. And since each thread was printing its own psutil data, I was being misled by my own traces. Capping `max_workers` (here at 3) solved it.
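A minimal sketch of the fix, assuming Python 3.7 semantics (in 3.5-3.7, a bare `ThreadPoolExecutor()` defaults to `5 * os.cpu_count()` workers; `work` below is a trivial stand-in for `compare_renditions_instant`):

```python
import os
from concurrent.futures import ThreadPoolExecutor

# With no argument, Python 3.7's ThreadPoolExecutor spawns 5 * os.cpu_count()
# workers -- 10 threads on the 2-CPU Cloud Functions instance described above.
default_workers = 5 * (os.cpu_count() or 1)

# Stand-in for compare_renditions_instant: any per-frame computation.
def work(i):
    return i * i

# Capping max_workers bounds how many frames are being processed
# (and therefore held in memory) concurrently.
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(work, range(8)))

print(default_workers, results)
```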

