Error: memory limit exceeded when 17%-21% is used, according to psutil
Problem description
I am deploying a Cloud Function with some intensive computing, using the following requirements:
requirements.txt
google-cloud-storage
google-cloud-datastore
numpy==1.16.2
pandas==0.24.2
scikit-image==0.16.1
psutil
memory-profiler==0.55.0
scikit-learn==0.20.3
opencv-python==4.0.0.21
I have set the following arguments for deployment:
[--memory: "2147483648", --runtime: "python37", --timeout: "540", --trigger-http: "True", --verbosity: "debug"]
As the function iterates processing frames, the usage increases, but when reaching 18% - 21%, it stops with a:
"Error: memory limit exceeded. Function invocation was interrupted."
Using psutil to trace the code, at the start of the invoked function I get this output (from the function's logs):
"svmem(total=2147483648, available=1882365952, percent=12.3, used=152969216, free=1882365952, active=221151232, inactive=43954176, buffers=0, cached=112148480, shared=24240128, slab=0)"
As far as I understand, this should mean that only 12.3% is in use at the beginning. That makes sense: the code package itself (containing some binaries) plus the raw video chunks together use about 100 MB, and I assume the installs from the requirements above may add another 160 MB.
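As a rough cross-check on those numbers, here is a back-of-the-envelope sketch of what each sampled frame retained by the capture loop costs. The resolutions are the ones used later in capture_to_array; the arithmetic is illustrative, not a measured value:

```python
import numpy as np

# Illustrative estimate (assumption: uint8 luminance frames at the
# resolutions used in capture_to_array, not measured values).
hd_frame_bytes = 1920 * 1080 * np.dtype(np.uint8).itemsize   # one HD frame
small_frame_bytes = 480 * 270 * np.dtype(np.uint8).itemsize  # one downscaled frame
per_sample_mb = (hd_frame_bytes + small_frame_bytes) / 1024 ** 2
print('~{:.2f} MB retained per sampled frame'.format(per_sample_mb))
```

At roughly 2 MB per sampled frame, the lists alone cannot explain hitting the 2 GiB limit after 15 iterations, which is a first hint that something besides the lists is consuming memory.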
After about 15 iterations, this is the trace of psutil:
svmem(total=2147483648, available=1684045824, percent=21.6, used=351272960, free=1684045824, active=419463168, inactive=43962368, buffers=0, cached=112164864, shared=24240128, slab=0)
Then the function is aborted. This is the function where the code stops:
# Method excerpt; module-level imports shown for context
import sys

import cv2
import numpy as np
import psutil

def capture_to_array(self, capture):
    """
    Function to convert an OpenCV video capture into lists of
    numpy arrays for faster processing and analysis
    """
    # Lists of numpy arrays
    frame_list = []
    frame_list_hd = []
    i = 0
    pixels = 0
    # Iterate through each frame in the video
    while capture.isOpened():
        # Read the next frame from the capture
        ret_frame, frame = capture.read()
        # If the read succeeded, append the retrieved numpy array to a Python list
        if ret_frame:
            i += 1
            # Count the number of pixels; frame.shape is (height, width, channels)
            height = frame.shape[0]
            width = frame.shape[1]
            pixels += height * width
            # Add the frame to the lists if it belongs to the random sampling list
            if i in self.random_sampler:
                # Change color space to keep only the luminance channel
                frame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)[:, :, 2]
                # Resize the frame; cv2.resize takes dsize as (width, height)
                if frame.shape[1] != 1920:
                    frame_hd = cv2.resize(frame, (1920, 1080), interpolation=cv2.INTER_LINEAR)
                else:
                    frame_hd = frame
                frame_list_hd.append(frame_hd)
                frame = cv2.resize(frame, (480, 270), interpolation=cv2.INTER_LINEAR)
                frame_list.append(frame)
                print('Frame size: {}, HD frame size: {}'.format(sys.getsizeof(frame), sys.getsizeof(frame_hd)), i)
                print('Frame list size: {}, HD size: {}'.format(sys.getsizeof(frame_list), sys.getsizeof(frame_list_hd)), i)
                print(psutil.virtual_memory())
        # Break the loop when frames can no longer be read from the original
        else:
            break
    # Clean up
    capture.release()
    return np.array(frame_list), np.array(frame_list_hd), pixels, height, width
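One caveat about the traces printed above: sys.getsizeof on a Python list measures only the list object itself (header plus pointer slots), not the numpy arrays it references, so the per-list prints understate the real footprint. A minimal sketch, with shapes mirroring the HD frames above (ndarray.nbytes reports the pixel-buffer size):

```python
import sys

import numpy as np

# A list of ten HD luminance frames, like the ones the sampling loop keeps.
frames = [np.zeros((1080, 1920), dtype=np.uint8) for _ in range(10)]

list_size = sys.getsizeof(frames)          # list header + pointers only
data_size = sum(f.nbytes for f in frames)  # actual pixel data: ~20 MB

print(list_size, data_size)
```

The list reports a few hundred bytes while the arrays it holds occupy roughly 20 MB, which is why the psutil traces and the getsizeof prints tell such different stories.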
Answer
OK, got it solved. After this function, the frame lists it creates are passed to the following function:
# Method excerpt; module-level import shown for context
from concurrent.futures import ThreadPoolExecutor

def compute(self, frame_list, frame_list_hd, path, dimensions, pixels):
    """
    Function to compare lists of numpy arrays, extracting their corresponding metrics.
    It basically takes the global original list of frames and the input frame_list
    of numpy arrays to extract the metrics defined in the constructor.
    frame_pos establishes the index of the frames to be compared.
    It is parallelized by means of the ThreadPoolExecutor of Python's
    concurrent package for better performance.
    """
    # Dictionary of metrics
    rendition_metrics = {}
    # Position of the frame
    frame_pos = 0
    # Indices of the frames to be passed to the ThreadPoolExecutor.
    # Stop when the maximum number of frames has been reached.
    frames_to_process = range(len(frame_list) - 1)
    print('computing')
    # Execute computations in parallel with a bounded worker pool.
    # future_list is a dictionary storing all computed values from each thread.
    with ThreadPoolExecutor(max_workers=3) as executor:
        # Compare the original asset against its renditions
        future_list = {executor.submit(self.compare_renditions_instant,
                                       i,
                                       frame_list,
                                       frame_list_hd,
                                       dimensions,
                                       pixels,
                                       path): i for i in frames_to_process}
        # Once all frames in frame_list have been submitted, retrieve their values
        for future in future_list:
            # Values are retrieved as a dict, as a result of the executor's process
            result_rendition_metrics, frame_pos = future.result()
            # The computed values at a given frame
            rendition_metrics[frame_pos] = result_rendition_metrics
    # Return the metrics for the currently processed rendition
    return rendition_metrics
The problem was that ThreadPoolExecutor() was originally called with no arguments, so it used the default number of workers (on Python 3.7, five times the number of available CPUs; with 2 CPUs that is 10 workers). That kept too many frames in memory at once, saturating the system. And since each thread printed its own psutil data, I was being misled by my own traces.
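A minimal sketch of the difference, assuming Python 3.7 semantics (where the argument-less default is os.cpu_count() * 5 workers) and the max_workers=3 cap used in the fixed code:

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Python 3.7's default: os.cpu_count() * 5 workers -- 10 on a 2-vCPU
# Cloud Function, each potentially holding frames in flight at once.
print('default workers here:', (os.cpu_count() or 1) * 5)

# Capping the pool bounds how many tasks (and their frames) run at once.
results = []
with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(lambda n=n: n * n) for n in range(6)]
    for future in futures:
        results.append(future.result())
print(results)  # [0, 1, 4, 9, 16, 25]
```

At most three tasks are ever in flight, so peak memory scales with the cap rather than with the default worker count.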