OpenCV / Python: multi-threading for live facial recognition

Question

I'm using OpenCV and dlib to run facial recognition with landmarks, live from a webcam stream. The language is Python. It works fine on my MacBook laptop, but I need it to run 24/7 on a desktop computer. That machine is a PC with an Intel® Core™2 Quad CPU Q6600 @ 2.40GHz, running 32-bit Debian Jessie. The drop in performance is drastic: processing adds a 10-second delay!

I therefore looked into multi-threading to gain performance:

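  1. I first tried the sample code from OpenCV, and the result was great! All four cores hit 100%, and performance was much better.
  2. I then replaced the frame-processing code with my own, and it did not improve performance at all! Only one core hits 100%, while the others stay very low. I even think it is worse with multi-threading on.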

I took the facial landmark code from the dlib sample code. I know it can probably be optimized, but I want to understand why I can't use my (old) computer's full power with multi-threading.

I'll drop my code below. Thanks a lot for reading :)

from __future__ import print_function

import numpy as np
import cv2
import dlib

from multiprocessing.pool import ThreadPool
from collections import deque

from common import clock, draw_str, StatValue
import video

class DummyTask:
    def __init__(self, data):
        self.data = data
    def ready(self):
        return True
    def get(self):
        return self.data

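if __name__ == '__main__':
    import sys

    # Use the first command-line argument as the video source,
    # falling back to the default webcam (device 0).
    try:
        fn = sys.argv[1]
    except IndexError:
        fn = 0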
    cap = video.create_capture(fn)
    
    #Face detector
    detector = dlib.get_frontal_face_detector()

    #Landmarks shape predictor 
    predictor = dlib.shape_predictor("landmarks/shape_predictor_68_face_landmarks.dat")

    # This is where the facial detection takes place
    def process_frame(frame, t0, detector, predictor):
        # some intensive computation...
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
        clahe_image = clahe.apply(gray)
        detections = detector(clahe_image, 1)
        for d in detections:
            shape = predictor(clahe_image, d)
            for i in range(68): # The 68 landmark points are indexed 0..67
                cv2.circle(frame, (shape.part(i).x, shape.part(i).y), 1, (0,0,255), thickness=2)
        return frame, t0

    threadn = cv2.getNumberOfCPUs()
    pool = ThreadPool(processes = threadn)
    pending = deque()

    threaded_mode = True

    latency = StatValue()
    frame_interval = StatValue()
    last_frame_time = clock()
    while True:
        while len(pending) > 0 and pending[0].ready():
            res, t0 = pending.popleft().get()
            latency.update(clock() - t0)
            draw_str(res, (20, 20), "threaded      :  " + str(threaded_mode))
            draw_str(res, (20, 40), "latency        :  %.1f ms" % (latency.value*1000))
            draw_str(res, (20, 60), "frame interval :  %.1f ms" % (frame_interval.value*1000))
            cv2.imshow('threaded video', res)
        if len(pending) < threadn:
            ret, frame = cap.read()
            if not ret:
                break  # no frame available (camera disconnected or end of file)
            t = clock()
            frame_interval.update(t - last_frame_time)
            last_frame_time = t
            if threaded_mode:
                task = pool.apply_async(process_frame, (frame.copy(), t, detector, predictor))
            else:
                task = DummyTask(process_frame(frame, t, detector, predictor))
            pending.append(task)
        ch = cv2.waitKey(1)
        if ch == ord(' '):
            threaded_mode = not threaded_mode
        if ch == 27:
            break
    cv2.destroyAllWindows()
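
One possible reason why only a single core is busy in the threaded version above, offered as an assumption rather than something confirmed in this thread: ThreadPool runs its workers as threads inside one CPython process, so any call that holds the Global Interpreter Lock executes one at a time. Whether the dlib detector call holds the lock depends on the dlib version and build. The sketch below uses a hypothetical pure-Python stand-in (busy_sum) instead of real frame processing and times the same jobs on a thread pool and on a process pool. It assumes Python 3.

```python
import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def busy_sum(n):
    """Hypothetical stand-in for frame processing: pure-Python work that holds the GIL."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_map(pool_cls, workers, jobs):
    """Run busy_sum over `jobs` on the given pool type; return (results, seconds)."""
    start = time.perf_counter()
    with pool_cls(processes=workers) as pool:
        results = pool.map(busy_sum, jobs)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [1000000] * 4  # four equally sized CPU-bound jobs
    for name, pool_cls in (("threads", ThreadPool), ("processes", Pool)):
        _, seconds = timed_map(pool_cls, 4, jobs)
        print("%-9s: %.2f s" % (name, seconds))
```

If the process pool finishes clearly faster on a multi-core machine, the work is bound by the interpreter lock. Applying the same idea to the script above would mean creating the detector and predictor inside each worker process, since frames and results are pickled between processes.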

Recommended answer

The performance issue was due to a bad build of dlib. Do not use pip install dlib: for some reason the result runs far slower than a properly compiled build. Switching builds took me from almost 10 seconds of lag to about 2 seconds. So in the end I didn't need multi-threading/processing, but I'm still working on it to improve the speed even more. Thanks for the help :)
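
To compare two dlib builds with numbers rather than by feel, the per-call time of the detector can be measured directly. The following is a minimal sketch using only the standard library; time_calls and summarize are hypothetical helper names, and the workload shown is a stand-in rather than the real detector call. It assumes Python 3.

```python
import statistics
import time


def time_calls(fn, repeats=5):
    """Call fn() `repeats` times and return the per-call durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations (seconds) to a small report in milliseconds."""
    return {
        "runs": len(durations),
        "median_ms": statistics.median(durations) * 1000,
        "worst_ms": max(durations) * 1000,
    }


if __name__ == "__main__":
    # Stand-in workload. In the script from the question this would be
    # something like: lambda: detector(clahe_image, 1)
    workload = lambda: sum(i * i for i in range(200000))
    print(summarize(time_calls(workload)))
```

Running the same measurement on one saved frame under each dlib installation gives a like-for-like comparison that does not depend on the camera or on display overhead.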
