HOGDescriptor with videos to recognize objects

Problem description

Unfortunately I am a beginner in both Python and OpenCV, so please excuse me if the question is stupid.

I am trying to use a cv2.HOGDescriptor to recognize objects in a video. I am only concerned with frame-by-frame recognition (i.e. no tracking).

Here is what I am doing:

  1. I read the video (currently a .mpg) by using

capture = cv.CreateFileCapture(video_path) #some path in which I have my video
#capturing frames
frame = cv.QueryFrame(capture) #returns cv2.cv.iplimage

  • In order to ultimately use the detector on the frames (which I would do by using

    found, w = hog.detectMultiScale(frame, winStride, padding, scale)
    

    ) I figured that I need to convert frame from cv2.cv.iplimage to numpy.ndarray, which I did by

    tmp = cv.CreateImage(cv.GetSize(frame),8,3)
    cv.CvtColor(frame,tmp,cv.CV_BGR2RGB)
    
    ararr = np.asarray(cv.GetMat(tmp))
    

  • Now I get the following error:

        found, w = hog.detectMultiScale(ararr, winStride, padding, scale)
     TypeError: a float is required
    

    where

        winStride=(8,8)
        padding=(32,32)
        scale=1.05
    

    I really can't understand which element is the real problem here, i.e. which number should be the float?

    Any help is appreciated.

Answer

There is no need to perform that extra conversion yourself; that problem comes from mixing the new and old OpenCV Python bindings. The other problem, regarding hog.detectMultiScale, is simply due to incorrect parameter ordering.

The second problem can be seen directly by checking help(cv2.HOGDescriptor().detectMultiScale):

    detectMultiScale(img[, hitThreshold[, winStride[, padding[, 
               scale[, finalThreshold[, useMeanshiftGrouping]]]]]])
    

As you can see, every parameter is optional but the first (the image), and the ordering matters: you are effectively passing winStride where hitThreshold (a float) is expected, which is exactly what triggers "TypeError: a float is required", padding where winStride is expected, and so on. You can use named arguments to pass them, as in the sketch below. (All this has already been observed in the earlier answer.)
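
For example, a corrected call with named arguments might look like this (a minimal sketch; frame is assumed to be a frame already read with the cv2 API, as in the sample further below):

    found, weights = hog.detectMultiScale(frame,
                                          winStride=(8, 8),
                                          padding=(32, 32),
                                          scale=1.05)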

The other problem is the code mix; here is a sample code that you should consider using:

    import sys
    import cv2

    # Set up the HOG descriptor with the default people detector
    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
    hogParams = {'winStride': (8, 8), 'padding': (32, 32), 'scale': 1.05}

    # The video path is passed as the first command-line argument
    video = cv2.VideoCapture(sys.argv[1])
    while True:
        ret, frame = video.read()  # frame is already a numpy.ndarray
        if not ret:
            break

        result = hog.detectMultiScale(frame, **hogParams)
        print(result)
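
detectMultiScale returns the detected bounding boxes together with their confidence weights. If you want to visualize them rather than just print them, the loop above could be adapted along these lines (a sketch only, assuming a display is available):

    while True:
        ret, frame = video.read()
        if not ret:
            break

        # Rectangles come back as (x, y, w, h) tuples, weights as their scores
        found, weights = hog.detectMultiScale(frame, **hogParams)
        for (x, y, w, h) in found:
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

        cv2.imshow('detections', frame)
        if cv2.waitKey(1) & 0xFF == 27:  # Esc stops the loop
            break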
    
