Vehicle segmentation and tracking

Views: 108  Published: 2020/5/20 20:05:02  Tags: opencv tracking classification image-segmentation object-detection


Problem description

I've been working for some time on a project to detect and track (moving) vehicles in video captured from UAVs. Currently I am using an SVM trained on bag-of-features representations of local features extracted from vehicle and background images. I then use a sliding-window detection approach to try to localise vehicles in the images, which I would then like to track. The problem is that this approach is far too slow, and my detector isn't as reliable as I would like, so I'm getting quite a few false positives.

So I have been considering segmenting the cars from the background to find their approximate positions, so as to reduce the search space before applying my classifier. I am not sure how to go about this, and was hoping someone could help.

Additionally, I have been reading about motion segmentation with layers, which uses optical flow to segment the frame by flow model. Does anyone have any experience with this method? If so, could you offer some input as to whether you think it would be applicable to my problem?

Below are two frames from a sample video:

Frame 0:

Frame 5:

Recommended answer

Assuming your cars are moving, you could try to estimate the ground plane (road).

You may get a decent ground plane estimate by extracting features (SURF rather than SIFT, for speed), matching them over frame pairs, and solving for a homography using RANSAC, since a plane in 3D moves according to a homography between two camera frames.
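The RANSAC step can be sketched roughly as follows, assuming the feature matching has already produced two arrays of corresponding points. This is a NumPy-only illustration, not the answerer's code: the function names, the iteration count, the 2-pixel inlier threshold, and the synthetic points are all assumptions, and coordinate normalisation is left out for brevity. In practice OpenCV's `cv2.findHomography` with the `cv2.RANSAC` flag does the same job.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct Linear Transform: least-squares H with dst ~ H @ src (needs >= 4 points)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def project(h, pts):
    """Apply homography h to an (N, 2) array of points."""
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return homog[:, :2] / homog[:, 2:3]

def ransac_homography(src, dst, n_iters=500, thresh=2.0, seed=0):
    """Return (H, inlier_mask); thresh is the reprojection error in pixels."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        sample = rng.choice(len(src), size=4, replace=False)
        h = fit_homography(src[sample], dst[sample])
        err = np.linalg.norm(project(h, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 4:
        raise ValueError("not enough inliers to fit a homography")
    # Refit on all inliers and scale so that H[2, 2] == 1.
    h = fit_homography(src[best_inliers], dst[best_inliers])
    return h / h[2, 2], best_inliers

# Synthetic matches (made up): 60 ground points follow H_true, 10 "car" points do not.
rng = np.random.default_rng(1)
H_true = np.array([[1.0, 0.02, 5.0], [-0.01, 1.0, -3.0], [1e-5, 2e-5, 1.0]])
ground = rng.uniform(0, 640, size=(60, 2))
cars = rng.uniform(0, 640, size=(10, 2))
src = np.vstack([ground, cars])
dst = np.vstack([project(H_true, ground), project(H_true, cars) + [12.0, -9.0]])

H_est, inliers = ransac_homography(src, dst)
```

The matches rejected as outliers (`~inliers`) are already a first hint of where the independently moving objects are.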

Once you have your ground plane, you can identify the cars by looking at clusters of pixels that don't move according to the estimated homography.

A more sophisticated approach would be to do Structure from Motion on the terrain. This only presupposes that the terrain is rigid, not that it is planar.

Update

I was wondering if you could expand on how you would go about looking for clusters of pixels that don't move according to the estimated homography?

Sure. Say I and K are two video frames and H is the homography mapping features in I to features in K. First you warp I onto K according to H, i.e. you compute the warped image Iw as Iw( [x y]' )=I( inv(H)[x y]' ) (roughly Matlab notation, with [x y]' taken in homogeneous coordinates). Then you look at the squared or absolute difference image Diff=(Iw-K)*(Iw-K), where the product is taken pixel by pixel. Image content that moves according to the homography H should give small differences (assuming constant illumination and exposure between the images). Image content that violates H, such as moving cars, should stand out.
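As a rough sketch of the warp-and-difference step, assuming grayscale frames stored as NumPy arrays: the helper names, the nearest-neighbour sampling, and the synthetic frames (a pure translation as H and a bright block standing in for a car) are illustrative assumptions. With OpenCV, `cv2.warpPerspective` would normally do the warping.

```python
import numpy as np

def warp_image(img, h, out_shape):
    """Inverse warping, Iw(p) = I(inv(H) p), with nearest-neighbour sampling.

    Returns the warped image and a mask of pixels that had a source in img.
    """
    rows, cols = out_shape
    ys, xs = np.mgrid[0:rows, 0:cols]
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(rows * cols)])  # 3 x N
    src = np.linalg.inv(h) @ dst
    with np.errstate(divide="ignore", invalid="ignore"):
        fx, fy = src[0] / src[2], src[1] / src[2]
    valid = (np.isfinite(fx) & np.isfinite(fy)
             & (fx > -0.5) & (fx < img.shape[1] - 0.5)
             & (fy > -0.5) & (fy < img.shape[0] - 0.5))
    warped = np.zeros(rows * cols)
    warped[valid] = img[np.rint(fy[valid]).astype(int),
                        np.rint(fx[valid]).astype(int)]
    return warped.reshape(rows, cols), valid.reshape(rows, cols)

def squared_difference(frame_i, frame_k, h):
    """Diff = (Iw - K)^2 per pixel, zero where Iw has no data."""
    warped, valid = warp_image(frame_i.astype(float), h, frame_k.shape)
    diff = (warped - frame_k.astype(float)) ** 2
    diff[~valid] = 0.0
    return diff

# Synthetic frames (made up): the ground shifts by (3, 2) pixels between frames.
rng = np.random.default_rng(0)
background = rng.integers(0, 50, size=(60, 80)).astype(float)
H = np.array([[1.0, 0.0, 3.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]])

frame_i = background.copy()
frame_i[20:24, 30:36] = 255.0            # car in frame I
frame_k = np.roll(background, shift=(2, 3), axis=(0, 1))
frame_k[22:26, 37:43] = 255.0            # car moved 4 px further than the ground

diff = squared_difference(frame_i, frame_k, H)
```

In this toy case `diff` is zero wherever the content follows H and large only around the block that moved on its own.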

For clustering high-error pixel groups in Diff I would start with simple thresholding ("every pixel difference in Diff larger than X is relevant", maybe using an adaptive threshold). The thresholded image can be cleaned up with morphological operations (dilation, erosion) and clustered with connected components. This may be too simplistic, but it's easy to implement for a first try, and it should be fast. For something fancier, look at Clustering on Wikipedia. A 2D Gaussian Mixture Model may be interesting; when you initialize it with the detection result from the previous frame it should be pretty fast.
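A minimal sketch of the threshold, clean-up, and connected-components pipeline, written in plain NumPy so every step is visible. The fixed threshold of 50, the 3x3 structuring element, the 4-connectivity, and the toy difference image are assumptions for illustration. OpenCV offers `cv2.threshold`, `cv2.morphologyEx`, and `cv2.connectedComponentsWithStats` for the same steps.

```python
from collections import deque
import numpy as np

def dilate(mask):
    """Binary dilation with a 3x3 square structuring element."""
    padded = np.pad(mask, 1)
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def erode(mask):
    """Binary erosion via duality: erode(A) = not dilate(not A)."""
    return ~dilate(~mask)

def connected_components(mask):
    """Bounding boxes (x0, y0, x1, y1), inclusive, of 4-connected regions."""
    seen = np.zeros(mask.shape, dtype=bool)
    boxes = []
    for y0, x0 in zip(*np.nonzero(mask)):
        y0, x0 = int(y0), int(x0)
        if seen[y0, x0]:
            continue
        seen[y0, x0] = True
        queue, ys, xs = deque([(y0, x0)]), [y0], [x0]
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not seen[ny, nx]):
                    seen[ny, nx] = True
                    queue.append((ny, nx))
                    ys.append(ny)
                    xs.append(nx)
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

# Toy difference image (made up): one blob split by a 2-pixel gap,
# one solid blob, and a single noisy pixel.
diff = np.zeros((40, 60))
diff[10:14, 5:9] = 100.0
diff[10:14, 11:15] = 100.0
diff[25:30, 40:48] = 80.0
diff[35, 20] = 90.0

mask = diff > 50.0                 # "every difference larger than X is relevant"
opened = dilate(erode(mask))       # opening removes isolated pixels
closed = erode(dilate(opened))     # closing bridges small gaps
boxes = connected_components(closed)
```

Here the opening drops the lone pixel and the closing merges the two halves of the split blob, so `boxes` holds one box per blob.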

I did a little experiment with the two frames you provided, and I have to say I am somewhat surprised myself at how well it works. :-) Left image: difference (color coded) between the two frames you posted. Right image: difference between the frames after aligning them with a homography. The remaining differences are clearly the moving cars, and they are sufficiently strong for simple thresholding.

Thinking of the approach you currently use, it may be interesting to combine it with my proposal:

You could try to learn and classify the cars in the difference image D (Diff above) instead of the original image. This would amount to learning what a car motion pattern looks like rather than what a car looks like, which could be more reliable.
You could get rid of the expensive window search and run the classifier only on regions of D with sufficiently high value.
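One possible way to restrict the window search, sketched under assumptions: score each window by the mean of D inside it, using an integral image so that each window costs four lookups, and pass only the high-scoring windows to the classifier. The window size, stride, and `min_mean` threshold are made-up values, and the classifier itself (the asker's SVM) is not shown.

```python
import numpy as np

def candidate_windows(diff, win=16, stride=8, min_mean=10.0):
    """Top-left (x, y) corners of win x win windows whose mean difference >= min_mean."""
    # Integral image with a leading zero row and column.
    ii = np.zeros((diff.shape[0] + 1, diff.shape[1] + 1))
    ii[1:, 1:] = diff.cumsum(axis=0).cumsum(axis=1)
    keep = []
    for y in range(0, diff.shape[0] - win + 1, stride):
        for x in range(0, diff.shape[1] - win + 1, stride):
            total = ii[y + win, x + win] - ii[y, x + win] - ii[y + win, x] + ii[y, x]
            if total / (win * win) >= min_mean:
                keep.append((x, y))
    return keep

# Toy difference image (made up) with one high-difference region.
D = np.zeros((64, 96))
D[16:32, 40:56] = 100.0

windows = candidate_windows(D)
total_windows = len(range(0, 64 - 16 + 1, 8)) * len(range(0, 96 - 16 + 1, 8))
```

In this toy image only a handful of the `total_windows` sliding-window positions survive, and those are the only ones the classifier would need to see.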

Some additional remarks:

In theory, the cars should even stand out if they are not moving since they are not flat, but given your distance to the scene and camera resolution this effect may be too subtle.
You can replace the feature extraction / matching part of my proposal with Optical Flow, if you like. This amounts to identifying flow vectors that "stick out" from a consistent frame-to-frame motion of the ground. It may be prone to outliers in the optical flow, however. You can also try to get the homography from the flow vectors.
This is important: Regardless of which method you use, once you have found cars in one frame you should use this information to robustify your search for these cars in consecutive frames, giving a higher likelihood to detections close to the old ones (Kalman filter, etc). That's what tracking is all about!
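To make the tracking remark concrete, here is a small sketch of a constant-velocity Kalman filter with a simple distance gate, so that detections near the predicted position are preferred over distant false positives. The state layout, the noise variances, the 15-pixel gate, and the simulated detections are all assumptions chosen for the example. OpenCV ships a `cv2.KalmanFilter` class that could replace the hand-written one.

```python
import numpy as np

class ConstantVelocityKalman:
    """Kalman filter with state [x, y, vx, vy] and position-only measurements."""

    def __init__(self, x, y, dt=1.0, process_var=1.0, meas_var=4.0):
        self.state = np.array([x, y, 0.0, 0.0])
        self.P = np.eye(4) * 100.0           # large initial uncertainty
        self.F = np.eye(4)                   # motion model: position += velocity * dt
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                # we observe x and y only
        self.Q = np.eye(4) * process_var
        self.R = np.eye(2) * meas_var

    def predict(self):
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:2].copy()

    def update(self, z):
        innovation = np.asarray(z, dtype=float) - self.H @ self.state
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.state = self.state + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P

def nearest_detection(predicted, detections, gate=15.0):
    """Closest detection to the prediction, or None if none lies within the gate."""
    if len(detections) == 0:
        return None
    dist = np.linalg.norm(np.asarray(detections, dtype=float) - predicted, axis=1)
    best = int(np.argmin(dist))
    return detections[best] if dist[best] <= gate else None

# Simulated sequence (made up): a car moving (3, 1) px per frame,
# plus one persistent false positive far away.
track = ConstantVelocityKalman(10.0, 20.0)
for frame in range(1, 11):
    truth = (10.0 + 3.0 * frame, 20.0 + 1.0 * frame)
    detections = [truth, (200.0, 50.0)]
    predicted = track.predict()
    match = nearest_detection(predicted, detections)
    if match is not None:
        track.update(match)

vx, vy = track.state[2], track.state[3]
```

With several cars, the nearest-neighbour choice would have to be replaced by a proper assignment between tracks and detections, which is beyond this sketch.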
