OpenCV MatchTemplate in C# is too slow compared to Python


Question

I wrote a solution in Python that worked great, but it required installing several libraries and a lot of bureaucratic setup. I decided to rebuild it with a GUI in C# on Visual Studio Community 2017, but the first function I got working turned out to be far slower than in Python, when IMO it should actually be faster.

The code essentially does a needle-in-a-haystack image search: it loads every image from a folder and tests each needle (60 images in total) against the haystack. In Python I return the string, but in C# I'm only printing it.

My code in Python is the following:

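import os

import cv2
import numpy as np

def getImages(tela):
    retorno = []
    folder = 'Images'
    img_rgb = cv2.imread(tela)
    for filename in os.listdir(folder):
        template = cv2.imread(os.path.join(folder, filename))
        res = cv2.matchTemplate(img_rgb, template, cv2.TM_CCOEFF_NORMED)
        threshold = .96
        loc = np.where(res >= threshold)
        # loc is a tuple of index arrays; a non-empty array means at least one match
        if loc[0].size > 0:
            retorno.append(filename[0] + filename[1].lower())
            # Stop searching as soon as two templates have matched
            if len(retorno) > 1:
                return retorno
    # Fewer than two matches: return whatever was found
    return retorno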

And in C#:

Debug.WriteLine(ofd.FileName);
Image<Bgr, byte> source = new Image<Bgr, byte>(ofd.FileName);
string filepath = Directory.GetCurrentDirectory().ToString()+"\\Images";
DirectoryInfo d = new DirectoryInfo(filepath);
var files = d.GetFiles();
foreach (var fname in files){
    Image<Bgr, byte> template = new Image<Bgr, byte>(fname.FullName);
    Image<Gray, float> result = source.MatchTemplate(template, Emgu.CV.CvEnum.TemplateMatchingType.CcoeffNormed);
    double[] minValues, maxValues;
    Point[] minLocations, maxLocations;
    result.MinMax(out minValues, out maxValues, out minLocations, out maxLocations);
    if (maxValues[0] > 0.96) {
        Debug.WriteLine(fname);
    }
}

I didn't measure the elapsed time for each one, but the C# version takes about 3 seconds to produce a result and the Python version about 100 ms.

There is room for optimization; if anyone would like to suggest improvements, they are welcome.

Recommended answer

I've combined the solutions proposed by denfromufa and HouseCat in the source code below and did some overall cleanup, so you can see what your code could look like. You will also notice minor readability improvements, since I wrote the refactored code using C# 7.0 / .NET 4.7.

Actual algorithm optimization

Although denfromufa correctly pointed out that implementation issue, and HouseCat mentioned using more CPU resources, the real gain comes from reducing the number of operations executed by your image search algorithm.


TURBO STAGE 1 - Suppose the MinMax() function goes through all of your image's pixels to collect all those statistics, but you are only interested in maxValues[0]. An extreme fine-tuning would be to write a specific function that stops iterating over the pixels as soon as one value reaches your minimum threshold. Apparently, that's all you need in your function. Remember: never burn all your processors computing lots of unused image statistics.
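The early-exit idea can be sketched in a few lines. This is only an illustration of the logic, not part of the original answer: the function name `any_score_reaches` is hypothetical, and a pure Python loop like this would be slower than a vectorized call, so a real version would live in native code (with NumPy, `(res >= threshold).any()` answers the same yes/no question without collecting min/max statistics).

```python
def any_score_reaches(score_rows, threshold):
    """Return True as soon as one score is >= threshold.

    `score_rows` is any iterable of rows of match scores (hypothetical input
    shape). Nothing after the first qualifying score is read.
    """
    for row in score_rows:
        for score in row:
            if score >= threshold:
                return True  # early exit: the rest of the map is never visited
    return False


# Example call on a tiny hand-made score map:
found = any_score_reaches([[0.10, 0.42], [0.97, 0.30]], 0.96)
```

The point of the design is that the answer to "is there any match?" is a boolean, so the scan can stop at the first hit instead of computing the minimum, the maximum, and both locations.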

TURBO STAGE 2 - It looks like you are trying to recognize whether any image in your set of images matches your input screenshot (tela). If there are not too many images to be matched, and if you are constantly checking your screen for new matches, it is highly recommended to pre-load all those image match objects and reuse them across your function calls. Constant disk IO operations and instantiating bitmap classes (for every single screenshot) lead to a strong performance hit.
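A minimal sketch of that pre-loading, under stated assumptions: `TemplateCache` is a hypothetical name, and the `loader` argument stands for whatever decodes an image file (in the question's Python code that would be `cv2.imread`). The folder is read once, and every later search reuses the decoded images.

```python
import os


class TemplateCache:
    """Decode every template in a folder once and keep the results in memory."""

    def __init__(self, folder, loader):
        # `loader` maps a file path to a decoded image; passing it in keeps
        # this sketch independent of any particular imaging library.
        self._templates = {
            name: loader(os.path.join(folder, name))
            for name in sorted(os.listdir(folder))
        }

    def items(self):
        """Yield (filename, decoded image) pairs without touching the disk."""
        return self._templates.items()


# Intended use (assumption): build the cache once at startup, e.g.
#   cache = TemplateCache('Images', cv2.imread)
# and then loop over cache.items() for every new screenshot.
```

Disk reads and image decoding then happen once per template instead of once per template per screenshot.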

TURBO STAGE 3 - In case you are taking several screenshots per second, try to reuse the screenshot's buffer. Constantly reallocating the whole screenshot buffer when its dimensions have not changed also causes performance loss.
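The buffer reuse could look roughly like the sketch below. `FrameBuffer` and its `acquire` method are hypothetical names, and a plain `bytearray` stands in for the real screenshot bitmap; the only point is that allocation happens when the dimensions change, not on every capture.

```python
class FrameBuffer:
    """Hand out one reusable pixel buffer, reallocating only on a size change."""

    def __init__(self):
        self._shape = None
        self._data = None

    def acquire(self, width, height, channels=3):
        shape = (width, height, channels)
        if shape != self._shape:
            # Dimensions changed (or first call): allocate a new buffer.
            self._data = bytearray(width * height * channels)
            self._shape = shape
        # Same dimensions as last time: the existing buffer is returned as-is,
        # so the caller must overwrite it with the new frame.
        return self._data


frames = FrameBuffer()
first = frames.acquire(1920, 1080)
second = frames.acquire(1920, 1080)  # same object as `first`, no new allocation
```

The trade-off is that the previous frame's contents are overwritten, so anything that still needs the old frame has to copy it first.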

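TURBO STAGE 4 - This one is hard to achieve, and it depends on how much you want to invest in it. Think of your image recognition system as one big pipeline. Bitmaps are the data containers flowing between its stages (image matching stage, OCR stage, mouse position painting stage, video recording stage, etc.). The idea is to create a fixed number of containers and reuse them, avoiding their creation and destruction. The number of containers is analogous to the buffer size of the pipeline. When the stages of the pipeline have finished using a container, it goes back to the beginning of the pipeline, into a kind of container pool.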
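A container pool of this kind can be sketched with the standard library alone. `BufferPool`, `acquire`, and `release` are hypothetical names, and `bytearray` again stands in for a bitmap; this shows the recycling mechanism only, not a full pipeline.

```python
import queue


class BufferPool:
    """A fixed set of reusable containers shared by all pipeline stages."""

    def __init__(self, count, size):
        self._free = queue.Queue()
        for _ in range(count):
            # All containers are allocated once, up front.
            self._free.put(bytearray(size))

    def acquire(self, timeout=None):
        # Blocks until some stage hands a container back; raises queue.Empty
        # if `timeout` seconds pass without one becoming available.
        return self._free.get(timeout=timeout)

    def release(self, buf):
        # Called by the last stage that used the container.
        self._free.put(buf)


pool = BufferPool(count=2, size=16)
a = pool.acquire()
b = pool.acquire()
pool.release(a)       # the container goes back to the start of the pipeline
c = pool.acquire()    # and is handed out again instead of being reallocated
```

Because the pool blocks when it is empty, its size also acts as back-pressure: the capture stage cannot run further ahead than the number of containers allows.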



This last optimization is really hard to achieve using these external libraries, because in most cases their APIs require some internal bitmap instantiation, and the fine-tuning would also cause extreme software coupling between your library and the external one. So you will have to dig into these nice libraries to understand how they actually work, and build your own custom framework. I can say it's a nice learning experience.

Those libraries are really cool for many purposes; they provide a generic API for improved reusability. This also means they do much more than you actually need in a single API call. When it comes to high-performance algorithms, you should always rethink which essential functionality you need from those libraries to achieve your goal, and if they are your bottleneck, implement it yourself.

I can say that a well-tuned image recognition algorithm takes no more than a few milliseconds to do what you want. I've used image recognition applications that do it almost instantaneously for larger screenshots (e.g. Eggplant Functional).

Now back to your code...

Your refactored code should look like the code below. I did not include all the fine-tuned algorithms I've mentioned; you'd better ask separate questions about them on SO.

        // Assumes: using System.Drawing; using System.IO; using System.Threading.Tasks;
        //          using Emgu.CV; using Emgu.CV.Structure;
        Image<Bgr, byte> source = new Image<Bgr, byte>(ofd.FileName);

        // Preferably use Path.Combine here:
        string dir = Path.Combine(Directory.GetCurrentDirectory(), "Images");

        // Check whether directory exists:
        if (!Directory.Exists(dir))
            throw new DirectoryNotFoundException($"Directory was not found: '{dir}'");

        // It looks like you just need filenames here...
        // Simple parallel foreach suggested by HouseCat (in 2.):
        Parallel.ForEach(Directory.GetFiles(dir), (fname, loopState) =>
        {
            // Directory.GetFiles returns full paths as strings.
            // Dispose the image buffers as soon as each comparison is done.
            using (var template = new Image<Bgr, byte>(fname))
            using (Image<Gray, float> result = source.MatchTemplate(
                template,
                Emgu.CV.CvEnum.TemplateMatchingType.CcoeffNormed))
            {
                // By using C# 7.0, we can do inline out declarations here:
                result.MinMax(
                    out double[] minValues,
                    out double[] maxValues,
                    out Point[] minLocations,
                    out Point[] maxLocations);

                if (maxValues[0] > 0.96)
                {
                    // ...
                    // A Parallel.ForEach body cannot return a value, so handle the
                    // match here and ask the loop to stop starting new iterations.
                    loopState.Stop(); // <<< Early exit, as suggested by: denfromufa
                    return;
                }

                // ...
            }
        });

Happy tuning ;-)
