How to properly use a hardware accelerated Media Foundation Source Reader to decode a video?

Problem description

I'm writing a hardware-accelerated H.264 decoder using Media Foundation's Source Reader, but I've run into a problem. I followed this tutorial and used the Windows SDK Media Foundation samples as a reference.

My app seems to work fine when hardware acceleration is turned off, but it doesn't deliver the performance I need. When I turn acceleration on by passing an IMFDXGIDeviceManager to the IMFAttributes used to create the reader, things get complicated.

If I create the ID3D11Device using the D3D_DRIVER_TYPE_NULL driver, the app works fine and the frames are processed faster than in software mode, but judging by the CPU and GPU usage it still does the majority of the processing on the CPU.

On the other hand, when I create the ID3D11Device using the D3D_DRIVER_TYPE_HARDWARE driver and run the app, one of these four things can happen:


  1. I only get an unpredictable number of frames (usually 1-3) before the IMFMediaBuffer::Lock function returns 0x887a0005, which is described as "The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action". When I call ID3D11Device::GetDeviceRemovedReason, I get 0x887a0020, which is described as "The driver encountered a problem and was put into the device removed state", which isn't as helpful as I wish it were.

  2. The app crashes in an external DLL on the IMFMediaBuffer::Lock call. Which DLL it is seems to depend on the GPU used: for the Intel integrated GPU it's igd10iumd32.dll and for the Nvidia mobile GPU it's mfplat.dll. The message for this particular crash is as follows: "Exception thrown at 0x53C6DB8C (mfplat.dll) in decoder_tester.exe: 0xC0000005: Access violation reading location 0x00000024". The addresses differ between executions, and sometimes the violation involves reading, sometimes writing.

  3. The graphics driver stops responding, the system hangs for a short time, and then the application either crashes as in point 2 or finishes as in point 1.

  4. The app works fine and processes all the frames with hardware acceleration.

Most of the time it's 1 or 2, seldom 3 or 4.
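For reference, both codes quoted in point 1 belong to the DXGI facility: 0x887A0005 is DXGI_ERROR_DEVICE_REMOVED and 0x887A0020 is DXGI_ERROR_DRIVER_INTERNAL_ERROR. The sketch below is not part of the original program. It is a small portable illustration of how an HRESULT splits into severity, facility, and code fields, using the same masks as the HRESULT_FACILITY and HRESULT_CODE macros. The names `HresultParts`, `decode_hresult`, `print_hresult`, and `check_quoted_codes` are hypothetical and introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical helper type for illustration; not part of any Windows header.
struct HresultParts {
    bool failed;             // bit 31: severity (set means failure)
    std::uint32_t facility;  // bits 16..28, same mask as HRESULT_FACILITY
    std::uint32_t code;      // bits 0..15, same mask as HRESULT_CODE
};

// Value of FACILITY_DXGI as defined in winerror.h.
constexpr std::uint32_t kFacilityDxgi = 0x87Au;

// Splits a raw 32-bit HRESULT value into its three fields.
constexpr HresultParts decode_hresult(std::uint32_t hr) {
    return HresultParts{ (hr >> 31) != 0u, (hr >> 16) & 0x1FFFu, hr & 0xFFFFu };
}

// Prints the decoded fields of one HRESULT value.
void print_hresult(std::uint32_t hr) {
    const HresultParts p = decode_hresult(hr);
    std::printf("0x%08X: %s, facility 0x%03X, code 0x%04X\n",
                static_cast<unsigned>(hr),
                p.failed ? "failure" : "success",
                static_cast<unsigned>(p.facility),
                static_cast<unsigned>(p.code));
}

// Sanity check: both codes quoted in point 1 decode to the DXGI facility.
void check_quoted_codes() {
    assert(decode_hresult(0x887A0005u).facility == kFacilityDxgi);
    assert(decode_hresult(0x887A0020u).facility == kFacilityDxgi);
}
```

Calling `print_hresult(0x887A0005u)` and `print_hresult(0x887A0020u)` from a test program shows that the two values differ only in the low 16-bit code. Note that 0xC0000005 from point 2 is an NTSTATUS exception code (access violation), not an HRESULT, so this decoding does not apply to it.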

Here's what the CPU/GPU usage looks like when processing without throttling in the different modes on my machine (Intel Core i5-6500 with HD Graphics 530, Windows 10 Pro):


  • NULL - CPU: ~90%, GPU: ~15%

  • Hardware - CPU: ~15%, GPU: ~60%

  • Software - CPU: ~40%, GPU: ~7%

I tested the app on three machines. All of them have Intel integrated GPUs (HD 4400, HD 4600, HD 530). One of them also has a switchable Nvidia dedicated GPU (GF 840M). The app behaves identically on all of them; the only difference is that it crashes in a different DLL when the Nvidia GPU is used.

I have no previous experience with COM or DirectX, but all of this is inconsistent and unpredictable, so it looks like memory corruption to me. Still, I don't know where I'm making the mistake. Could you please help me find what I'm doing wrong?

The minimal code example I could come up with is below. I'm using Visual Studio Professional 2015 to compile it as a C++ project. I prepared definitions to enable hardware acceleration and to select the hardware driver. Comment them out to change the behavior. Also, the code expects this video file to be present in the project directory.

#include <iostream>
#include <string>
#include <atlbase.h>
#include <d3d11.h>
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#include <windows.h>

#pragma comment(lib, "d3d11.lib")
#pragma comment(lib, "mf.lib")
#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mfreadwrite.lib")
#pragma comment(lib, "mfuuid.lib")

#define ENABLE_HW_ACCELERATION
#define ENABLE_HW_DRIVER

void handle_result(HRESULT hr)
{
    if (SUCCEEDED(hr))
        return;

    WCHAR message[512];

    FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS, nullptr, hr,
        MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), message, ARRAYSIZE(message), nullptr);

    printf("%ls", message);
    abort();
}

int main(int argc, char** argv)
{
    handle_result(CoInitializeEx(nullptr, COINIT_APARTMENTTHREADED | COINIT_DISABLE_OLE1DDE));
    handle_result(MFStartup(MF_VERSION));

    {
        CComPtr<IMFAttributes> attributes;

        handle_result(MFCreateAttributes(&attributes, 3));

#if defined(ENABLE_HW_ACCELERATION)
        CComPtr<ID3D11Device> device;
        D3D_FEATURE_LEVEL levels[] = { D3D_FEATURE_LEVEL_11_1, D3D_FEATURE_LEVEL_11_0 };

#if defined(ENABLE_HW_DRIVER)
        handle_result(D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, D3D11_CREATE_DEVICE_SINGLETHREADED | D3D11_CREATE_DEVICE_VIDEO_SUPPORT,
            levels, ARRAYSIZE(levels), D3D11_SDK_VERSION, &device, nullptr, nullptr));
#else
        handle_result(D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_NULL, nullptr, D3D11_CREATE_DEVICE_SINGLETHREADED,
            levels, ARRAYSIZE(levels), D3D11_SDK_VERSION, &device, nullptr, nullptr));
#endif

        UINT token;
        CComPtr<IMFDXGIDeviceManager> manager;

        handle_result(MFCreateDXGIDeviceManager(&token, &manager));
        handle_result(manager->ResetDevice(device, token));

        handle_result(attributes->SetUnknown(MF_SOURCE_READER_D3D_MANAGER, manager));
        handle_result(attributes->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE));
        handle_result(attributes->SetUINT32(MF_SOURCE_READER_ENABLE_ADVANCED_VIDEO_PROCESSING, TRUE));
#else
        handle_result(attributes->SetUINT32(MF_SOURCE_READER_ENABLE_VIDEO_PROCESSING, TRUE));
#endif

        CComPtr<IMFSourceReader> reader;

        handle_result(MFCreateSourceReaderFromURL(L"Rogue One - A Star Wars Story - Trailer.mp4", attributes, &reader));

        CComPtr<IMFMediaType> output_type;

        handle_result(MFCreateMediaType(&output_type));
        handle_result(output_type->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video));
        handle_result(output_type->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32));
        handle_result(reader->SetCurrentMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, nullptr, output_type));

        unsigned int frame_count{};

        std::cout << "Started processing frames" << std::endl;

        while (true)
        {
            CComPtr<IMFSample> sample;
            DWORD flags;

            handle_result(reader->ReadSample(MF_SOURCE_READER_FIRST_VIDEO_STREAM,
                0, nullptr, &flags, nullptr, &sample));

            if (flags & MF_SOURCE_READERF_ENDOFSTREAM || sample == nullptr)
                break;

            std::cout << "Frame " << frame_count++ << std::endl;

            CComPtr<IMFMediaBuffer> buffer;
            BYTE* data;

            handle_result(sample->ConvertToContiguousBuffer(&buffer));
            handle_result(buffer->Lock(&data, nullptr, nullptr));

            // Use the frame here.

            buffer->Unlock();
        }

        std::cout << "Finished processing frames" << std::endl;
    }

    MFShutdown();
    CoUninitialize();

    return 0;
}


Recommended answer

Your code is conceptually correct, with one caveat that is not obvious: the Media Foundation decoder is multithreaded, and you are feeding it a single-threaded Direct3D device. You have to fix that, or you get what you are currently getting: access violations and freezes, that is, undefined behavior.

    // NOTE: No single threading; D3D11_CREATE_DEVICE_SINGLETHREADED is deliberately omitted
    handle_result(D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr,
        D3D11_CREATE_DEVICE_VIDEO_SUPPORT,
        levels, ARRAYSIZE(levels), D3D11_SDK_VERSION, &device, nullptr, nullptr));

    // NOTE: Getting ready for multi-threaded operation
    // (ID3D10Multithread is declared in <d3d10.h>)
    const CComQIPtr<ID3D10Multithread> pMultithread = device;
    handle_result(pMultithread ? S_OK : E_NOINTERFACE); // guard against a failed QueryInterface
    pMultithread->SetMultithreadProtected(TRUE);

Also note that this straightforward code sample has a performance bottleneck around the lines you added to get a contiguous buffer. Presumably that is how you get access to the data. However, by design the decoded data is already in video memory, and transferring it to system memory is an expensive operation. That is, you added a severe performance hit to the loop. Accessing the data this way is useful for checking its validity, but for performance benchmarking you should comment it out.
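To get a rough feel for that cost, here is a back-of-the-envelope sketch. MFVideoFormat_RGB32 uses 4 bytes per pixel, so each frame read back from video memory is at least width × height × 4 bytes (the real stride may be padded). The 1920×1080 resolution and the 100 frames per second mentioned in the usage note are assumptions for illustration only; neither figure is stated in the question. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// RGB32 stores each pixel in 4 bytes.
constexpr std::uint64_t kBytesPerPixelRgb32 = 4;

// Minimum size of one RGB32 frame, ignoring any stride padding.
constexpr std::uint64_t rgb32_frame_bytes(std::uint32_t width, std::uint32_t height) {
    return static_cast<std::uint64_t>(width) * height * kBytesPerPixelRgb32;
}

// Data volume copied to system memory per second at a given decode rate.
constexpr std::uint64_t readback_bytes_per_second(std::uint32_t width,
                                                  std::uint32_t height,
                                                  std::uint32_t frames_per_second) {
    return rgb32_frame_bytes(width, height) * frames_per_second;
}

// Prints the estimate for one configuration.
void print_readback_estimate(std::uint32_t width, std::uint32_t height,
                             std::uint32_t frames_per_second) {
    assert(width > 0 && height > 0 && frames_per_second > 0);
    std::printf("%ux%u RGB32: %llu bytes per frame, %llu bytes per second at %u fps\n",
                static_cast<unsigned>(width), static_cast<unsigned>(height),
                static_cast<unsigned long long>(rgb32_frame_bytes(width, height)),
                static_cast<unsigned long long>(
                    readback_bytes_per_second(width, height, frames_per_second)),
                static_cast<unsigned>(frames_per_second));
}
```

Under these assumed numbers, `rgb32_frame_bytes(1920, 1080)` is 8,294,400 bytes, so an unthrottled loop decoding 100 frames per second would copy roughly 830 MB per second out of video memory, on top of the decoding work itself.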
