Metal kernels not behaving properly on the new MacBook Pro (late 2016) GPUs

Views: 171 Posted: 2020/4/25 11:27:47 Tags: swift macos kernel gpu metal

Problem description

I'm working on a macOS project that uses Swift and Metal for image processing on the GPU. Last week, I received my new 15-inch MacBook Pro (late 2016) and noticed something strange with my code: kernels that were supposed to write to a texture did not seem to do so.

After a lot of digging, I found that the problem depends on which GPU Metal uses for the computation: the AMD Radeon Pro 455 or the Intel(R) HD Graphics 530.

Initializing the MTLDevice using MTLCopyAllDevices() returns an array of devices representing the Radeon and the Intel GPUs (while MTLCreateSystemDefaultDevice() returns the default device, which is the Radeon). The code works as expected on the Intel GPU, but not on the Radeon GPU.
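To switch between the two GPUs without relying on their order in the array, a minimal sketch could look like the following. It assumes the current Swift Metal API (Swift 4 or later, where `name` is a non-optional String), and `pickDevice` is a hypothetical helper name, not part of Metal. It relies on `isLowPower`, which Metal reports as true for the integrated GPU on dual-GPU Macs.

```swift
import Metal

// Hypothetical helper (not part of Metal): returns the first device whose
// isLowPower flag matches preferIntegrated, falling back to the system
// default device when no device matches.
func pickDevice(preferIntegrated: Bool) -> MTLDevice? {
    let devices = MTLCopyAllDevices()
    for device in devices {
        // Print every GPU so the selection can be checked by eye.
        print("\(device.name): isLowPower = \(device.isLowPower)")
    }
    // isLowPower is true for the integrated GPU on dual-GPU Macs.
    return devices.first(where: { $0.isLowPower == preferIntegrated })
        ?? MTLCreateSystemDefaultDevice()
}
```

On the machine described here, `pickDevice(preferIntegrated: true)` would be expected to return the Intel GPU and `pickDevice(preferIntegrated: false)` the Radeon, although this has not been verified on that hardware.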

Let me show you an example.

To start, here is a simple kernel that takes an input texture and copies its colour to an output texture:

    #include <metal_stdlib>
    using namespace metal;

    // Copies each texel of inTexture to the same position in outTexture.
    kernel void passthrough(texture2d<uint, access::read> inTexture [[texture(0)]],
                            texture2d<uint, access::write> outTexture [[texture(1)]],
                            uint2 gid [[thread_position_in_grid]])
    {
        uint4 out = inTexture.read(gid);
        outTexture.write(out, gid);
    }

In order to use this kernel, I use this piece of code:

    let devices = MTLCopyAllDevices()
    for device in devices {
        print(device.name!) // [0] -> "AMD Radeon Pro 455", [1] -> "Intel(R) HD Graphics 530"
    }

    let device = devices[0] // Radeon GPU; use devices[1] for the Intel GPU
    let library = device.newDefaultLibrary()
    let commandQueue = device.makeCommandQueue()

    let passthroughKernelFunction = library!.makeFunction(name: "passthrough")

    let cps = try! device.makeComputePipelineState(function: passthroughKernelFunction!)

    let commandBuffer = commandQueue.makeCommandBuffer()
    let commandEncoder = commandBuffer.makeComputeCommandEncoder()

    commandEncoder.setComputePipelineState(cps)

    // Texture setup
    let width = 16
    let height = 16
    let byteCount = height*width*4
    let bytesPerRow = width*4
    let region = MTLRegionMake2D(0, 0, width, height)
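    let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Uint, width: width, height: height, mipmapped: false)
    // The kernel writes to outTexture, so allow shader writes (the default usage is .shaderRead only)
    textureDescriptor.usage = [.shaderRead, .shaderWrite]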

    // inTexture
    var inData = [UInt8](repeating: 255, count: Int(byteCount))
    let inTexture = device.makeTexture(descriptor: textureDescriptor)
    inTexture.replace(region: region, mipmapLevel: 0, withBytes: &inData, bytesPerRow: bytesPerRow)

    // outTexture
    var outData = [UInt8](repeating: 128, count: Int(byteCount))
    let outTexture = device.makeTexture(descriptor: textureDescriptor)
    outTexture.replace(region: region, mipmapLevel: 0, withBytes: &outData, bytesPerRow: bytesPerRow)

    commandEncoder.setTexture(inTexture, at: 0)
    commandEncoder.setTexture(outTexture, at: 1)
    commandEncoder.dispatchThreadgroups(MTLSize(width: 1,height: 1,depth: 1), threadsPerThreadgroup: MTLSize(width: width, height: height, depth: 1))

    commandEncoder.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()

    // Get the data back from the GPU
    outTexture.getBytes(&outData, bytesPerRow: bytesPerRow, from: region , mipmapLevel: 0)

    // Validation
    // outData should be exactly the same as inData 
    for (i,outElement) in outData.enumerated() {
        if outElement != inData[i] {
            print("Dest: \(outElement) != Src: \(inData[i]) at \(i)")
        }
    }

When running this code with let device = devices[0] (Radeon GPU), outTexture is never written to (my assumption), and as a result outData stays unchanged. On the other hand, when running this code with let device = devices[1] (Intel GPU), everything works as expected and outData is updated with the values in inData.
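One possible explanation, offered as an assumption rather than a confirmed diagnosis: on macOS, a texture on a discrete GPU can use the managed storage mode, where the CPU and the GPU each keep their own copy and GPU writes are not visible to getBytes until they are explicitly synchronized. The sketch below shows how such a synchronization could be encoded with a blit pass. `encodeSynchronize` is a hypothetical helper name, and the code targets the current Swift Metal API (Swift 4 or later), where the make... methods return optionals, unlike the Swift 3 signatures used in the code above.

```swift
import Metal

// Hypothetical helper (not part of Metal): encodes a blit pass that copies
// the GPU's version of a managed-storage texture back to the CPU-visible copy.
// It does nothing for textures that are not in managed storage mode.
func encodeSynchronize(_ texture: MTLTexture, on commandBuffer: MTLCommandBuffer) {
    // Only managed resources keep separate CPU and GPU copies.
    guard texture.storageMode == .managed else { return }
    guard let blitEncoder = commandBuffer.makeBlitCommandEncoder() else { return }
    blitEncoder.synchronize(resource: texture)
    blitEncoder.endEncoding()
}
```

In the code above, this would be called after commandEncoder.endEncoding() and before commandBuffer.commit(), for example as `encodeSynchronize(outTexture, on: commandBuffer)`, so that the copy back has finished by the time waitUntilCompleted() returns and getBytes is called.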
