MTKView绘图性能 [英] MTKView Drawing Performance
问题描述
我正在尝试使用金属"视图MTKView
在相机摘要上显示滤镜.我正在密切关注Apple示例代码的方法-通过使用TrueDepth摄像机数据来增强实时视频(链接).
I am trying to show filters on a camera feed by using a Metal view: MTKView
. I am closely following the method of Apple's sample code - Enhancing Live Video by Leveraging TrueDepth Camera Data (link).
以下代码效果很好(主要从上述示例代码解释):
Following code works great (mainly interpreted from above-mentioned sample code) :
class MetalObject: NSObject, MTKViewDelegate {
private var metalBufferView : MTKView?
private var metalDevice = MTLCreateSystemDefaultDevice()
private var metalCommandQueue : MTLCommandQueue!
private var ciContext : CIContext!
private let colorSpace = CGColorSpaceCreateDeviceRGB()
private var videoPixelBuffer : CVPixelBuffer?
private let syncQueue = DispatchQueue(label: "Preview View Sync Queue", qos: .userInitiated, attributes: [], autoreleaseFrequency: .workItem)
private var textureWidth : Int = 0
private var textureHeight : Int = 0
private var textureMirroring = false
private var sampler : MTLSamplerState!
private var renderPipelineState : MTLRenderPipelineState!
private var vertexCoordBuffer : MTLBuffer!
private var textCoordBuffer : MTLBuffer!
private var internalBounds : CGRect!
private var textureTranform : CGAffineTransform?
private var previewImage : CIImage?
init(with frame: CGRect) {
super.init()
self.metalBufferView = MTKView(frame: frame, device: self.metalDevice)
self.metalBufferView!.contentScaleFactor = UIScreen.main.nativeScale
self.metalBufferView!.framebufferOnly = true
self.metalBufferView!.colorPixelFormat = .bgra8Unorm
self.metalBufferView!.isPaused = true
self.metalBufferView!.enableSetNeedsDisplay = false
self.metalBufferView!.delegate = self
self.metalCommandQueue = self.metalDevice!.makeCommandQueue()
self.ciContext = CIContext(mtlDevice: self.metalDevice!)
//Configure Metal
let defaultLibrary = self.metalDevice!.makeDefaultLibrary()!
let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.colorAttachments[0].pixelFormat = .bgra8Unorm
pipelineDescriptor.vertexFunction = defaultLibrary.makeFunction(name: "vertexPassThrough")
pipelineDescriptor.fragmentFunction = defaultLibrary.makeFunction(name: "fragmentPassThrough")
// To determine how our textures are sampled, we create a sampler descriptor, which
// will be used to ask for a sampler state object from our device below.
let samplerDescriptor = MTLSamplerDescriptor()
samplerDescriptor.sAddressMode = .clampToEdge
samplerDescriptor.tAddressMode = .clampToEdge
samplerDescriptor.minFilter = .linear
samplerDescriptor.magFilter = .linear
sampler = self.metalDevice!.makeSamplerState(descriptor: samplerDescriptor)
do {
renderPipelineState = try self.metalDevice!.makeRenderPipelineState(descriptor: pipelineDescriptor)
} catch {
fatalError("Unable to create preview Metal view pipeline state. (\(error))")
}
}
final func update (newVideoPixelBuffer: CVPixelBuffer?) {
self.syncQueue.async {
var filteredImage : CIImage
self.videoPixelBuffer = newVideoPixelBuffer
//---------
//Core image filters
//Strictly CIFilters, chained together
//---------
self.previewImage = filteredImage
//Ask Metal View to draw
self.metalBufferView?.draw()
}
}
//MARK: - Metal View Delegate
final func draw(in view: MTKView) {
print (Thread.current)
guard let drawable = self.metalBufferView!.currentDrawable,
let currentRenderPassDescriptor = self.metalBufferView!.currentRenderPassDescriptor,
let previewImage = self.previewImage else {
return
}
// create a texture for the CI image to render to
let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(
pixelFormat: .bgra8Unorm,
width: Int(previewImage.extent.width),
height: Int(previewImage.extent.height),
mipmapped: false)
textureDescriptor.usage = [.shaderWrite, .shaderRead]
let texture = self.metalDevice!.makeTexture(descriptor: textureDescriptor)!
if texture.width != textureWidth ||
texture.height != textureHeight ||
self.metalBufferView!.bounds != internalBounds {
setupTransform(width: texture.width, height: texture.height, mirroring: mirroring, rotation: rotation)
}
// Set up command buffer and encoder
guard let commandQueue = self.metalCommandQueue else {
print("Failed to create Metal command queue")
return
}
guard let commandBuffer = commandQueue.makeCommandBuffer() else {
print("Failed to create Metal command buffer")
return
}
// add rendering of the image to the command buffer
ciContext.render(previewImage,
to: texture,
commandBuffer: commandBuffer,
bounds: previewImage.extent,
colorSpace: self.colorSpace)
guard let commandEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: currentRenderPassDescriptor) else {
print("Failed to create Metal command encoder")
return
}
// add vertex and fragment shaders to the command buffer
commandEncoder.label = "Preview display"
commandEncoder.setRenderPipelineState(renderPipelineState!)
commandEncoder.setVertexBuffer(vertexCoordBuffer, offset: 0, index: 0)
commandEncoder.setVertexBuffer(textCoordBuffer, offset: 0, index: 1)
commandEncoder.setFragmentTexture(texture, index: 0)
commandEncoder.setFragmentSamplerState(sampler, index: 0)
commandEncoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4)
commandEncoder.endEncoding()
commandBuffer.present(drawable) // Draw to the screen
commandBuffer.commit()
}
final func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
}
}
注释
- 使用
MTKViewDelegate
代替子类的原因是,当子类化时,在主线程上调用了draw调用.使用上面显示的委托方法,似乎每个循环都调用了与金属相关的不同线程.上面的方法似乎可以提供更好的性能. - 有关上述更新方法中
CIFilter
使用情况的详细信息,必须予以删除.所有这些都是堆叠的CIFilters
的重链.不幸的是,这些过滤器没有任何调整的余地.
- The reason
MTKViewDelegate
is used instead of subclassingMTKView
is that when it was subclassed, the draw call was called on the main thread. With the delegate method shown above, it seems to be a different metal related thread call each loop. Above method seem to give much better performance. - Details on
CIFilter
usage on update method above had to be redacted. All it is a heavy chain ofCIFilters
stacked. Unfortunately there is no room for any tweaks with these filters.
上面的代码似乎会大大减慢主线程的速度,从而导致应用程序UI的其余部分变得不稳定.例如,滚动UIScrollview
变得缓慢而断断续续.
Above code seems to slow down the main thread a lot, causing rest of the app UI to be choppy. For example, scrolling a UIScrollview
gets seem to be slow and choppy.
调整Metal视图可简化CPU的工作,并在主线程上轻松进行操作,以为其余的UI留出足够的汁液.
Tweak Metal view to ease up on CPU and go easy on the main thread to leave enough juice for rest of the UI.
根据以上图形,命令缓冲区的准备工作全部在CPU中完成,直到提出并提交(?)为止.有没有办法从CPU上卸载它?
According to the above graphics, preparation of command buffer is all done in CPU until presented and committed(?). Is there a way to offload that from CPU?
感谢任何可以提高绘图效率的提示,反馈,技巧等.
Any hints, feedback, tips, etc to improve the drawing efficiency would be appreciated.
推荐答案
您可以采取一些措施来提高性能:
There are a few things you can do to improve the performance:
- 直接渲染到视图的可绘制对象中,而不是渲染到纹理中,然后再次渲染 以将该纹理渲染到视图中.
- 使用新颖的
CIRenderDestination
API将实际的纹理检索延迟到实际渲染视图的那一刻(即,完成Core Image后).
- Render into the view’s drawable directly instead of rendering into a texture and then rendering again to render that texture into the view.
- Use the newish
CIRenderDestination
API to defer the actual texture retrieval to the moment the view is actually rendered to (i.e. when Core Image is done).
这是我在Core Image项目中使用的draw(in view: MTKView)
,已针对您的情况进行了修改:
Here’s the draw(in view: MTKView)
I’m using in my Core Image project, modified for your case:
public func draw(in view: MTKView) {
if let currentDrawable = view.currentDrawable,
let commandBuffer = self.commandQueue.makeCommandBuffer() {
let drawableSize = view.drawableSize
// optional: scale the image to fit the view
let scaleX = drawableSize.width / image.extent.width
let scaleY = drawableSize.height / image.extent.height
let scale = min(scaleX, scaleY)
let scaledImage = previewImage.transformed(by: CGAffineTransform(scaleX: scale, y: scale))
// optional: center in the view
let originX = max(drawableSize.width - scaledImage.extent.size.width, 0) / 2
let originY = max(drawableSize.height - scaledImage.extent.size.height, 0) / 2
let centeredImage = scaledImage.transformed(by: CGAffineTransform(translationX: originX, y: originY))
// create a render destination that allows to lazily fetch the target texture
// which allows the encoder to process all CI commands _before_ the texture is actually available;
// this gives a nice speed boost because the CPU doesn’t need to wait for the GPU to finish
// before starting to encode the next frame
let destination = CIRenderDestination(width: Int(drawableSize.width),
height: Int(drawableSize.height),
pixelFormat: view.colorPixelFormat,
commandBuffer: commandBuffer,
mtlTextureProvider: { () -> MTLTexture in
return currentDrawable.texture
})
let task = try! self.context.startTask(toRender: centeredImage, to: destination)
// bonus: you can Quick Look the task to see what’s actually scheduled for the GPU
commandBuffer.present(currentDrawable)
commandBuffer.commit()
// optional: you can wait for the task execution and Quick Look the info object to get insights and metrics
DispatchQueue.global(qos: .background).async {
let info = try! task.waitUntilCompleted()
}
}
}
如果这仍然太慢,则可以在创建CIContext
时尝试设置priorityRequestLow
CIContextOption
,以告知Core Image以低优先级进行渲染.
If this is still too slow, you can try setting the priorityRequestLow
CIContextOption
when creating your CIContext
to tell Core Image to render in low priority.
这篇关于MTKView绘图性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!