Combining CoreML and ARKit


Problem description


I am trying to combine CoreML and ARKit in my project, using the inceptionV3 model provided on Apple's website.

I am starting from the standard template for ARKit (Xcode 9 beta 3).

Instead of instantiating a new camera session, I reuse the session that has been started by the ARSCNView.

At the end of my viewDelegate, I write:

sceneView.session.delegate = self
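
For context, a minimal sketch of where that assignment could sit, assuming it goes at the end of viewDidLoad in the standard template (the sceneView outlet and ARSCNViewDelegate conformance come from the template; the ARSessionDelegate conformance is supplied by the extension shown below):

import UIKit
import ARKit

class ViewController: UIViewController, ARSCNViewDelegate {
    @IBOutlet var sceneView: ARSCNView!

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.delegate = self          // the template's ARSCNViewDelegate hookup
        // Reuse the session ARSCNView already runs instead of creating an AVCaptureSession:
        sceneView.session.delegate = self  // ARSessionDelegate, for per-frame callbacks
    }
}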

I then extend my viewController to conform to the ARSessionDelegate protocol (an optional protocol):

// MARK: ARSessionDelegate
extension ViewController: ARSessionDelegate {

    func session(_ session: ARSession, didUpdate frame: ARFrame) {

        do {
            // frame.capturedImage is the camera image as a CVPixelBuffer
            let prediction = try self.model.prediction(image: frame.capturedImage)
            // Update the UI on the main queue
            DispatchQueue.main.async {
                if let prob = prediction.classLabelProbs[prediction.classLabel] {
                    self.textLabel.text = "\(prediction.classLabel) \(String(describing: prob))"
                }
            }
        }
        catch let error as NSError {
            print("Unexpected error ocurred: \(error.localizedDescription).")
        }
    }
}

At first I tried that code, but then noticed that Inception requires a pixel buffer of type Image<RGB, 299, 299>.
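
For reference, the expected input can also be confirmed from code — a small sketch, assuming the class Apple's download generates is named Inceptionv3:

import CoreML

// Hypothetical check; the class name Inceptionv3 is an assumption from Apple's sample model.
let mlModel = Inceptionv3().model
for (name, description) in mlModel.modelDescription.inputDescriptionsByName {
    print(name, description) // prints something like: image Image (Color, 299 x 299)
}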

Although not recommended, I thought I would just resize my frame and then try to get a prediction out of it. I am resizing using this function (taken from https://github.com/yulingtianxia/Core-ML-Sample):

func resize(pixelBuffer: CVPixelBuffer) -> CVPixelBuffer? {
    let imageSide = 299
    var ciImage = CIImage(cvPixelBuffer: pixelBuffer, options: nil)
    // Squash the full frame to 299 × 299 (aspect ratio is not preserved), then crop to that square
    let transform = CGAffineTransform(scaleX: CGFloat(imageSide) / CGFloat(CVPixelBufferGetWidth(pixelBuffer)), y: CGFloat(imageSide) / CGFloat(CVPixelBufferGetHeight(pixelBuffer)))
    ciImage = ciImage.transformed(by: transform).cropped(to: CGRect(x: 0, y: 0, width: imageSide, height: imageSide))
    // Creating a CIContext per frame is expensive; ideally create it once and reuse it
    let ciContext = CIContext()
    var resizeBuffer: CVPixelBuffer?
    // Note: the new buffer keeps the same pixel format as the source buffer
    CVPixelBufferCreate(kCFAllocatorDefault, imageSide, imageSide, CVPixelBufferGetPixelFormatType(pixelBuffer), nil, &resizeBuffer)
    ciContext.render(ciImage, to: resizeBuffer!)
    return resizeBuffer
}

Unfortunately, this is not enough to make it work. This is the error that is caught:

Unexpected error ocurred: Input image feature image does not match model description.
2017-07-20 AR+MLPhotoDuplicatePrediction[928:298214] [core] 
    Error Domain=com.apple.CoreML Code=1 
    "Input image feature image does not match model description" 
    UserInfo={NSLocalizedDescription=Input image feature image does not match model description, 
    NSUnderlyingError=0x1c4a49fc0 {Error Domain=com.apple.CoreML Code=1 
    "Image is not expected type 32-BGRA or 32-ARGB, instead is Unsupported (875704422)" 
    UserInfo={NSLocalizedDescription=Image is not expected type 32-BGRA or 32-ARGB, instead is Unsupported (875704422)}}}

Not sure what I can do from here.

If there is any better suggestion for combining the two, I'm all ears.
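
For what it's worth, the unsupported type code in that error is just a FourCC, and a small helper (hypothetical, not part of the project) can decode it: 875704422 turns out to be '420f', i.e. kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, the bi-planar YCbCr format ARKit delivers in capturedImage — and the resize function above keeps that format, which the Core ML image input does not accept.

import CoreVideo

// Hypothetical helper: turn an OSType pixel-format code into its four-character string.
func fourCharCode(_ type: OSType) -> String {
    let bytes: [CChar] = [
        CChar(truncatingIfNeeded: type >> 24),
        CChar(truncatingIfNeeded: type >> 16),
        CChar(truncatingIfNeeded: type >> 8),
        CChar(truncatingIfNeeded: type),
        0
    ]
    return String(cString: bytes)
}

print(fourCharCode(875704422)) // "420f" -> kCVPixelFormatType_420YpCbCr8BiPlanarFullRange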

Edit: I also tried the resizePixelBuffer method from YOLO-CoreML-MPSNNGraph, suggested by @dfd; the error is exactly the same.

Edit2: So I changed the pixel format to kCVPixelFormatType_32BGRA (not the same format as the pixelBuffer passed into resizePixelBuffer).

let pixelFormat = kCVPixelFormatType_32BGRA // line 48

I do not have the error anymore. But as soon as I try to make a prediction, the AVCaptureSession stops. It seems I am running into the same issue Enric_SA describes on the Apple developer forums.

Edit3: So I tried implementing rickster's solution. It works well with inceptionV3. I wanted to try a feature observation (VNClassificationObservation). At the moment it is not working with TinyYOLO; the bounding boxes are wrong. Trying to figure it out.
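
An aside on Edit3, offered only as a guess rather than a confirmed fix: with detectors like TinyYOLO, wrong boxes are often a coordinate-space issue, because VNCoreMLRequest crops and scales the input before the model sees it, and the returned normalized coordinates refer to that cropped region. The behaviour is controlled by imageCropAndScaleOption, for example:

// yoloModel and handleDetections are hypothetical placeholders.
let yoloRequest = VNCoreMLRequest(model: yoloModel, completionHandler: handleDetections)
// .centerCrop (which I believe is the default) drops the edges of the wide camera frame;
// .scaleFill keeps the whole frame but distorts it — either way, map the boxes back accordingly.
yoloRequest.imageCropAndScaleOption = .scaleFill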

Solution

Don't process images yourself to feed them to Core ML. Use Vision. (No, not that one. This one.) Vision takes an ML model and any of several image types (including CVPixelBuffer), automatically gets the image to the right size, aspect ratio, and pixel format for the model to evaluate, then gives you the model's results.

Here's a rough skeleton of the code you'd need:

import Vision

var request: VNRequest!

func setup() {
    // VNCoreMLModel(for:) throws, so handle (or force-unwrap) the error in this one-time setup
    let model = try! VNCoreMLModel(for: MyCoreMLGeneratedModelClass().model)
    request = VNCoreMLRequest(model: model, completionHandler: myResultsMethod)
}

func classifyARFrame() {
    guard let frame = session.currentFrame else { return } // currentFrame is optional
    let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage,
        orientation: .up) // fix based on your UI orientation
    try? handler.perform([request])
}

func myResultsMethod(request: VNRequest, error: Error?) {
    guard let results = request.results as? [VNClassificationObservation]
        else { fatalError("huh") }
    for classification in results {
        print(classification.identifier, // the scene label
              classification.confidence)
    }
}
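
To drive this from ARKit, one option — a sketch under the assumption that request and the methods above live in the same view controller as the ARSessionDelegate extension from the question — is to perform the request for each frame off the main thread (or, better, throttle it, since session(_:didUpdate:) fires at camera frame rate):

func session(_ session: ARSession, didUpdate frame: ARFrame) {
    let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage,
                                        orientation: .up)
    // Keep Vision/Core ML work off the main thread; results arrive in myResultsMethod.
    DispatchQueue.global(qos: .userInitiated).async {
        try? handler.perform([self.request])
    }
}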

See this answer to another question for some more pointers.
