Does ARKit consider Lens Distortion in iPhone and iPad?


Problem Description


ARKit updates many intrinsic (and extrinsic) parameters of the ARCamera from frame to frame. I'd like to know whether it also takes Radial Lens Distortion into consideration (as in the AVCameraCalibrationData class, which ARKit doesn't use) and appropriately corrects the video frames' distortion (distort/undistort operations) for the rear iPhone and iPad cameras.

var intrinsics: simd_float3x3 { get }

As we all know, Radial Lens Distortion greatly affects 6 DOF pose estimation accuracy when we place undistorted 3D objects into a real-world scene that has been distorted by the lens.

var lensDistortionLookupTable: Data? { get } 

/* A map of floating-point values describing radial */
/* lens distortions in AVCameraCalibrationData class */

If the math behind ARKit's lens distortion handling is available in the API, where can I find it?

Solution

Although it's not explicitly stated, I'm certain that ARKit is correcting for non-linear lens distortion. Lens distortion (and inverse distortion) lookup tables exist in iOS 11 and are available via AVCameraCalibrationData, but they are not exposed by ARKit, presumably because there is no need for them since you're already working with rectified coordinates.
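To make that relationship concrete, here is a minimal sketch contrasting what ARKit exposes per frame with where the lookup table from the question actually lives. The class name CalibrationInspector is illustrative, and obtaining an AVCameraCalibrationData instance (for example from depth data delivered by an AVCapturePhoto capture) is not shown here.

import ARKit
import AVFoundation

final class CalibrationInspector: NSObject, ARSessionDelegate {

    // ARKit side: a rectified pinhole model, no distortion table exposed.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        let k: simd_float3x3 = frame.camera.intrinsics
        let fx = k.columns.0.x   // focal length x (pixels)
        let fy = k.columns.1.y   // focal length y (pixels)
        let cx = k.columns.2.x   // principal point x (pixels)
        let cy = k.columns.2.y   // principal point y (pixels)
        print("ARKit intrinsics: fx=\(fx) fy=\(fy) cx=\(cx) cy=\(cy)")
    }

    // AVFoundation side: the lookup table the question asks about.
    func inspect(_ calibration: AVCameraCalibrationData) {
        guard let table = calibration.lensDistortionLookupTable else { return }
        // Radial distortion magnitudes as Float values, sampled from the
        // distortion center outwards.
        let samples = table.withUnsafeBytes { Array($0.bindMemory(to: Float.self)) }
        print("Distortion LUT: \(samples.count) samples, center = \(calibration.lensDistortionCenter)")
    }
}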

Whether or not the distortion model parameters are the same for each device model (i.e. exactly the same values for every iPhone 7) is an interesting question. I don't have access to multiple phones of the same model, but this shouldn't be hard to figure out for someone who does.


As an example, take the QR marker detection approach from https://github.com/verebes1/ARKit-Multiplayer.

QR marker detection

With the help of Apple's Vision framework it is now possible to recognize a QR marker in the camera's video feed and track it while it is in the field of view. The framework provides us with the coordinates of the QR marker's square corners in the screen's coordinate system.
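A minimal sketch of that detection step, assuming the current frame is available as a CVPixelBuffer (for example ARFrame.capturedImage); the function name and completion closure are illustrative.

import Vision
import CoreGraphics

func detectQRMarker(in pixelBuffer: CVPixelBuffer,
                    completion: @escaping ([CGPoint]) -> Void) {
    let request = VNDetectBarcodesRequest { request, error in
        guard error == nil,
              let qr = request.results?
                  .compactMap({ $0 as? VNBarcodeObservation })
                  .first(where: { $0.symbology == .qr })   // .qr spelling in recent SDKs
        else { return }
        // Corners arrive in Vision's normalized coordinate space
        // (origin at the bottom-left); convert to screen/pixel space as needed.
        completion([qr.topLeft, qr.topRight, qr.bottomRight, qr.bottomLeft])
    }
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}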

QR marker pose estimation

The next thing you probably want to do after detecting the QR markers is to obtain the camera pose from them.

To perform QR marker pose estimation you need to know the calibration parameters of your camera: the camera matrix and the distortion coefficients. Each camera lens has unique parameters, such as focal length, principal point, and lens distortion model. The process of finding the intrinsic camera parameters is called camera calibration. The camera calibration process is important for Augmented Reality applications because it describes the perspective transformation and lens distortion of the output image. To achieve the best user experience with Augmented Reality, the augmented object should be visualized using the same perspective projection.

In the end, what you get after calibration is the camera matrix (a 3x3 matrix containing the focal lengths and the camera center coordinates, a.k.a. the intrinsic parameters) and the distortion coefficients (a vector of 5 or more elements that models the distortion produced by your camera). The calibration parameters are pretty much the same for most iDevices.
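For illustration, a minimal sketch of those two calibration outputs in Swift, using ARKit's per-frame intrinsics as the camera matrix; the zero distortion coefficients are an assumption that reflects the rectified frames ARKit delivers, not measured values.

import ARKit
import simd

struct CameraCalibration {
    let cameraMatrix: simd_float3x3      // K: focal lengths and principal point
    let distortionCoefficients: [Float]  // OpenCV order: k1, k2, p1, p2, k3
}

func makeCalibration(from camera: ARCamera) -> CameraCalibration {
    // ARKit already supplies K, in pixels, for the captured image resolution:
    //   | fx  0  cx |
    //   |  0  fy cy |
    //   |  0   0  1 |
    let k = camera.intrinsics
    // Rectified frames -> zero distortion is a reasonable working assumption;
    // a real calibration run would produce non-zero coefficients.
    return CameraCalibration(cameraMatrix: k,
                             distortionCoefficients: [0, 0, 0, 0, 0])
}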

With the precise locations of the marker corners, we can estimate a transformation between our camera and the marker in 3D space. This operation is known as pose estimation from 2D-3D correspondences. The pose estimation process finds a Euclidean transformation (consisting only of rotation and translation components) between the camera and the object.

C denotes the camera center. The points P1-P4 are 3D points in the world coordinate system, and p1-p4 are their projections onto the camera's image plane. Our goal is to find the relative transformation between the known marker position in the 3D world (P1-P4) and the camera C, using the intrinsic matrix and the known point projections on the image plane (p1-p4).
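As a sketch of what those correspondences look like in code (the marker side length is an assumed value, and the OpenCV call itself would be reached through a hand-written Objective-C++ wrapper that is not shown here):

import simd
import CoreGraphics

let markerSide: Float = 0.05   // physical size of the printed QR marker in metres (assumed)

// P1-P4: the marker corners in the marker's own coordinate system (z = 0 plane).
let objectPoints: [simd_float3] = [
    simd_float3(-markerSide / 2,  markerSide / 2, 0),   // P1, top-left
    simd_float3( markerSide / 2,  markerSide / 2, 0),   // P2, top-right
    simd_float3( markerSide / 2, -markerSide / 2, 0),   // P3, bottom-right
    simd_float3(-markerSide / 2, -markerSide / 2, 0),   // P4, bottom-left
]

// p1-p4: the matching corner pixels reported by Vision, converted to image coordinates.
// These two arrays, together with the camera matrix and distortion coefficients,
// are exactly the inputs that OpenCV's solvePnP expects.
var imagePoints: [CGPoint] = []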

OpenCV functions are used to calculate the QR marker transformation in a way that minimizes the reprojection error, that is, the sum of squared distances between the observed image points and the projected object points. The estimated transformation is defined by a rotation (rvec) and a translation (tvec) component. This is also known as a Euclidean or rigid transformation. In the end we get a rotation quaternion and a translation vector for the QR marker.
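Assuming the rvec/tvec pair has been bridged back into Swift as simd_float3 values, a minimal sketch of turning them into a 4x4 model matrix looks like this. Note that OpenCV's camera looks down +Z while SceneKit's looks down -Z, so in practice an additional axis flip is usually applied.

import simd

func markerModelMatrix(rvec: simd_float3, tvec: simd_float3) -> simd_float4x4 {
    // Rodrigues vector: direction = rotation axis, length = rotation angle.
    let angle = simd_length(rvec)
    let rotation = angle < 1e-6
        ? simd_quatf(angle: 0, axis: simd_float3(0, 0, 1))
        : simd_quatf(angle: angle, axis: rvec / angle)
    var m = simd_float4x4(rotation)                        // upper-left 3x3 = R
    m.columns.3 = simd_float4(tvec.x, tvec.y, tvec.z, 1)   // last column   = t
    return m
}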

Integration into Apple's ARKit

The final part is the integration of all the information about the QR marker's pose into the 3D scene created by ARKit. ARKit uses Visual Inertial Odometry (VIO) to accurately track the world around it. VIO fuses camera sensor data with CoreMotion data. These two inputs allow the device to sense how it moves within a room with a high degree of accuracy, without any additional calibration. All the rendering is based on Apple's Metal, with Apple's SceneKit on top of it.

In order to render a SceneKit node on our QR marker correctly, we need to create a model matrix for the QR marker from the quaternion and translation vector we got from OpenCV. The next step is to multiply the QR marker's model matrix by the transform matrix of the SceneKit scene's virtual camera. As a result, we can see a custom node (the Axes node in our project) that repeats all the QR marker's movements in the real world while it is in the field of view of the iPhone's camera; when it is not, it stays at the last updated position so we can still examine it.
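A minimal sketch of that final step, with illustrative names (axesNode, markerModelMatrix). Since the marker pose coming out of solvePnP is expressed relative to the camera, chaining it with the current camera transform places the node in world space.

import ARKit
import SceneKit

func updateAxesNode(_ axesNode: SCNNode,
                    markerModelMatrix: simd_float4x4,
                    in sceneView: ARSCNView) {
    guard let cameraNode = sceneView.pointOfView else { return }
    // world_from_marker = world_from_camera * camera_from_marker
    axesNode.simdWorldTransform = cameraNode.simdWorldTransform * markerModelMatrix
}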
