Does ARKit 2.0 consider Lens Distortion in iPhone and iPad?


Problem description



ARKit 2.0 updates many intrinsic (and extrinsic) parameters of the ARCamera from frame to frame. I'd like to know whether it also takes radial lens distortion into consideration (as the AVCameraCalibrationData class does, which ARKit doesn't use), and appropriately corrects the video frames' distortion (distort/undistort operations) for the rear iPhone and iPad cameras.

var intrinsics: simd_float3x3 { get }

As we all know, radial lens distortion greatly affects 6-DOF pose estimation accuracy when we place undistorted 3D objects in a real-world scene that is distorted by the lens.

var lensDistortionLookupTable: Data? { get } 

// A map of floating-point values describing radial 
// lens distortions in AVCameraCalibrationData class

If the lens distortion math in ARKit 2.0 is available through an API, where can I find it?

Solution

Although it's not explicitly stated, I'm certain that ARKit is correcting for non-linear lens distortion. Lens distortion (and inverse distortion) lookup tables exist in iOS 11 and are available via AVCameraCalibrationData, but they are not exposed by ARKit, presumably because there is no need for them since you're already working with rectified coordinates.
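For reference, here is a minimal sketch of how that lookup table can be read once you have an AVCameraCalibrationData instance from AVFoundation (for example from an AVCapturePhoto captured with calibration-data delivery enabled); ARKit itself never hands you this object:

import AVFoundation

// Minimal sketch: unpack the radial distortion lookup table from an
// AVCameraCalibrationData instance (obtained via AVFoundation, not ARKit).
// Each entry is a Float32 magnification factor sampled along the radius
// from lensDistortionCenter out to the farthest image corner.
func distortionCurve(from calibration: AVCameraCalibrationData) -> [Float] {
    guard let table = calibration.lensDistortionLookupTable else { return [] }
    return table.withUnsafeBytes { raw in
        Array(raw.bindMemory(to: Float32.self))
    }
}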

Whether or not the distortion model parameters are the same for each device model (i.e. exactly the same values for every iPhone 7) is an interesting question. I don't have access to multiple phones of the same model, but this shouldn't be hard to figure out for someone who does.

source

As an example from https://github.com/eugenebokhan/ARKit-Multiplayer:

QR marker detection

With the help of Apple's Vision it's now possible to recognize a QR marker in the camera's video feed and track it while it is in the field of view. The framework provides us with the coordinates of the QR marker's square corners in the screen's coordinate system.
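A hedged sketch of that detection step, running Vision's barcode detector on an ARFrame's captured image (assuming a recent iOS SDK; the function name detectQRCorners is illustrative, not taken from the linked project):

import ARKit
import Vision

// Sketch: detect QR codes in the current ARFrame and return the four corner
// points of the first observation, converted from Vision's normalized
// coordinates (origin at the lower left) into pixel coordinates.
func detectQRCorners(in frame: ARFrame) -> [CGPoint] {
    let request = VNDetectBarcodesRequest()
    request.symbologies = [.qr]   // only look for QR codes

    let pixelBuffer = frame.capturedImage
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])

    guard let qr = request.results?.first else { return [] }

    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    return [qr.topLeft, qr.topRight, qr.bottomRight, qr.bottomLeft].map {
        VNImagePointForNormalizedPoint($0, width, height)
    }
}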

QR marker pose estimation

The next thing you probably want to do after detecting the QR markers is to obtain the camera pose from them.

To perform QR marker pose estimation you need to know the calibration parameters of your camera: the camera matrix and the distortion coefficients. Each camera lens has unique parameters, such as focal length, principal point, and lens distortion model. The process of finding the intrinsic camera parameters is called camera calibration. The camera calibration process is important for Augmented Reality applications because it describes the perspective transformation and lens distortion of the output image. To achieve the best user experience with Augmented Reality, augmented objects should be visualized using the same perspective projection.

In the end, what you get after calibration is the camera matrix, a 3x3 matrix with the focal lengths and the camera center coordinates (a.k.a. the intrinsic parameters), and the distortion coefficients, a vector of 5 or more elements that models the distortion produced by your camera. The calibration parameters are pretty much the same for most iDevices.
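On ARKit's rectified frames you don't need to run a calibration yourself; the per-frame pinhole parameters can be read directly. A small sketch (recall that simd matrices are indexed column-first):

import ARKit

// Sketch: pull the pinhole camera parameters ARKit reports for a frame.
// simd_float3x3 is indexed [column][row], so the principal point lives
// in the third column.
func pinholeParameters(of frame: ARFrame) -> (fx: Float, fy: Float, cx: Float, cy: Float) {
    let K = frame.camera.intrinsics
    return (fx: K[0][0], fy: K[1][1], cx: K[2][0], cy: K[2][1])
}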

With the precise locations of the marker corners, we can estimate a transformation between our camera and a marker in 3D space. This operation is known as pose estimation from 2D-3D correspondences. The pose estimation process finds a Euclidean transformation (one consisting only of rotation and translation components) between the camera and the object.

C denotes the camera center. The P1-P4 points are 3D points in the world coordinate system and the p1-p4 points are their projections on the camera's image plane. Our goal is to find the relative transformation between a known marker position in the 3D world (P1-P4) and the camera C, using the intrinsic matrix and the known point projections on the image plane (p1-p4).
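As a sketch of what those correspondences look like in code, the 3D corners P1-P4 can be defined in the marker's own coordinate system for an assumed marker side length (the 10 cm default here is purely illustrative); together with the detected pixel corners p1-p4 and the intrinsic matrix, they are the inputs a PnP solver such as cv::solvePnP (called through an Objective-C++ bridge) needs:

import simd

// Sketch: the marker's corners P1-P4 in its own coordinate system, with the
// marker centered at the origin and lying in the z = 0 plane. The corner
// order matches the p1-p4 pixel corners returned by the detection step.
func markerObjectPoints(sideLength s: Float = 0.10) -> [SIMD3<Float>] {
    let h = s / 2
    return [
        SIMD3(-h,  h, 0),   // P1: top-left
        SIMD3( h,  h, 0),   // P2: top-right
        SIMD3( h, -h, 0),   // P3: bottom-right
        SIMD3(-h, -h, 0),   // P4: bottom-left
    ]
}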

OpenCV functions are used to calculate the QR marker transformation in such a way that it minimizes the reprojection error, that is, the sum of squared distances between the observed image points (imagePoints) and the projected object points (objectPoints). The estimated transformation is defined by a rotation (rvec) and a translation (tvec) component. This is also known as a Euclidean or rigid transformation. In the end we get a rotation quaternion and a translation vector for the QR marker.
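Since rvec is an axis-angle vector, converting it to the quaternion mentioned above is a one-liner with simd; a hedged sketch:

import simd

// Sketch: OpenCV's solvePnP returns the rotation as an axis-angle vector
// (rvec). Its length is the rotation angle and its direction is the axis,
// which maps directly onto a simd quaternion.
func quaternion(fromRotationVector rvec: SIMD3<Float>) -> simd_quatf {
    let angle = simd_length(rvec)
    guard angle > 1e-6 else { return simd_quatf(angle: 0, axis: SIMD3(0, 0, 1)) }
    return simd_quatf(angle: angle, axis: rvec / angle)
}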

Integration into Apple's ARKit

The final part is the integration of all the information about the QR marker's pose into the 3D scene created by ARKit. ARKit uses Visual Inertial Odometry (VIO) to accurately track the world around it. VIO fuses camera sensor data with CoreMotion data. These two inputs allow the device to sense how it moves within a room with a high degree of accuracy, and without any additional calibration. All the rendering is based on Apple's Metal, with Apple's SceneKit on top of it.

In order to render SceneKit's node on our QR marker in a proper way, we need to create a model matrix for the QR marker from the quaternion and translation vector we got from OpenCV. The next step is to multiply the QR marker's model matrix by the SceneKit scene's virtual camera transform matrix. As a result we can see a custom node (the Axes node in our project) that repeats all the QR marker's movements in the real world while it's in the field of view of the iPhone's camera; when it is not, it stays at the last updated position so we can still examine it.
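A sketch of that composition step with simd and SceneKit, assuming the node is attached to the ARSCNView scene's root node and that the OpenCV-to-ARKit axis convention has already been handled (see the comment):

import ARKit
import SceneKit

// Sketch: build the marker's model matrix from the recovered quaternion and
// translation (expressed in the camera's coordinate system), then bring it
// into ARKit's world space with the frame's camera transform. Note that
// OpenCV's camera convention (+z forward, +y down) differs from
// ARKit/SceneKit (-z forward, +y up), so in practice a fixed axis flip is
// applied to the OpenCV pose before this step.
func place(_ axesNode: SCNNode,
           markerRotation q: simd_quatf,
           markerTranslation t: SIMD3<Float>,
           in frame: ARFrame) {
    var markerInCamera = simd_float4x4(q)      // rotation part
    markerInCamera.columns.3 = SIMD4(t, 1)     // translation part
    axesNode.simdTransform = frame.camera.transform * markerInCamera
}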
