Google ARCore Domain Model by Example


Problem description

I'm trying to read and make sense of Google ARCore's domain model, particularly the Android SDK packages. Currently this SDK is in "preview" mode and so there are no tutorials, blogs, articles, etc. available on understanding how to use this API. Even Google itself suggests just reading the source code, source code comments and Javadocs to understand how to use the API. Problem is: if you're not already a computer vision expert, the domain model will feel a little alien & unfamiliar to you.

Specifically, I'm interested in understanding the fundamental differences between, and the proper usage of, the following classes: Frame, Anchor, Pose, and PointCloud.

According to Anchor's javadoc:

"Describes a fixed location and orientation in the real world. To stay at a fixed location in physical space, the numerical description of this position will update as ARCore's understanding of the space improves. Use getPose() to get the current numerical location of this anchor. This location may change any time update() is called, but will never spontaneously change."

So Anchors have a Pose. Sounds like you "drop an Anchor" onto something that's visible in the camera, and then ARCore tracks that Anchor and constantly updates its Pose to reflect the nature of its onscreen coordinates maybe?
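To make that "drop an Anchor" idea concrete, here is a minimal sketch of the usual tap-to-place flow, assuming the preview-era method names discussed in this thread (a Frame.hitTest(x, y) overload, HitResult.getHitPose(), and Session.addAnchor()); treat the exact signatures as illustrative rather than authoritative.

```java
// Hypothetical tap handler: ray-cast the camera image at the tap location and
// pin an Anchor where the ray hits detected geometry. Method names follow the
// preview SDK; later ARCore releases renamed some of these calls.
import com.google.ar.core.Anchor;
import com.google.ar.core.Frame;
import com.google.ar.core.HitResult;
import com.google.ar.core.Session;

import java.util.List;

public class TapToAnchor {
    public Anchor dropAnchor(Session session, Frame frame, float tapX, float tapY) {
        List<HitResult> hits = frame.hitTest(tapX, tapY);   // ray-cast from the screen tap
        if (hits.isEmpty()) {
            return null;                                     // nothing trackable under the tap
        }
        // Pin an anchor at the first hit; ARCore keeps its Pose up to date
        // as its understanding of the space improves.
        return session.addAnchor(hits.get(0).getHitPose());
    }
}
```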

And from Pose's javadoc:

"Represents an immutable rigid transformation from one coordinate frame to another. As provided from all ARCore APIs, Poses always describe the transformation from object's local coordinate frame to the world coordinate frame (see below)...These changes mean that every frame should be considered to be in a completely unique world coordinate frame."

So it sounds like a Pose is something that is only unique to the "current frame" of the camera and that each time the frame is updated, all poses for all anchors are recalculated maybe? If not, then what's the relationship between an Anchor, its Pose, the current frame and the world coordinate frame? And what's a Pose really, anyways? Is a "Pose" just a way of storing matrix/point data so that you can convert an Anchor from the current frame to the world frame? Or something else?

Finally, I see a strong correlation between Frames, Poses and Anchors, but then there's PointCloud. The only class I can see inside com.google.ar.core that uses these is the Frame. PointClouds appear to be (x,y,z)-coordinates with a 4th property representing ARCore's "confidence" that the x/y/z components are actually correct. So if an Anchor has a Pose, I would have imagined that a Pose would also have a PointCloud representing the Anchor's coordinates & confidence in those coordinates. But Pose does not have a PointCloud, and so I must be completely misunderstanding the concepts that these two classes model.


The question

I've posed several different questions above, but they all boil down to a single, concise, answerable question:

What is the difference in the concepts behind Frame, Anchor, Pose and PointCloud and when do you use each of them (and for what purposes)?

Solution

A Pose is a structured transformation. It is a fixed numerical transformation from one coordinate system (typically object local) to another (typically world).
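As a minimal sketch of "Pose as a local-to-world rigid transform", assuming Pose exposes transformPoint() and toMatrix() as in the public ARCore Javadoc (availability in the preview SDK may differ):

```java
// Sketch: a Pose consumed purely as a transformation, either applied to a
// point directly or handed to a renderer as a 4x4 column-major model matrix.
import com.google.ar.core.Anchor;
import com.google.ar.core.Pose;

public class PoseAsTransform {
    // Take a point expressed in the anchor's local frame (e.g. 10 cm above it)
    // and express it in the current world frame.
    public static float[] localToWorld(Anchor anchor) {
        Pose anchorToWorld = anchor.getPose();          // local -> world, valid this frame only
        float[] pointInAnchorFrame = {0f, 0.10f, 0f};
        return anchorToWorld.transformPoint(pointInAnchorFrame);
    }

    // The same transform as a model matrix for rendering.
    public static float[] modelMatrix(Anchor anchor) {
        float[] m = new float[16];
        anchor.getPose().toMatrix(m, 0);
        return m;
    }
}
```

Note that the Pose is used immediately and thrown away, which matches the point below about world-relative poses only being valid for the frame that produced them.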

An Anchor represents a physically fixed location in the world. Its getPose() will update as the understanding of the world changes. For example, imagine you have a building with a hallway around the outside. If you walk all the way around that hallway, sensor drift results in you not winding up at the same coordinates you started at. However, ARCore can detect (using visual features) that it is in the same space it started in. When this happens, it distorts the world so that your current location and original location line up. As part of this distortion, the location of anchors will be adjusted as well so that they stay in the same physical place.

Because of this distortion, a Pose relative to the world should be considered valid only for the duration of the frame during which it was returned. As soon as you call update() the next time, the world may have reshaped and that pose could be useless. If you need to keep a location longer than a frame, create an Anchor. Just make sure to removeAnchors() anchors that you're no longer using, as there is an ongoing cost for each live anchor.
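A sketch of that storage rule, assuming the preview-era Session.addAnchor()/removeAnchors() calls referenced above (newer releases use Session.createAnchor() and Anchor.detach() instead):

```java
// Contrast the two storage choices: a cached world-space Pose silently goes
// stale after the next update(), while an Anchor stays pinned to the physical
// spot and can be released when no longer needed.
import com.google.ar.core.Anchor;
import com.google.ar.core.Pose;
import com.google.ar.core.Session;

import java.util.Collections;

public class KeepAnchorsNotPoses {
    private Anchor savedSpot;        // fine to keep across frames

    public void remember(Session session, Pose hitPoseThisFrame) {
        savedSpot = session.addAnchor(hitPoseThisFrame);   // pin it physically
        // Caching hitPoseThisFrame itself instead would drift once the world
        // coordinate frame is re-shaped by the next update().
    }

    public void forget(Session session) {
        if (savedSpot != null) {
            session.removeAnchors(Collections.singletonList(savedSpot)); // free ongoing tracking cost
            savedSpot = null;
        }
    }
}
```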

A Frame captures the current state at an instant and changes between two calls to update().
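A per-frame loop might then look like the following sketch, where Session.update() yielding a fresh Frame is the preview-era pattern mentioned above and the anchor's pose is re-read after every call rather than reused:

```java
// Each update() produces a new Frame snapshot; anything world-relative
// (such as an Anchor's pose) is re-fetched against that snapshot.
import com.google.ar.core.Anchor;
import com.google.ar.core.Frame;
import com.google.ar.core.Pose;
import com.google.ar.core.Session;

public class PerFrameLoop {
    public void onDrawFrame(Session session, Anchor anchor) {
        Frame frame = session.update();         // state is fixed for the life of this Frame
        Pose poseThisFrame = anchor.getPose();  // valid only until the next update()
        // ... render content at poseThisFrame, then discard it ...
    }
}
```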

PointClouds are sets of 3D visual feature points detected in the world. They are in their own local coordinate system, which can be accessed from Frame.getPointCloudPose(). Developers looking to have better spatial understanding than the plane detection provides can try using the point clouds to learn more about the structure of the 3D world.
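A hedged sketch of reading those points, assuming the preview-era accessors named in the answer (Frame.getPointCloud(), PointCloud.getPoints() as an x/y/z/confidence FloatBuffer, and Frame.getPointCloudPose()); later SDK versions changed both the method names and the coordinate frame of the points:

```java
// Read each feature point in the cloud's local frame, then re-express it in
// world coordinates using the cloud's own Pose.
import com.google.ar.core.Frame;
import com.google.ar.core.Pose;

import java.nio.FloatBuffer;

public class PointCloudReader {
    public void logWorldPoints(Frame frame) {
        FloatBuffer points = frame.getPointCloud().getPoints(); // x, y, z, confidence per point
        Pose cloudToWorld = frame.getPointCloudPose();           // the cloud's local coordinate system

        while (points.remaining() >= 4) {
            float[] local = {points.get(), points.get(), points.get()};
            float confidence = points.get();
            float[] world = cloudToWorld.transformPoint(local);  // cloud-local -> world
            System.out.printf("world point (%.2f, %.2f, %.2f) conf %.2f%n",
                    world[0], world[1], world[2], confidence);
        }
    }
}
```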

Does that help?
