Kinect - Map (x, y) pixel coordinates to "real world" coordinates using depth


Problem Description

I'm working on a project that uses the Kinect and OpenCV to export fingertip coordinates to Flash for use in games and other programs. Currently, our setup works based on color and exports fingertip points to Flash in (x, y, z) format, where x and y are in pixels and z is in millimeters.

But we want to map those (x, y) coordinates to "real world" values, like millimeters, using that z depth value from within Flash.

As I understand it, the Kinect's 3D depth is obtained by projecting the X-axis along the camera's horizontal, its Y-axis along the camera's vertical, and its Z-axis directly forward out of the camera's lens. Depth values are then the length of the perpendicular drawn from any given object to the XY-plane. See the picture in the link below (obtained from Microsoft's website).

Microsoft depth coordinate system example

Also, we know that the Kinect's horizontal field of vision is projected at a 117 degree angle.

Using this information, I figured I could project the depth value of any given point onto the x=0, y=0 line and draw a horizontal line parallel to the XY-plane at that point, intersecting the camera's field of vision. I end up with a triangle, split in half, whose height is the depth of the object in question. I can then solve for the width of the field of view using a little trigonometry. My equation is:

W = tan(theta / 2) * h * 2

Where:

  • W = width of the field of view
  • theta = horizontal field of view angle (117 degrees)
  • h = depth value

(Sorry, I can't post a picture, I would if I could)

Now, solving for a depth value of 1000 mm (1 meter) gives a value of about 3264 mm.
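
As a sanity check, here is a minimal Python sketch of that calculation (the 117 degree figure is the one assumed in the question, not a verified Kinect specification):

```python
import math

# W = tan(theta / 2) * h * 2 -- width of the field of view at depth h
def fov_width_mm(theta_deg, depth_mm):
    return math.tan(math.radians(theta_deg / 2)) * depth_mm * 2

# With theta = 117 degrees and h = 1000 mm this prints ~3263.8 mm,
# matching the ~3264 mm figure above.
print(fov_width_mm(117, 1000))
```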

However, when actually LOOKING at the camera image produced, I get a different value. Namely, I placed a meter stick 1 meter away from the camera and noticed that the width of the frame was at most 1.6 meters, not the 3.264 meters estimated from the calculations.

Is there something I'm missing here? Any help would be appreciated.

Answer

The depth stream is correct. You should indeed take the depth value, and from the Kinect sensor you can easily locate the point in the real world relative to the Kinect. This is done with simple trigonometry; however, you must keep in mind that the depth value is the distance from the Kinect "eye" to the point measured, so it is the diagonal of a cuboid.

Actually, follow this link: How to get real world coordinates (x, y, z) from a distinct object using a Kinect

It's no use rewriting it here; there you have the right answer.
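
For illustration, below is a minimal Python sketch of that trigonometry under a pinhole camera model. The field-of-view figures (roughly 57 by 43 degrees) and the 640x480 resolution are commonly cited values for the Kinect v1 depth camera, not taken from the question, and the code assumes the depth value is the perpendicular Z distance as in the Microsoft diagram above; if the stream instead reports the straight-line distance from the sensor "eye", as the answer notes, it would have to be converted to Z first.

```python
import math

# Assumed intrinsics: commonly cited Kinect v1 depth-camera values,
# not measured from the asker's setup.
H_FOV_DEG = 57.0          # horizontal field of view (assumption)
V_FOV_DEG = 43.0          # vertical field of view (assumption)
WIDTH, HEIGHT = 640, 480  # depth image resolution (assumption)

# Focal lengths in pixels, derived from the field of view.
FX = (WIDTH / 2.0) / math.tan(math.radians(H_FOV_DEG / 2.0))
FY = (HEIGHT / 2.0) / math.tan(math.radians(V_FOV_DEG / 2.0))

def pixel_to_world_mm(px, py, z_mm):
    """Back-project pixel (px, py) with perpendicular depth z_mm
    to (X, Y, Z) in millimeters relative to the camera center."""
    x_mm = (px - WIDTH / 2.0) * z_mm / FX
    y_mm = (py - HEIGHT / 2.0) * z_mm / FY
    return x_mm, y_mm, z_mm

# Example: a fingertip detected at pixel (400, 300) with depth 1000 mm.
print(pixel_to_world_mm(400, 300, 1000.0))
```

This is only a geometric approximation; where available, the Kinect SDK's own coordinate-mapping functions will give more accurate results than hand-derived intrinsics.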

