Camera pixels to planar world points given 4 known points


Problem description


My problem, I assume, is easy, but I still haven't been able to solve it because my linear algebra experience is from a while ago. I have read presentations published by several universities, but I just can't seem to follow the somewhat non-standardized notation. If anyone has a better example, it would be much appreciated...


Problem: The camera is angled down facing the floor. Given a pixel coordinate, I want to be able to get the respective 3D world coordinate on the plane of the floor.

Known:

  • 4 points on the floor for which I know both the Pixel(x,y) coordinates and the associated World(X,Y,Z=0) coordinates.
  • The camera's position is fixed, and I know its displacement in the X, Y, and Z directions.

Unknown:

  • The camera's rotation about the x, y, and z axes. The camera is primarily rotated about the X axis; the Y and Z rotations are minimal, but I suppose they should be accounted for.
  • The distortion coefficients. However, the bending of lines in the image is minimal, and I would prefer not to introduce a checkerboard calibration procedure. Some resulting error is not a deal breaker.


What I've looked into

A phenomenal example is found here. In essence it's the exact same problem, but with some follow-up questions:


SolvePnP looks to be my friend, but I'm not too sure what to do about the camera matrix or the distCoefficients. Is there some way I can avoid the camera-matrix and distortion-coefficient calibration steps, which I think are done with the checkerboard process (maybe at the cost of some accuracy)? Or is there some simpler way to do this?

Your input is much appreciated!

Answer

Try this approach:


Compute the homography from the 4 point correspondences; it gives you all the information needed to transform between image-plane and ground-plane coordinates.
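For intuition, the homography that cv::getPerspectiveTransform derives from 4 correspondences can be set up by hand: fixing the bottom-right entry h22 = 1 leaves 8 unknowns, and each correspondence contributes two linear equations, giving an 8x8 system. A minimal sketch without OpenCV (the solver and helper names here are my own, not part of any library):

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

// Solve the 8x8 system A*h = b by Gaussian elimination with partial pivoting.
static std::array<double, 8> solve8(std::array<std::array<double, 8>, 8> A,
                                    std::array<double, 8> b)
{
    for (int col = 0; col < 8; ++col) {
        int piv = col;
        for (int r = col + 1; r < 8; ++r)
            if (std::fabs(A[r][col]) > std::fabs(A[piv][col])) piv = r;
        std::swap(A[col], A[piv]);
        std::swap(b[col], b[piv]);
        for (int r = col + 1; r < 8; ++r) {
            double f = A[r][col] / A[col][col];
            for (int c = col; c < 8; ++c) A[r][c] -= f * A[col][c];
            b[r] -= f * b[col];
        }
    }
    std::array<double, 8> h{};
    for (int r = 7; r >= 0; --r) {
        double s = b[r];
        for (int c = r + 1; c < 8; ++c) s -= A[r][c] * h[c];
        h[r] = s / A[r][r];
    }
    return h;
}

// Homography with h22 fixed to 1, mapping (x,y) -> (X,Y):
//   X = (h0*x + h1*y + h2) / (h6*x + h7*y + 1)
//   Y = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
// Each correspondence contributes two linear equations in h0..h7.
std::array<double, 9> homographyFrom4(const double src[4][2], const double dst[4][2])
{
    std::array<std::array<double, 8>, 8> A{};
    std::array<double, 8> b{};
    for (int i = 0; i < 4; ++i) {
        double x = src[i][0], y = src[i][1];
        double X = dst[i][0], Y = dst[i][1];
        A[2 * i]     = {x, y, 1, 0, 0, 0, -x * X, -y * X};
        b[2 * i]     = X;
        A[2 * i + 1] = {0, 0, 0, x, y, 1, -x * Y, -y * Y};
        b[2 * i + 1] = Y;
    }
    std::array<double, 8> h = solve8(A, b);
    return {h[0], h[1], h[2], h[3], h[4], h[5], h[6], h[7], 1.0};
}
```

Fed the 4 image/world correspondences used further below, this should agree with cv::getPerspectiveTransform up to floating-point error; for an exactly-determined system of 4 points in general position the solution is unique once h22 is fixed.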


The limitation of this approach is that it assumes a uniformly parameterized image plane (a pinhole camera), so lens distortion will introduce errors, as seen in my example. If you are able to remove the lens-distortion effects, I guess you'll do very well with this approach. In addition, you will get some error from supplying slightly wrong pixel coordinates as your correspondences; you can get more stable values if you provide more correspondences.
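To make the distortion caveat concrete: the usual one-term radial ("Brown") model scales each normalized image point by 1 + k1·r², so points far from the optical axis shift more than central ones, which is exactly what bends straight floor lines and breaks the pure-homography assumption. A hypothetical sketch (the coefficient value is assumed, not calibrated from this image):

```cpp
#include <cassert>
#include <cmath>

// One-term radial ("Brown") distortion model in normalized camera
// coordinates: x_d = x * (1 + k1 * r^2) with r^2 = x^2 + y^2.
// Points far from the optical axis are displaced more than central
// ones, which is what bends straight floor lines in the image.
struct Pt { double x, y; };

Pt distortRadial(Pt p, double k1)
{
    double r2 = p.x * p.x + p.y * p.y;
    double s = 1.0 + k1 * r2;   // k1 < 0: barrel, k1 > 0: pincushion
    return {p.x * s, p.y * s};
}
```

With a negative k1 (barrel distortion) points are pulled toward the image center and the shift grows with r², so a homography fitted at the board's corners becomes increasingly wrong away from them; that is the error source a checkerboard calibration would remove.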

Using this input image:


I've read the 4 corners of one chess field in an image-manipulation program, which corresponds to the fact that you know 4 points in your image. I've chosen those points (marked green):


Now I've done two things. First, I transform the chessboard pattern coordinates (0,0), (0,1), etc. to the image; this gives a good visual impression of the mapping quality. Second, I transform from image to world, reading the leftmost corner position at image location (87,291), which corresponds to (0,0) in chessboard coordinates. If I transform that pixel location, you would expect (0,0) as the result.

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

cv::Point2f transformPoint(cv::Point2f current, cv::Mat transformation)
{
    cv::Point2f transformedPoint;
    transformedPoint.x = current.x * transformation.at<double>(0,0) + current.y * transformation.at<double>(0,1) + transformation.at<double>(0,2);
    transformedPoint.y = current.x * transformation.at<double>(1,0) + current.y * transformation.at<double>(1,1) + transformation.at<double>(1,2);
    float z = current.x * transformation.at<double>(2,0) + current.y * transformation.at<double>(2,1) + transformation.at<double>(2,2);
    transformedPoint.x /= z;
    transformedPoint.y /= z;

    return transformedPoint;
}

int main()
{
    // image from http://d20uzhn5szfhj2.cloudfront.net/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/5/2/52440-chess-board.jpg

    cv::Mat chessboard = cv::imread("../inputData/52440-chess-board.jpg");

    // known input:
    // image locations / read pixel values
    //  478,358
    //  570, 325
    //  615,382
    //  522,417

    std::vector<cv::Point2f> imageLocs;
    imageLocs.push_back(cv::Point2f(478,358));
    imageLocs.push_back(cv::Point2f(570, 325));
    imageLocs.push_back(cv::Point2f(615,382));
    imageLocs.push_back(cv::Point2f(522,417));

    for(unsigned int i=0; i<imageLocs.size(); ++i)
    {
        cv::circle(chessboard, imageLocs[i], 5, cv::Scalar(0,0,255));
    }
    cv::imwrite("../outputData/chessboard_4points.png", chessboard);

    // known input: this is one field of the chessboard. you could enter any (corresponding) real world coordinates of the ground plane here.
    // world location:
    // 3,3
    // 3,4
    // 4,4
    // 4,3

    std::vector<cv::Point2f> worldLocs;
    worldLocs.push_back(cv::Point2f(3,3));
    worldLocs.push_back(cv::Point2f(3,4));
    worldLocs.push_back(cv::Point2f(4,4));
    worldLocs.push_back(cv::Point2f(4,3));


    // for exactly 4 correspondences. for more you can use cv::findHomography
    // this is the transformation from image coordinates to world coordinates:
    cv::Mat image2World = cv::getPerspectiveTransform(imageLocs, worldLocs);
    // the inverse is the transformation from world to image.
    cv::Mat world2Image = image2World.inv();


    // create all known locations of the chessboard (0,0) (0,1) etc we will transform them and test how good the transformation is.
    std::vector<cv::Point2f> worldLocations;
    for(unsigned int i=0; i<9; ++i)
        for(unsigned int j=0; j<9; ++j)
        {
            worldLocations.push_back(cv::Point2f(i,j));
        }


    std::vector<cv::Point2f> imageLocations;

    for(unsigned int i=0; i<worldLocations.size(); ++i)
    {
        // transform the point
        cv::Point2f tpoint = transformPoint(worldLocations[i], world2Image);
        // draw the transformed point
        cv::circle(chessboard, tpoint, 5, cv::Scalar(255,255,0));
    }

    // now test the other way: image => world
    cv::Point2f imageOrigin = cv::Point2f(87,291);
    // draw it to show which origin i mean
    cv::circle(chessboard, imageOrigin, 10, cv::Scalar(255,255,255));
    // transform point and print result. expected result is "(0,0)"
    std::cout << transformPoint(imageOrigin, image2World) << std::endl;

    cv::imshow("chessboard", chessboard);
    cv::imwrite("../outputData/chessboard.png", chessboard);
    cv::waitKey(-1);


}

The resulting image is:


As you can see, there is a fair amount of error in the data. As I said, that's because of the slightly wrong pixel coordinates given as correspondences (and within a small area!), and because lens distortion prevents the ground plane from appearing as a true plane in the image.


The result of transforming (87,291) to world coordinates is:

[0.174595, 0.144853]


The expected/perfect result would have been [0,0].
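Residual error aside, the two matrices in the code are exact inverses of each other, so a useful sanity check is the round trip: applying image2World and then world2Image must return the original pixel. The same check can be run without OpenCV (a minimal sketch; the 3x3 inverse uses the adjugate formula, and the type and function names are my own):

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;
using Pt2  = std::array<double, 2>;

// Apply a 3x3 homography to (x,y): the same projective division
// (divide by the third row's result) as in transformPoint above.
Pt2 applyH(const Mat3& H, double x, double y)
{
    double X = H[0][0] * x + H[0][1] * y + H[0][2];
    double Y = H[1][0] * x + H[1][1] * y + H[1][2];
    double Z = H[2][0] * x + H[2][1] * y + H[2][2];
    return {X / Z, Y / Z};
}

// 3x3 inverse via the adjugate (cofactor) matrix.
Mat3 invertH(const Mat3& m)
{
    double det = m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
               - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
               + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
    Mat3 inv;
    inv[0][0] =  (m[1][1] * m[2][2] - m[1][2] * m[2][1]) / det;
    inv[0][1] = -(m[0][1] * m[2][2] - m[0][2] * m[2][1]) / det;
    inv[0][2] =  (m[0][1] * m[1][2] - m[0][2] * m[1][1]) / det;
    inv[1][0] = -(m[1][0] * m[2][2] - m[1][2] * m[2][0]) / det;
    inv[1][1] =  (m[0][0] * m[2][2] - m[0][2] * m[2][0]) / det;
    inv[1][2] = -(m[0][0] * m[1][2] - m[0][2] * m[1][0]) / det;
    inv[2][0] =  (m[1][0] * m[2][1] - m[1][1] * m[2][0]) / det;
    inv[2][1] = -(m[0][0] * m[2][1] - m[0][1] * m[2][0]) / det;
    inv[2][2] =  (m[0][0] * m[1][1] - m[0][1] * m[1][0]) / det;
    return inv;
}
```

If the round trip drifts, the bug is in the matrix handling, not in the correspondences; homographies are only defined up to scale, so the projective division absorbs any scaling of the inverse.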

Hope this helps.
