Camera pixels to planar world points given 4 known points


Problem description

My problem, I assume, is easy, but I still haven't been able to solve it, since my linear algebra experience is from a while ago. I have read presentations published by several universities, but I just can't seem to follow the somewhat non-standardized notation. If anyone has a better example it would be much appreciated...

Problem: The camera is angled down, facing the floor. Given a pixel coordinate, I want to be able to get the corresponding 3D world coordinate on the plane of the floor.

Known:

  • 4 points on the floor for which I know both the pixel (x, y) coordinates and the corresponding world (x, y, z = 0) coordinates.
  • The camera's position is fixed, and I know its x, y displacement and its height z.

Unknown:

  • The camera's rotation about the x, y, and z axes. The camera is rotated mostly about the x and y axes; the z rotation should be minimal, but I think it should still be accounted for.
  • The distortion coefficients. However, there is minimal bending of lines in the image, and I would much prefer not to go through a checkerboard calibration process. Some error as a result of this is not a dealbreaker.

What I've looked into: A phenomenal example is found here: http://stackoverflow.com/questions/12299870/computing-x-y-coordinate-3d-from-image-point. In essence it's the exact same problem, but I have some follow-up questions:

SolvePnP looks to be my friend, but I'm not too sure what to do about the camera matrix or the distCoefficients. Is there some way I can avoid the camera matrix and distortion-coefficient calibration steps, which I believe are done with the checkerboard process (maybe at the cost of some accuracy)? Or is there some simpler way to do this?

I'd much appreciate your input!

Answer

Try this approach:

Compute the homography from the 4 point correspondences; it gives you all the information needed to transform between image-plane and ground-plane coordinates.

The limitation of this approach is that it assumes a uniformly parameterized image plane (a pinhole camera), so lens distortion will introduce errors, as seen in my example. If you are able to remove the lens-distortion effects, I'd guess you'll do very well with this approach. In addition, you will get some error from slightly wrong pixel coordinates given as correspondences; you can get more stable values if you provide more correspondences.

Using this input image:

I've read the 4 corners of one chess field from an image-manipulation program, which corresponds to the fact that you know 4 points in your image. I've chosen these points (marked green):

Now I've done two things. First, I transform the chessboard pattern coordinates (0,0), (0,1), etc. to the image; this gives a good visual impression of the mapping quality. Second, I transform from image to world: reading the leftmost corner position at image location (87,291), which corresponds to (0,0) in chessboard coordinates, you would expect (0,0) as the result of transforming that pixel location.

cv::Point2f transformPoint(cv::Point2f current, cv::Mat transformation)
{
    cv::Point2f transformedPoint;
    transformedPoint.x = current.x * transformation.at<double>(0,0) + current.y * transformation.at<double>(0,1) + transformation.at<double>(0,2);
    transformedPoint.y = current.x * transformation.at<double>(1,0) + current.y * transformation.at<double>(1,1) + transformation.at<double>(1,2);
    double z = current.x * transformation.at<double>(2,0) + current.y * transformation.at<double>(2,1) + transformation.at<double>(2,2);
    transformedPoint.x /= z;
    transformedPoint.y /= z;

    return transformedPoint;
}

int main()
{
    // image from http://d20uzhn5szfhj2.cloudfront.net/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/5/2/52440-chess-board.jpg

    cv::Mat chessboard = cv::imread("../inputData/52440-chess-board.jpg");

    // known input:
    // image locations / read pixel values
    //  478,358
    //  570, 325
    //  615,382
    //  522,417

    std::vector<cv::Point2f> imageLocs;
    imageLocs.push_back(cv::Point2f(478,358));
    imageLocs.push_back(cv::Point2f(570, 325));
    imageLocs.push_back(cv::Point2f(615,382));
    imageLocs.push_back(cv::Point2f(522,417));

    for(unsigned int i=0; i<imageLocs.size(); ++i)
    {
        cv::circle(chessboard, imageLocs[i], 5, cv::Scalar(0,0,255));
    }
    cv::imwrite("../outputData/chessboard_4points.png", chessboard);

    // known input: this is one field of the chessboard. you could enter any (corresponding) real world coordinates of the ground plane here.
    // world location:
    // 3,3
    // 3,4
    // 4,4
    // 4,3

    std::vector<cv::Point2f> worldLocs;
    worldLocs.push_back(cv::Point2f(3,3));
    worldLocs.push_back(cv::Point2f(3,4));
    worldLocs.push_back(cv::Point2f(4,4));
    worldLocs.push_back(cv::Point2f(4,3));


    // for exactly 4 correspondences. for more you can use cv::findHomography
    // this is the transformation from image coordinates to world coordinates:
    cv::Mat image2World = cv::getPerspectiveTransform(imageLocs, worldLocs);
    // the inverse is the transformation from world to image.
    cv::Mat world2Image = image2World.inv();


    // create all known locations of the chessboard (0,0) (0,1) etc we will transform them and test how good the transformation is.
    std::vector<cv::Point2f> worldLocations;
    for(unsigned int i=0; i<9; ++i)
        for(unsigned int j=0; j<9; ++j)
        {
            worldLocations.push_back(cv::Point2f(i,j));
        }


    for(unsigned int i=0; i<worldLocations.size(); ++i)
    {
        // transform each world point into the image
        cv::Point2f tpoint = transformPoint(worldLocations[i], world2Image);
        // draw the transformed point
        cv::circle(chessboard, tpoint, 5, cv::Scalar(255,255,0));
    }

    // now test the other way: image => world
    cv::Point2f imageOrigin = cv::Point2f(87,291);
    // draw it to show which origin i mean
    cv::circle(chessboard, imageOrigin, 10, cv::Scalar(255,255,255));
    // transform point and print result. expected result is "(0,0)"
    std::cout << transformPoint(imageOrigin, image2World) << std::endl;

    cv::imshow("chessboard", chessboard);
    cv::imwrite("../outputData/chessboard.png", chessboard);
    cv::waitKey(-1);


    return 0;
}

The resulting image is:

As you can see, there is a large amount of error in the data. As I said, that's because of the slightly wrong pixel coordinates given as correspondences (and within a small area!), and because lens distortion prevents the ground plane from appearing as a true plane in the image.

The result of transforming (87,291) to world coordinates is:

[0.174595, 0.144853]

The expected/perfect result would have been [0,0].

Hope this helps.
