Efficient ways to determine tilt of an image


Question

I'm trying to write a program that determines the tilt, or angle of rotation, of an arbitrary image.

The images have the following properties:

  • Consist of dark text on a light background.
  • Occasionally contain horizontal or vertical lines, which only intersect at 90-degree angles.
  • Skewed between -45 and 45 degrees.
  • See this image as a reference (it's been skewed 2.8 degrees).

So far, I've come up with this strategy: draw a route from left to right, always selecting the nearest white pixel. Presumably, the route will prefer to follow the gaps between lines of text, and so run along the tilt of the image.

Here is my code:

private bool IsWhite(Color c) { return c.GetBrightness() >= 0.5 || c == Color.Transparent; }

private bool IsBlack(Color c) { return !IsWhite(c); }

private double ToDegrees(decimal slope) { return (180.0 / Math.PI) * Math.Atan(Convert.ToDouble(slope)); }

private void GetSkew(Bitmap image, out double minSkew, out double maxSkew)
{
    decimal minSlope = 0.0M;
    decimal maxSlope = 0.0M;
    for (int start_y = 0; start_y < image.Height; start_y++)
    {
        int end_y = start_y;
        for (int x = 1; x < image.Width; x++)
        {
            int above_y = Math.Max(end_y - 1, 0);
            int below_y = Math.Min(end_y + 1, image.Height - 1);

            Color center = image.GetPixel(x, end_y);
            Color above = image.GetPixel(x, above_y);
            Color below = image.GetPixel(x, below_y);

            if (IsWhite(center)) { /* no change to end_y */ }
            else if (IsWhite(above) && IsBlack(below)) { end_y = above_y; }
            else if (IsBlack(above) && IsWhite(below)) { end_y = below_y; }
        }

        decimal slope = (Convert.ToDecimal(start_y) - Convert.ToDecimal(end_y)) / Convert.ToDecimal(image.Width);
        minSlope = Math.Min(minSlope, slope);
        maxSlope = Math.Max(maxSlope, slope);
    }

    minSkew = ToDegrees(minSlope);
    maxSkew = ToDegrees(maxSlope);
}

This works well on some images, not so well on others, and it's slow.

Is there a more efficient, more reliable way to determine the tilt of an image?

Recommended answer

I've made some modifications to my code, and it certainly runs a lot faster, but it's not very accurate.

I've made the following improvements:

  • Using Vinko's suggestion, I avoid GetPixel in favor of working with bytes directly. The code now runs at the speed I need.

My original code simply used "IsBlack" and "IsWhite", but this isn't granular enough. The original code traces the following paths through the image:

http://img43.imageshack.us/img43/1545/tilted3degtextoriginalw.gif

Note that a number of paths pass through the text. Instead, I now compare the actual brightness values of the center, above, and below pixels and select the brightest one. Basically, I'm treating the bitmap as a heightmap, and the path from left to right follows the contours of the image, resulting in a better path:

http://img10.imageshack.us/img10/5807/tilted3degtextbrightnes.gif

As suggested by Toaomalkster, a Gaussian blur smooths out the height map, and I get even better results:

http://img197.imageshack.us/img197/742/tilted3degtextblurredwi.gif

Since this is just prototype code, I blurred the image using GIMP rather than writing my own blur function.
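For readers who would rather blur in code than in GIMP, the following is a minimal sketch of a separable Gaussian blur over a brightness grid. It is not part of the original answer: `BlurSketch`, `GaussianKernel`, `Blur`, the `double[,]` grid layout (`[row, column]`), and the choice of edge clamping are all assumptions made for illustration.

```csharp
using System;

static class BlurSketch
{
    // Builds a normalized 1-D Gaussian kernel of length 2 * radius + 1.
    public static double[] GaussianKernel(int radius, double sigma)
    {
        double[] kernel = new double[2 * radius + 1];
        double sum = 0.0;
        for (int i = -radius; i <= radius; i++)
        {
            double w = Math.Exp(-(i * i) / (2.0 * sigma * sigma));
            kernel[i + radius] = w;
            sum += w;
        }
        for (int i = 0; i < kernel.Length; i++) { kernel[i] /= sum; }
        return kernel;
    }

    // Blurs a [height, width] brightness grid with one horizontal pass
    // followed by one vertical pass. Coordinates that fall outside the
    // grid are clamped to the nearest edge pixel (an assumed edge policy).
    public static double[,] Blur(double[,] src, int radius, double sigma)
    {
        double[] kernel = GaussianKernel(radius, sigma);
        int height = src.GetLength(0);
        int width = src.GetLength(1);
        double[,] tmp = new double[height, width];
        double[,] dst = new double[height, width];

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                double acc = 0.0;
                for (int k = -radius; k <= radius; k++)
                {
                    int xx = Math.Min(Math.Max(x + k, 0), width - 1);
                    acc += kernel[k + radius] * src[y, xx];
                }
                tmp[y, x] = acc;
            }
        }

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                double acc = 0.0;
                for (int k = -radius; k <= radius; k++)
                {
                    int yy = Math.Min(Math.Max(y + k, 0), height - 1);
                    acc += kernel[k + radius] * tmp[yy, x];
                }
                dst[y, x] = acc;
            }
        }
        return dst;
    }
}
```

A radius of roughly three times sigma captures nearly all of the kernel's weight. How much blur suits a given scan is something to find by experiment.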

The selected path is pretty good for a greedy algorithm.

As Toaomalkster suggested, choosing the min/max slope is naive. A simple linear regression provides a better approximation of the slope of a path. Additionally, I should cut a path short once it runs off the edge of the image; otherwise the path will hug the top of the image and give an incorrect slope.
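To make the regression step concrete in isolation, here is a small sketch of the least-squares slope for a path whose x coordinates are simply 0, 1, 2, and so on. The names `SlopeSketch` and `LeastSquaresSlope` are hypothetical, and the sketch uses `double` arithmetic throughout, which is an assumption and differs from the `decimal` sums in the answer's code.

```csharp
using System;

static class SlopeSketch
{
    // Least-squares slope of the points (x, ys[x]) for x = 0 .. ys.Length - 1:
    //   slope = (N * Σxy - Σx * Σy) / (N * Σx² - (Σx)²)
    public static double LeastSquaresSlope(int[] ys)
    {
        if (ys.Length < 2)
        {
            throw new ArgumentException("At least two points are required.");
        }

        double n = ys.Length;
        double sumX = 0.0, sumY = 0.0, sumXY = 0.0, sumXX = 0.0;
        for (int x = 0; x < ys.Length; x++)
        {
            sumX += x;
            sumY += ys[x];
            sumXY += (double)x * ys[x];
            sumXX += (double)x * x;
        }
        return (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
    }

    // Converts a slope (rise over run) to an angle in degrees.
    public static double ToDegrees(double slope)
    {
        return (180.0 / Math.PI) * Math.Atan(slope);
    }
}
```

Unlike a slope taken from a path's two endpoints, the regression uses every point on the path, so one stray step near either end has far less influence on the result.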

Code:

private double ToDegrees(double slope) { return (180.0 / Math.PI) * Math.Atan(slope); }

private double GetSkew(Bitmap image)
{
    BrightnessWrapper wrapper = new BrightnessWrapper(image);

    LinkedList<double> slopes = new LinkedList<double>();

    for (int y = 0; y < wrapper.Height; y++)
    {
        int endY = y;

        long sumOfX = 0;
        long sumOfY = y;
        long sumOfXY = 0;
        long sumOfXX = 0;
        int itemsInSet = 1;
        for (int x = 1; x < wrapper.Width; x++)
        {
            int aboveY = endY - 1;
            int belowY = endY + 1;

            if (aboveY < 0 || belowY >= wrapper.Height)
            {
                break;
            }

            int center = wrapper.GetBrightness(x, endY);
            int above = wrapper.GetBrightness(x, aboveY);
            int below = wrapper.GetBrightness(x, belowY);

            if (center >= above && center >= below) { /* no change to endY */ }
            else if (above >= center && above >= below) { endY = aboveY; }
            else if (below >= center && below >= above) { endY = belowY; }

            itemsInSet++;
            sumOfX += x;
            sumOfY += endY;
            sumOfXX += (x * x);
            sumOfXY += (x * endY);
        }

        // least squares slope = (NΣ(XY) - (ΣX)(ΣY)) / (NΣ(X^2) - (ΣX)^2), where N = elements in set
        if (itemsInSet > wrapper.Width / 2) // path covers at least half of the image
        {
            decimal sumOfX_d = Convert.ToDecimal(sumOfX);
            decimal sumOfY_d = Convert.ToDecimal(sumOfY);
            decimal sumOfXY_d = Convert.ToDecimal(sumOfXY);
            decimal sumOfXX_d = Convert.ToDecimal(sumOfXX);
            decimal itemsInSet_d = Convert.ToDecimal(itemsInSet);
            decimal slope =
                ((itemsInSet_d * sumOfXY_d) - (sumOfX_d * sumOfY_d))
                /
                ((itemsInSet_d * sumOfXX_d) - (sumOfX_d * sumOfX_d));

            slopes.AddLast(Convert.ToDouble(slope));
        }
    }

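    if (slopes.Count == 0)
    {
        throw new InvalidOperationException("No path covered at least half of the image width.");
    }

    double mean = slopes.Average();
    if (slopes.Count == 1)
    {
        return ToDegrees(mean); // a standard deviation needs at least two samples
    }

    double sumOfSquares = slopes.Sum(d => Math.Pow(d - mean, 2));
    double stddev = Math.Sqrt(sumOfSquares / (slopes.Count - 1));

    // select items within 1 standard deviation of the mean
    var testSample = slopes.Where(x => Math.Abs(x - mean) <= stddev);

    return ToDegrees(testSample.Average());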
}

class BrightnessWrapper
{
    byte[] rgbValues;
    int stride;
    public int Height { get; private set; }
    public int Width { get; private set; }

    public BrightnessWrapper(Bitmap bmp)
    {
        Rectangle rect = new Rectangle(0, 0, bmp.Width, bmp.Height);

        // Lock as 24bpp RGB so that GetBrightness can assume 3 bytes per pixel
        // (stored as B, G, R) regardless of the bitmap's own pixel format.
        System.Drawing.Imaging.BitmapData bmpData =
            bmp.LockBits(rect,
                System.Drawing.Imaging.ImageLockMode.ReadOnly,
                System.Drawing.Imaging.PixelFormat.Format24bppRgb);
        try
        {
            int bytes = bmpData.Stride * bmp.Height;
            this.rgbValues = new byte[bytes];

            System.Runtime.InteropServices.Marshal.Copy(bmpData.Scan0,
                           rgbValues, 0, bytes);

            this.stride = bmpData.Stride;
        }
        finally
        {
            bmp.UnlockBits(bmpData); // always release the lock taken above
        }

        this.Height = bmp.Height;
        this.Width = bmp.Width;
    }

    public int GetBrightness(int x, int y)
    {
        int position = (y * this.stride) + (x * 3);
        int b = rgbValues[position];
        int g = rgbValues[position + 1];
        int r = rgbValues[position + 2];
        return (r + r + b + g + g + g) / 6;
    }
}

The code is good, but not great. Large amounts of whitespace cause the program to draw relatively flat lines, resulting in slopes near 0 and causing the code to underestimate the actual tilt of the image.

There is no appreciable difference in the accuracy of the tilt between selecting random sample points and sampling all points, because the ratio of "flat" paths selected by random sampling is the same as the ratio of "flat" paths in the entire image.
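The answer does not show how the random sample points were chosen. One way to do it, sketched below under that caveat, is to pick a set of distinct starting rows and trace paths only from those. `SamplingSketch` and `PickStartRows` are hypothetical names, and the partial Fisher-Yates shuffle is an assumed technique, not the author's.

```csharp
using System;

static class SamplingSketch
{
    // Returns `count` distinct starting rows in [0, height), chosen uniformly
    // at random with a partial Fisher-Yates shuffle.
    public static int[] PickStartRows(int height, int count, Random rng)
    {
        count = Math.Min(count, height);

        int[] rows = new int[height];
        for (int i = 0; i < height; i++) { rows[i] = i; }

        for (int i = 0; i < count; i++)
        {
            int j = rng.Next(i, height); // j is in [i, height)
            int swap = rows[i];
            rows[i] = rows[j];
            rows[j] = swap;
        }

        int[] picked = new int[count];
        Array.Copy(rows, picked, count);
        return picked;
    }
}
```

Tracing fewer paths reduces the work per image, but, as noted above, it should not be expected to change the accuracy, because the sampled rows contain "flat" paths in the same proportion as the whole image.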
