Video Compression: What is the discrete cosine transform?
Question
I've implemented an image/video transformation technique called the discrete cosine transform (DCT). This technique is used in MPEG video encoding. I based my algorithm on the ideas presented at the following URL:
http://vsr.informatik.tu-chemnitz.de/~jan/MPEG/HTML/mpeg_tech.html
Now I can transform an 8x8 section of a black and white image, such as:
0140 0124 0124 0132 0130 0139 0102 0088
0140 0123 0126 0132 0134 0134 0088 0117
0143 0126 0126 0133 0134 0138 0081 0082
0148 0126 0128 0136 0137 0134 0079 0130
0147 0128 0126 0137 0138 0145 0132 0144
0147 0131 0123 0138 0137 0140 0145 0137
0142 0135 0122 0137 0140 0138 0143 0112
0140 0138 0125 0137 0140 0140 0148 0143
into a block with all the important information concentrated at the top left. The transformed block looks like this:
1041 0039 -023 0044 0027 0000 0021 -019
-050 0044 -029 0000 0009 -014 0032 -010
0000 0000 0000 0000 -018 0010 -017 0000
0014 -019 0010 0000 0000 0016 -012 0000
0010 -010 0000 0000 0000 0000 0000 0000
-016 0021 -014 0010 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 -010 0013 -014 0010 0000 0000
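For reference, here is a minimal sketch of a direct 8x8 DCT-II, assuming the orthonormal (JPEG/MPEG-style) normalization; the question does not show the actual implementation, so the names `BLOCK` and `dct_2d` are illustrative only. Under this normalization the top-left coefficient is the sum of all pixels divided by 8, which for the block above is 8329 / 8 ≈ 1041, consistent with the first entry of the transformed block.

```python
import math

# The 8x8 pixel block from the question.
BLOCK = [
    [140, 124, 124, 132, 130, 139, 102, 88],
    [140, 123, 126, 132, 134, 134, 88, 117],
    [143, 126, 126, 133, 134, 138, 81, 82],
    [148, 126, 128, 136, 137, 134, 79, 130],
    [147, 128, 126, 137, 138, 145, 132, 144],
    [147, 131, 123, 138, 137, 140, 145, 137],
    [142, 135, 122, 137, 140, 138, 143, 112],
    [140, 138, 125, 137, 140, 140, 148, 143],
]


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block (direct O(N^4) form).

    Assumption: the scale factors below are the orthonormal ones; other
    implementations may scale the coefficients differently.
    """
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):          # frequency index along the rows
        for v in range(n):      # frequency index along the columns
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    )
            out[u][v] = scale(u) * scale(v) * total
    return out


coeffs = dct_2d(BLOCK)
print(round(coeffs[0][0]))  # top-left (DC) coefficient
```

The direct form is slow but easy to check against the formula; real encoders use fast factorizations of the same transform.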
Now I need to know: how can I take advantage of this transformation? I'd like to detect other 8x8 blocks in the same image (or another image) that represent a good match.
Also, what does this transformation give me? Why is the information stored in the top left of the converted image important?
Recommended answer
The result of a DCT is a transformation of the original source into the frequency domain. The top-left entry stores the amplitude of the "base" (zero-frequency, or DC) component, and frequency increases along both the horizontal and vertical axes. For typical image content, the outcome of the DCT is a collection of larger amplitudes at the lower frequencies (the top-left quadrant) and fewer significant entries at the higher frequencies. As lassevk mentioned, it is usual to just zero out these higher frequencies, as they typically constitute very minor parts of the source. However, this does result in loss of information. To complete the compression, it is usual to apply a lossless compression to the DCT'd source. This is where the compression comes in, as all those runs of zeros get packed down to almost nothing.
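To illustrate how runs of zeros pack down, here is a rough sketch that scans the transformed block from the question in zigzag order (low to high frequency) and stores each nonzero value together with the number of zeros skipped before it. The helper names are hypothetical, and this shows only a run-length step; actual codecs typically follow it with an entropy coder and differ in the details.

```python
# The transformed 8x8 block from the question (small values already zeroed).
COEFFS = [
    [1041, 39, -23, 44, 27, 0, 21, -19],
    [-50, 44, -29, 0, 9, -14, 32, -10],
    [0, 0, 0, 0, -18, 10, -17, 0],
    [14, -19, 10, 0, 0, 16, -12, 0],
    [10, -10, 0, 0, 0, 0, 0, 0],
    [-16, 21, -14, 10, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, -10, 13, -14, 10, 0, 0],
]


def zigzag_order(n):
    """(row, col) pairs walking the anti-diagonals from low to high frequency."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Odd diagonals run top to bottom, even diagonals bottom to top.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def run_length_encode(block):
    """Encode as (zeros_skipped, value) pairs; trailing zeros are implied."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


def run_length_decode(pairs, n):
    """Rebuild the n x n block from (zeros_skipped, value) pairs."""
    order = zigzag_order(n)
    block = [[0] * n for _ in range(n)]
    pos = 0
    for run, value in pairs:
        pos += run
        r, c = order[pos]
        block[r][c] = value
        pos += 1
    return block


pairs = run_length_encode(COEFFS)
print(len(pairs), "pairs instead of 64 values")
```

The more coefficients are zeroed, especially toward the high-frequency end of the scan, the fewer pairs remain to be stored.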
One possible advantage of using the DCT to find similar regions is that you can do a first-pass match on the low-frequency values (top-left corner). This reduces the number of values you need to match against. If you find matches on the low-frequency values, you can then go on to compare the higher frequencies.
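A minimal sketch of that two-pass idea, assuming each block is already available as an 8x8 list of DCT coefficients and that similarity is measured by the sum of squared differences (SSD). The function names, the `max_ssd` budget, and the 2x2 coarse window are illustrative choices, not something the answer prescribes. Because squared differences only accumulate as more coefficients are included, a candidate that already exceeds the budget on the low-frequency corner can be rejected without looking at the rest.

```python
def partial_ssd(a, b, k):
    """Sum of squared differences over the top-left k x k coefficients."""
    return sum((a[u][v] - b[u][v]) ** 2 for u in range(k) for v in range(k))


def find_matches(target, candidates, max_ssd, coarse_k=2):
    """Return indices of candidate coefficient blocks within max_ssd of target."""
    n = len(target)
    matches = []
    for idx, cand in enumerate(candidates):
        # Coarse pass: compare only the low-frequency corner. The partial sum
        # is a lower bound on the full sum, so this rejection is safe.
        if partial_ssd(target, cand, coarse_k) > max_ssd:
            continue
        # Fine pass: compare every coefficient, including high frequencies.
        if partial_ssd(target, cand, n) <= max_ssd:
            matches.append(idx)
    return matches
```

With an orthonormal DCT, the SSD over all coefficients equals the SSD over the original pixels, so the fine pass gives the same answer as a pixel comparison; the coarse pass just avoids most of that work for poor candidates.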
Hope this helps.