带有BT.709矩阵的H.264编码视频是否包含任何伽马调整? [英] Does H.264 encoded video with BT.709 matrix include any gamma adjustment?

查看:248
本文介绍了带有BT.709矩阵的H.264编码视频是否包含任何伽马调整?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已阅读创建了一个测试视频,其中只有两种线性灰度颜色5和26,分别对应2%和10%的水平.当同时使用ffmpeg和iMovie转换为H.264时,输出BT.709值为(YCbCr)(20 128 128)和(38 128 128),并且这些值与BT.709转换矩阵的输出完全匹配,没有任何伽马调整.

可以在 Quicktime Gamma Bug .似乎Quicktime和Adobe编码器存在一些历史问题,它们不正确地进行了不同的伽玛调整,结果使得视频流在不同播放器上看起来很糟糕.这确实令人迷惑,因为如果您与 sRGB 进行比较,则可以清楚地表明如何应用伽玛编码然后将其解码以在sRGB和线性之间转换.如果在创建h.264数据流时在矩阵步骤之后没有应用伽玛调整,为什么BT.709会详细介绍相同种类的伽玛调整曲线? h.264流中的所有色阶是否都应编码为直线线性(gamma 1.0)值?

如果要通过特定的示例输入使情况更清楚,我将附加3个彩条图像,可以使用这些图像文件在图像编辑器中显示不同颜色的确切值.

第一张图片位于sRGB色彩空间中,并标记为sRGB.

第二张图像已转换为线性RGB色彩空间,并带有线性RGB配置文件标记.

此第三张图片已通过 elles_icc_profiles .这似乎是模拟BT.709中描述的相机"伽玛所需要做的.

请注意,右下角(0x555555)的sRGB值如何变为线性RGB(0x171717),而BT.709伽玛编码的值如何变为(0x464646).尚不清楚是我应该将线性RGB值传递给ffmpeg,还是应该传递已经BT.709伽玛编码的值,然后在线性转换矩阵步骤返回到RGB之前需要在客户端中对其进行解码

更新:

根据反馈,我更新了基于C的实现和Metal着色器,并作为iOS示例项目上传到github MetalBT709Decoder .

编码归一化线性RGB值是这样实现的:

static inline
int BT709_convertLinearRGBToYCbCr(
                            float Rn,
                            float Gn,
                            float Bn,
                            int *YPtr,
                            int *CbPtr,
                            int *CrPtr,
                            int applyGammaMap)
{
  // Gamma adjustment to non-linear value

  if (applyGammaMap) {
    Rn = BT709_linearNormToNonLinear(Rn);
    Gn = BT709_linearNormToNonLinear(Gn);
    Bn = BT709_linearNormToNonLinear(Bn);
  }

  // https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.709-6-201506-I!!PDF-E.pdf

  float Ey = (Kr * Rn) + (Kg * Gn) + (Kb * Bn);
  float Eb = (Bn - Ey) / Eb_minus_Ey_Range;
  float Er = (Rn - Ey) / Er_minus_Ey_Range;

  // Quant Y to range [16, 235] (inclusive 219 values)
  // Quant Eb, Er to range [16, 240] (inclusive 224 values, centered at 128)

  float AdjEy = (Ey * (YMax-YMin)) + 16;
  float AdjEb = (Eb * (UVMax-UVMin)) + 128;
  float AdjEr = (Er * (UVMax-UVMin)) + 128;

  *YPtr = (int) round(AdjEy);
  *CbPtr = (int) round(AdjEb);
  *CrPtr = (int) round(AdjEr);

  return 0;
}

从YCbCr到线性RGB的解码方式如下:

static inline
int BT709_convertYCbCrToLinearRGB(
                             int Y,
                             int Cb,
                             int Cr,
                             float *RPtr,
                             float *GPtr,
                             float *BPtr,
                             int applyGammaMap)
{
  // https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.709_conversion
  // http://www.niwa.nu/2013/05/understanding-yuv-values/

  // Normalize Y to range [0 255]
  //
  // Note that the matrix multiply will adjust
  // this byte normalized range to account for
  // the limited range [16 235]

  float Yn = (Y - 16) * (1.0f / 255.0f);

  // Normalize Cb and CR with zero at 128 and range [0 255]
  // Note that matrix will adjust to limited range [16 240]

  float Cbn = (Cb - 128) * (1.0f / 255.0f);
  float Crn = (Cr - 128) * (1.0f / 255.0f);

  const float YScale = 255.0f / (YMax-YMin);
  const float UVScale = 255.0f / (UVMax-UVMin);

  const
  float BT709Mat[] = {
    YScale,   0.000f,  (UVScale * Er_minus_Ey_Range),
    YScale, (-1.0f * UVScale * Eb_minus_Ey_Range * Kb_over_Kg),  (-1.0f * UVScale * Er_minus_Ey_Range * Kr_over_Kg),
    YScale, (UVScale * Eb_minus_Ey_Range),  0.000f,
  };

  // Matrix multiply operation
  //
  // rgb = BT709Mat * YCbCr

  // Convert input Y, Cb, Cr to normalized float values

  float Rn = (Yn * BT709Mat[0]) + (Cbn * BT709Mat[1]) + (Crn * BT709Mat[2]);
  float Gn = (Yn * BT709Mat[3]) + (Cbn * BT709Mat[4]) + (Crn * BT709Mat[5]);
  float Bn = (Yn * BT709Mat[6]) + (Cbn * BT709Mat[7]) + (Crn * BT709Mat[8]);

  // Saturate normalzied linear (R G B) to range [0.0, 1.0]

  Rn = saturatef(Rn);
  Gn = saturatef(Gn);
  Bn = saturatef(Bn);

  // Gamma adjustment for RGB components after matrix transform

  if (applyGammaMap) {
    Rn = BT709_nonLinearNormToLinear(Rn);
    Gn = BT709_nonLinearNormToLinear(Gn);
    Bn = BT709_nonLinearNormToLinear(Bn);
  }

  *RPtr = Rn;
  *GPtr = Gn;
  *BPtr = Bn;

  return 0;
}

我相信此逻辑已正确实现,但是我很难验证结果.当我生成一个包含gamma调整后的颜色值(osxcolor_test_image_24bit_BT709.m4v)的.m4v文件时,结果如预期的那样出来.但是我发现这里的一个测试用例(bars_709_Frame01.m4v) 似乎不起作用,因为彩条值似乎被编码为线性(无伽马调整).

对于SMPTE测试图案,0.75灰度是线性RGB(191 191 191),如果此RGB在没有伽玛调整的情况下被编码为(Y Cb Cr)(180 128 128),或者位流中的值显示为γ校正后的(Y Cb Cr)(206 128 128)?

(跟进) 在对该gamma问题进行了进一步研究之后,很明显,Apple在AVFoundation中实际上正在使用1.961 gamma函数.当使用AVAssetWriterInputPixelBufferAdaptor进行编码,使用vImage或CoreVideo API进行编码时,就是这种情况.此分段伽玛函数定义如下:

#define APPLE_GAMMA_196 (1.960938f)

static inline
float Apple196_nonLinearNormToLinear(float normV) {
  const float xIntercept = 0.05583828f;

  if (normV < xIntercept) {
    normV *= (1.0f / 16.0f);
  } else {
    const float gamma = APPLE_GAMMA_196;
    normV = pow(normV, gamma);
  }

  return normV;
}

static inline
float Apple196_linearNormToNonLinear(float normV) {
  const float yIntercept = 0.00349f;

  if (normV < yIntercept) {
    normV *= 16.0f;
  } else {
    const float gamma = 1.0f / APPLE_GAMMA_196;
    normV = pow(normV, gamma);
  }

  return normV;
}

解决方案

您的原始问题:带有BT.709矩阵的H.264编码视频是否包含任何伽马调整?

编码的视频仅包含伽玛调整-如果您输入编码器的伽玛调整值.

H.264编码器不在乎传输特性. 因此,如果先线性压缩然后再解压缩-您将得到线性. 因此,如果先用gamma压缩然后再解压缩-您将获得gamma.

或者如果您的位是用Rec.编码的. 709传递函数-编码器不会更改伽玛.

但是您可以在H.264流中将传输特性指定为元数据. (ITU-T H.264建议书(04/2017)E.1.1 VUI参数语法).因此,编码后的流会携带色彩空间信息,但不会在编码或解码中使用.

我会假设8位视频始终包含非线性传递函数.否则,您会非常不明智地使用8位.

如果您转换为线性以产生效果和构图-建议您增加位深度或线性化为浮点数.

颜色空间由原色,传递函数和矩阵系数组成. 伽玛调整值在传递函数中编码(而不是在矩阵中).

I have read the BT.709 spec a number of times and the thing that is just not clear is should an encoded H.264 bitstream actually apply any gamma curve to the encoded data? Note the specific mention of a gamma like formula in the BT.709 spec. Apple provided examples of OpenGL or Metal shaders that read YUV data from CoreVideo provided buffers do not do any sort of gamma adjustment. YUV values are being read and processed as though they are simple linear values. I also examined the source code of ffmpeg and found no gamma adjustments being applied after the BT.709 scaling step. I then created a test video with just two linear grayscale colors 5 and 26 corresponding to 2% and 10% levels. When converted to H.264 with both ffmpeg and iMovie, the output BT.709 values are (YCbCr) (20 128 128) and (38 128 128) and these values exactly match the output of the BT.709 conversion matrix without any gamma adjustment.

A great piece of background on this topic can be found at Quicktime Gamma Bug. It seems that some historical issues with Quicktime and Adobe encoders were improperly doing different gamma adjustments and the results made video streams look awful on different players. This is really confusing because if you compare to sRGB, it clearly indicates how to apply a gamma encoding and then decode it to convert between sRGB and linear. Why does BT.709 go into so much detail about the same sort of gamma adjustment curve if no gamma adjustment is applied after the matrix step when creating a h.264 data stream? Are all the color steps in a h.264 stream meant to be coded as straight linear (gamma 1.0) values?

In case specific example input would make things more clear, I am attaching 3 color bar images, the exact values of different colors can be displayed in an image editor with these image files.

This first image is in the sRGB colorspace and is tagged as sRGB.

This second image has been converted to the linear RGB colorspace and is tagged with a linear RGB profile.

This third image has been converted to REC.709 profile levels with Rec709-elle-V4-rec709.icc from elles_icc_profiles . This seems to be what one would need to do to simulate "camera" gamma as described in BT.709.

Note how the sRGB value in the lower right corner (0x555555) becomes linear RGB (0x171717) and the BT.709 gamma encoded value becomes (0x464646). What is unclear is if I should be passing a linear RGB value into ffmpeg or if I should be passing an already BT.709 gamma encoded value which would then need to be decoded in the client before the linear conversion Matrix step to get back to RGB.

Update:

Based on the feedback, I have updated my C based implementation and Metal shader and uploaded to github as an iOS example project MetalBT709Decoder.

Encoding a normalized linear RGB value is implemented like this:

static inline
int BT709_convertLinearRGBToYCbCr(
                            float Rn,
                            float Gn,
                            float Bn,
                            int *YPtr,
                            int *CbPtr,
                            int *CrPtr,
                            int applyGammaMap)
{
  // Gamma adjustment to non-linear value

  if (applyGammaMap) {
    Rn = BT709_linearNormToNonLinear(Rn);
    Gn = BT709_linearNormToNonLinear(Gn);
    Bn = BT709_linearNormToNonLinear(Bn);
  }

  // https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.709-6-201506-I!!PDF-E.pdf

  float Ey = (Kr * Rn) + (Kg * Gn) + (Kb * Bn);
  float Eb = (Bn - Ey) / Eb_minus_Ey_Range;
  float Er = (Rn - Ey) / Er_minus_Ey_Range;

  // Quant Y to range [16, 235] (inclusive 219 values)
  // Quant Eb, Er to range [16, 240] (inclusive 224 values, centered at 128)

  float AdjEy = (Ey * (YMax-YMin)) + 16;
  float AdjEb = (Eb * (UVMax-UVMin)) + 128;
  float AdjEr = (Er * (UVMax-UVMin)) + 128;

  *YPtr = (int) round(AdjEy);
  *CbPtr = (int) round(AdjEb);
  *CrPtr = (int) round(AdjEr);

  return 0;
}

Decoding from YCbCr to linear RGB is implemented like so:

static inline
int BT709_convertYCbCrToLinearRGB(
                             int Y,
                             int Cb,
                             int Cr,
                             float *RPtr,
                             float *GPtr,
                             float *BPtr,
                             int applyGammaMap)
{
  // https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.709_conversion
  // http://www.niwa.nu/2013/05/understanding-yuv-values/

  // Normalize Y to range [0 255]
  //
  // Note that the matrix multiply will adjust
  // this byte normalized range to account for
  // the limited range [16 235]

  float Yn = (Y - 16) * (1.0f / 255.0f);

  // Normalize Cb and CR with zero at 128 and range [0 255]
  // Note that matrix will adjust to limited range [16 240]

  float Cbn = (Cb - 128) * (1.0f / 255.0f);
  float Crn = (Cr - 128) * (1.0f / 255.0f);

  const float YScale = 255.0f / (YMax-YMin);
  const float UVScale = 255.0f / (UVMax-UVMin);

  const
  float BT709Mat[] = {
    YScale,   0.000f,  (UVScale * Er_minus_Ey_Range),
    YScale, (-1.0f * UVScale * Eb_minus_Ey_Range * Kb_over_Kg),  (-1.0f * UVScale * Er_minus_Ey_Range * Kr_over_Kg),
    YScale, (UVScale * Eb_minus_Ey_Range),  0.000f,
  };

  // Matrix multiply operation
  //
  // rgb = BT709Mat * YCbCr

  // Convert input Y, Cb, Cr to normalized float values

  float Rn = (Yn * BT709Mat[0]) + (Cbn * BT709Mat[1]) + (Crn * BT709Mat[2]);
  float Gn = (Yn * BT709Mat[3]) + (Cbn * BT709Mat[4]) + (Crn * BT709Mat[5]);
  float Bn = (Yn * BT709Mat[6]) + (Cbn * BT709Mat[7]) + (Crn * BT709Mat[8]);

  // Saturate normalzied linear (R G B) to range [0.0, 1.0]

  Rn = saturatef(Rn);
  Gn = saturatef(Gn);
  Bn = saturatef(Bn);

  // Gamma adjustment for RGB components after matrix transform

  if (applyGammaMap) {
    Rn = BT709_nonLinearNormToLinear(Rn);
    Gn = BT709_nonLinearNormToLinear(Gn);
    Bn = BT709_nonLinearNormToLinear(Bn);
  }

  *RPtr = Rn;
  *GPtr = Gn;
  *BPtr = Bn;

  return 0;
}

I believe this logic is implemented correctly, but I am having a very difficult time validating the results. When I generate a .m4v file that contains gamma adjusted color values (osxcolor_test_image_24bit_BT709.m4v), the result come out as expected. But a test case like (bars_709_Frame01.m4v) that I found here does not seem to work as the color bar values seem to be encoded as linear (no gamma adjustment).

For a SMPTE test pattern, the 0.75 graylevel is linear RGB (191 191 191), should this RGB be encoded with no gamma adjustment as (Y Cb Cr) (180 128 128) or should the value in the bitstream appear as the gamma adjusted (Y Cb Cr) (206 128 128)?

(follow up) After doing additional research into this gamma issue, it has become clear that what Apple is actually doing in AVFoundation is using a 1.961 gamma function. This is the case when encoding with AVAssetWriterInputPixelBufferAdaptor, when using vImage, or with CoreVideo APIs. This piecewise gamma function is defined as follows:

#define APPLE_GAMMA_196 (1.960938f)

static inline
float Apple196_nonLinearNormToLinear(float normV) {
  const float xIntercept = 0.05583828f;

  if (normV < xIntercept) {
    normV *= (1.0f / 16.0f);
  } else {
    const float gamma = APPLE_GAMMA_196;
    normV = pow(normV, gamma);
  }

  return normV;
}

static inline
float Apple196_linearNormToNonLinear(float normV) {
  const float yIntercept = 0.00349f;

  if (normV < yIntercept) {
    normV *= 16.0f;
  } else {
    const float gamma = 1.0f / APPLE_GAMMA_196;
    normV = pow(normV, gamma);
  }

  return normV;
}

解决方案

Your original question: Does H.264 encoded video with BT.709 matrix include any gamma adjustment?

The encoded video only contains gamma adjustment - if you feed the encoder gamma adjusted values.

A H.264 encoder doesn't care about the transfer characteristics. So if you compress linear and then decompress - you'll get linear. So if you compress with gamma and then decompress - you'll get gamma.

Or if your bits are encoded with a Rec. 709 transfer function - the encoder won't change the gamma.

But you can specify the transfer characteristic in the H.264 stream as metadata. (Rec. ITU-T H.264 (04/2017) E.1.1 VUI parameters syntax). So the encoded streams carries the color space information around but it is not used in encoding or decoding.

I would assume that 8 bit video always contains a non linear transfer function. Otherwise you would use the 8 bit fairly unwisely.

If you convert to linear to do effects and composition - I'd recommend increasing the bit depth or linearizing into floats.

A color space consists of primaries, transfer function and matrix coefficients. The gamma adjustment is encoded in the transfer function (and not in the matrix).

这篇关于带有BT.709矩阵的H.264编码视频是否包含任何伽马调整?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆