What is the meaning of the EchoNest API's getTimbre vector?

Question

The EchoNest Analyzer Documentation states the following regarding timbre:

timbre is the quality of a musical note or sound that distinguishes different types of musical instruments, or voices. It is a complex notion also referred to as sound color, texture, or tone quality, and is derived from the shape of a segment’s spectro-temporal surface, independently of pitch and loudness. The Echo Nest Analyzer’s timbre feature is a vector that includes 12 unbounded values roughly centered around 0. Those values are high level abstractions of the spectral surface, ordered by degree of importance. For completeness however, the first dimension represents the average loudness of the segment; second emphasizes brightness; third is more closely correlated to the flatness of a sound; fourth to sounds with a stronger attack; etc. See an image below representing the 12 basis functions (i.e. template segments). The actual timbre of the segment is best described as a linear combination of these 12 basis functions weighted by the coefficient values: timbre = c1 x b1 + c2 x b2 + ... + c12 x b12, where c1 to c12 represent the 12 coefficients and b1 to b12 the 12 basis functions as displayed below. Timbre vectors are best used in comparison with each other.

My understanding is that the b vector ({b1...b12}) is what is being returned by your API's getTimbre method. But then where are the {c1...c12} coefficients coming from? I don't understand how to acquire a scalar timbre from a vector timbre (primarily because your analysis API is closed source). Can you help me out with this?

Recommended answer

Note that answers on this website come from volunteers. To get official support for the library, you need to contact the publisher directly.

b1 … b12 are not the result of the audio analysis; they only describe what the analysis does. They are fixed constants: the 12 basis functions shown in the documentation's diagram.

The vector of scalars c1 … c12 is what the analyzer produces. Of course, the sound cannot be perfectly described by only 12 numbers. Multiplying the scalars by the functions won't reproduce the original music because there's not enough data there; it's only an approximation. Possibly, though, you'll get a similar "mood" from each segment, so it could be interesting to try and listen.
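To make the roles of the two sets of symbols concrete, here is a minimal Python sketch of the formula timbre = c1 x b1 + ... + c12 x b12. It rests on several assumptions. The real basis functions are published only as an image, so `make_placeholder_basis` fills them with random numbers, and `BASIS_SIZE` is an arbitrary length. The names `combine` and `timbre_distance` are hypothetical helpers, not part of any Echo Nest library. Euclidean distance is just one plausible way to follow the documentation's advice to compare timbre vectors with each other; the documentation does not prescribe a metric.

```python
import math
import random

N_COEFFS = 12    # length of the timbre vector returned per segment
BASIS_SIZE = 8   # assumption: samples per basis function (real shape unknown)


def make_placeholder_basis(seed=0):
    """Stand-in for b1..b12.

    The real basis functions are fixed constants shown only as an image in
    the documentation, so random values are used purely for illustration.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(BASIS_SIZE)]
            for _ in range(N_COEFFS)]


def combine(coeffs, basis):
    """Evaluate c1*b1 + c2*b2 + ... + c12*b12 element by element."""
    if len(coeffs) != len(basis):
        raise ValueError("need one coefficient per basis function")
    size = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(size)]


def timbre_distance(coeffs_a, coeffs_b):
    """Euclidean distance between two coefficient vectors (one possible metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(coeffs_a, coeffs_b)))


basis = make_placeholder_basis()

# Made-up coefficient vectors standing in for what the analyzer returns
# for three segments.
seg_a = [40.0, 10.0, -5.0] + [0.0] * 9
seg_b = [40.0, 10.0, -5.0] + [0.0] * 9
seg_c = [43.0, 14.0, -5.0] + [0.0] * 9

surface = combine(seg_a, basis)          # approximate "surface" for segment a
print(len(surface))                      # one value per basis sample
print(timbre_distance(seg_a, seg_b))     # identical vectors
print(timbre_distance(seg_a, seg_c))     # differs in the first two dimensions
```

The only per-segment data here are the coefficient vectors (`seg_a`, `seg_b`, `seg_c`); `basis` is shared by every segment. That is why comparing coefficient vectors directly is usually enough and the basis functions are rarely needed in practice.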
