Convert music to musical notes using DirectX, Visual Studio C++ MFC and MIDI


Problem description


How can I convert recorded sound into musical notes using MIDI?

Solution

Sorry, but your questions and comments (please see above) suggest that this problem is well over your head. You need to pick a much simpler assignment.

Let's start with the input, which is something other than MIDI. The input is a digitized (sampled) record of how the current in the speaker coil depends on time. In this input there are no notes per se; there are not even frequencies. In real-life musical samples, the situation is very far from a simple set of mixed frequencies. It's a mess of different sounds and noises, and virtually none of them sounds long enough for you to simply analyze a set of frequencies. Even if the sound were not digitized and you already had an instrument for perfect spectrum analysis, you would not get a spectrum with a finite set of frequencies and phases. Instead, you would get a continuous spectrum, without discrete frequencies. This is a well-known result of the theory of the Fourier transform. Even a perfect sine tone has a continuous spectrum with an infinite set of frequencies once you limit it in time. Digitization makes the problem more difficult.

You need to reduce all this mess to a set of frequencies corresponding to the pure musical tones of equal temperament (or any other tuning system). You need to get a picture abstracted from a lot of detail, suppress or ignore the noises, and so on. Theoretically, this is not always possible (not every noise can be interpreted as musical). And even when it is possible from the point of view of human perception, it is extremely difficult. This is a very difficult combination of Fourier analysis and image recognition: http://en.wikipedia.org/wiki/Fourier_analysis, http://en.wikipedia.org/wiki/Image_recognition#Recognition
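To make the point about time-limited tones concrete, here is a minimal sketch in Python (standard library only). It is not part of the original answer: the function names, the 8000 Hz sample rate, the 512-sample window and the A4 = 440 Hz reference are all assumptions chosen for illustration. It takes a naive DFT of a short, clean 440 Hz sine, shows that the energy spreads over many bins rather than one, and then maps the strongest bin to the nearest equal-temperament note. It only works because the input is a single clean tone, which is exactly what real recordings are not.

```python
import cmath
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def dft_magnitudes(samples):
    """Naive O(n^2) DFT; returns magnitudes for bins 0 .. n/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags


def dominant_frequency(samples, sample_rate):
    """Frequency (Hz) of the strongest non-DC bin. Resolution is sample_rate / n."""
    mags = dft_magnitudes(samples)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return peak_bin * sample_rate / len(samples)


def nearest_note(freq_hz, a4_hz=440.0):
    """Nearest 12-tone equal-temperament note, using MIDI numbering (A4 = 69)."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi, f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


if __name__ == "__main__":
    sample_rate = 8000   # assumed for the demo
    n = 512              # a short window: the tone is "limited in time"
    tone = [math.sin(2 * math.pi * 440.0 * t / sample_rate) for t in range(n)]

    mags = dft_magnitudes(tone)
    peak = max(mags)
    # Spectral leakage: many bins carry energy, not just one.
    leaky_bins = sum(1 for m in mags if m > 0.01 * peak)
    print("bins above 1% of the peak:", leaky_bins)

    freq = dominant_frequency(tone, sample_rate)
    print("strongest bin:", freq, "Hz ->", nearest_note(freq))
```

The strongest bin sits at 437.5 Hz rather than 440 Hz because the bin spacing here is 8000 / 512 = 15.625 Hz; rounding to the nearest semitone still gives A4. With several instruments, noise and short notes, picking "the" peak stops being meaningful, which is the difficulty described above.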

Are you familiar with any of these fields? Each is a whole course of education; each takes much more than reading a few articles or even a book. And even if you are educated in these fields, that is not enough to approach this problem. A while ago I tried several pieces of open-source software that attempt to solve it and found their quality very poor: they could analyze only very simple recordings, and with a lot of errors. I can imagine that a very high-quality product might exist and work well on many non-trivial samples, but that would really be top-notch technology. You don't realize this, not even close, as you keep talking about "requirement" and "A, B, C#…".

At the opposite end is MIDI. A MIDI sequence or file contains no sound. It is practically already composed of notes. More exactly, it is a description of a sequence of events. Imagine a description of a piano performance. Each MIDI event essentially describes which piano key is pressed or released, at what time, and how loudly it is played. The whole piece combines several instruments playing at the same time and can include more complex detail such as pitch bending (as with a guitar or the wheel of an electronic piano), percussion and more. All you need is knowledge of the MIDI format and the ability to parse the file; it also requires basic knowledge of music theory, but only the trivial part of it. This problem is nothing compared to the recognition problem described above.
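As a rough illustration of how little is involved on the MIDI side, the following Python sketch walks the body of a single track chunk from a Standard MIDI File and collects the note-on and note-off events. It is not from the original answer; the function names are hypothetical, and it assumes the caller has already located the track data (the bytes after an `MTrk` header). It relies only on well-known parts of the format: variable-length delta times, running status, and the convention that a note-on with velocity 0 means note-off.

```python
def read_vlq(data, pos):
    """Read a MIDI variable-length quantity; return (value, next_position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:          # high bit clear marks the last byte
            return value, pos


def note_events(track):
    """Return [(tick, 'on' | 'off', channel, key, velocity), ...] for one track body."""
    pos, tick, status = 0, 0, None
    events = []
    while pos < len(track):
        delta, pos = read_vlq(track, pos)
        tick += delta
        if track[pos] & 0x80:    # a new status byte
            status = track[pos]
            pos += 1
        elif status is None:     # data byte with no status to reuse
            raise ValueError("track starts with a data byte")
        # otherwise: running status, the previous status byte is reused

        if status == 0xFF:                       # meta event: type, length, data
            meta_type = track[pos]
            length, pos = read_vlq(track, pos + 1)
            pos += length
            if meta_type == 0x2F:                # end of track
                break
        elif status in (0xF0, 0xF7):             # system exclusive: length, data
            length, pos = read_vlq(track, pos)
            pos += length
        else:                                    # channel message
            kind, channel = status & 0xF0, status & 0x0F
            n_data = 1 if kind in (0xC0, 0xD0) else 2
            data = track[pos:pos + n_data]
            pos += n_data
            if kind == 0x90 and data[1] > 0:
                events.append((tick, "on", channel, data[0], data[1]))
            elif kind in (0x80, 0x90):           # note-off, or note-on with velocity 0
                events.append((tick, "off", channel, data[0], data[1]))
    return events


if __name__ == "__main__":
    # Hand-built track: middle C (key 60) on at tick 0, off 96 ticks later, end of track.
    demo = bytes([0x00, 0x90, 0x3C, 0x40,
                  0x60, 0x80, 0x3C, 0x40,
                  0x00, 0xFF, 0x2F, 0x00])
    for event in note_events(demo):
        print(event)
```

Turning ticks into seconds additionally needs the division field from the file header and any tempo meta events, which this sketch skips.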

I have a feeling that I am wasting my time. I only hope some reasonable readers find this elementary introduction to the problem interesting.

—SA


In musical terms... you might split the music stream into separate frequencies that relate directly to the key or scale, hoping the data you require takes up a fair proportion of that band.
Or, if you are really lucky, the data you require might be a fair proportion of one stereo channel. But each case is still a needle in a haystack. Programs exist for this but are not very effective. Usually you have to take out unwanted noise, very low and very high frequencies, clicks and percussion, and sometimes remove unwanted stereo or quad channels first. Then remove low-volume data so that, hopefully, you are left with the essence of the melody.
Ian
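The clean-up steps suggested in the answer above can be sketched in Python as a filter over spectral peaks. This is only an illustration under stated assumptions: the peaks are taken to be `(frequency_hz, amplitude)` pairs already produced by some spectrum analysis, and the function name, the 80 to 2000 Hz band, the 10% level threshold and the A4 = 440 Hz reference are hypothetical values, not anything from the thread.

```python
import math

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}   # pitch classes C D E F G A B


def filter_peaks(peaks, low_hz=80.0, high_hz=2000.0,
                 min_relative_level=0.1, allowed_pitch_classes=None):
    """Keep peaks inside a frequency band, above a relative level,
    and optionally only those whose nearest equal-tempered note is in a scale."""
    # 1. Drop very low and very high frequencies.
    in_band = [(f, a) for f, a in peaks if low_hz <= f <= high_hz]
    if not in_band:
        return []

    # 2. Drop low-volume data, relative to the loudest remaining peak.
    loudest = max(a for _, a in in_band)
    loud = [(f, a) for f, a in in_band if a >= min_relative_level * loudest]

    # 3. Optionally keep only frequencies that relate to the key or scale.
    if allowed_pitch_classes is None:
        return loud
    kept = []
    for f, a in loud:
        midi = round(69 + 12 * math.log2(f / 440.0))   # A4 = 69 = 440 Hz
        if midi % 12 in allowed_pitch_classes:
            kept.append((f, a))
    return kept


if __name__ == "__main__":
    peaks = [(30.0, 0.9), (440.0, 1.0), (466.16, 0.8),
             (660.0, 0.05), (880.0, 0.4), (9000.0, 0.7)]
    print(filter_peaks(peaks))
    print(filter_peaks(peaks, allowed_pitch_classes=C_MAJOR))
```

A filter like this only narrows down candidates; it does not decide which of the surviving peaks belong to the melody, which is where the "needle in a haystack" remark applies.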

