Are the SpeakProgressEventArgs of the SpeechSynthesizer inaccurate?


Question

Using the System.Speech.Synthesis.SpeechSynthesizer class in .NET 3.5, the AudioPosition property of SpeakProgressEventArgs appears to be inaccurate.

The following code produces the output listed after it.

Code:

using System;
using System.Speech.Synthesis;
using System.Threading;

namespace SpeechTest
{
    class Program
    {
        static ManualResetEvent speechDoneEvent = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            SpeechSynthesizer synthesizer = new SpeechSynthesizer();

            synthesizer.SpeakProgress += new EventHandler<SpeakProgressEventArgs>(synthesizer_SpeakProgress);

            synthesizer.SpeakCompleted += new EventHandler<SpeakCompletedEventArgs>(synthesizer_SpeakCompleted);

            synthesizer.SetOutputToWaveFile("Test.wav");

            synthesizer.SpeakAsync("This holiday season, support the music you love by shopping at Made in Washington, online and at one of five local stores. Made in Washington chocolates, bountiful gift baskets and ornaments are the perfect holiday gifts for family, friends and co-workers.");

            speechDoneEvent.WaitOne();
        }

        static void synthesizer_SpeakCompleted(object sender, SpeakCompletedEventArgs e)
        {
            speechDoneEvent.Set();
        }

        static void synthesizer_SpeakProgress(object sender, SpeakProgressEventArgs e)
        {
            Console.WriteLine("SpeakProgress: AudioPosition=" + e.AudioPosition + ",\tCharacterPosition=" + e.CharacterPosition + ",\tCharacterCount=" + e.CharacterCount + ",\tText=" + e.Text);
        }
    }
}



Output:

SpeakProgress: AudioPosition=00:00:00.0043750,  CharacterPosition=0,    CharacterCount=4,       Text=This
SpeakProgress: AudioPosition=00:00:00.2925625,  CharacterPosition=5,    CharacterCount=7,       Text=holiday
SpeakProgress: AudioPosition=00:00:00.9086250,  CharacterPosition=13,   CharacterCount=6,       Text=season
SpeakProgress: AudioPosition=00:00:01.9421250,  CharacterPosition=21,   CharacterCount=7,       Text=support
SpeakProgress: AudioPosition=00:00:02.5621250,  CharacterPosition=29,   CharacterCount=3,       Text=the
SpeakProgress: AudioPosition=00:00:02.6760625,  CharacterPosition=33,   CharacterCount=5,       Text=music
SpeakProgress: AudioPosition=00:00:03.2648125,  CharacterPosition=39,   CharacterCount=3,       Text=you
SpeakProgress: AudioPosition=00:00:03.5199375,  CharacterPosition=43,   CharacterCount=4,       Text=love
SpeakProgress: AudioPosition=00:00:03.8435625,  CharacterPosition=48,   CharacterCount=2,       Text=by
SpeakProgress: AudioPosition=00:00:04.0701875,  CharacterPosition=51,   CharacterCount=8,       Text=shopping
SpeakProgress: AudioPosition=00:00:04.6840625,  CharacterPosition=60,   CharacterCount=2,       Text=at
SpeakProgress: AudioPosition=00:00:04.8036250,  CharacterPosition=63,   CharacterCount=4,       Text=Made
SpeakProgress: AudioPosition=00:00:05.0698125,  CharacterPosition=68,   CharacterCount=2,       Text=in
SpeakProgress: AudioPosition=00:00:05.2521250,  CharacterPosition=71,   CharacterCount=10,      Text=Washington
SpeakProgress: AudioPosition=00:00:06.2961875,  CharacterPosition=83,   CharacterCount=6,       Text=online
SpeakProgress: AudioPosition=00:00:07.0540625,  CharacterPosition=90,   CharacterCount=3,       Text=and
SpeakProgress: AudioPosition=00:00:07.3331250,  CharacterPosition=94,   CharacterCount=2,       Text=at
SpeakProgress: AudioPosition=00:00:07.6818750,  CharacterPosition=97,   CharacterCount=3,       Text=one
SpeakProgress: AudioPosition=00:00:08.0598750,  CharacterPosition=101,  CharacterCount=2,       Text=of
SpeakProgress: AudioPosition=00:00:08.2163750,  CharacterPosition=104,  CharacterCount=4,       Text=five
SpeakProgress: AudioPosition=00:00:08.5971875,  CharacterPosition=109,  CharacterCount=5,       Text=local
SpeakProgress: AudioPosition=00:00:09.0243750,  CharacterPosition=115,  CharacterCount=6,       Text=stores
SpeakProgress: AudioPosition=00:00:10.5325625,  CharacterPosition=123,  CharacterCount=4,       Text=Made
SpeakProgress: AudioPosition=00:00:10.7700625,  CharacterPosition=128,  CharacterCount=2,       Text=in
SpeakProgress: AudioPosition=00:00:10.9377500,  CharacterPosition=131,  CharacterCount=10,      Text=Washington
SpeakProgress: AudioPosition=00:00:11.6708125,  CharacterPosition=142,  CharacterCount=10,      Text=chocolates
SpeakProgress: AudioPosition=00:00:12.9798750,  CharacterPosition=154,  CharacterCount=9,       Text=bountiful
SpeakProgress: AudioPosition=00:00:13.6303125,  CharacterPosition=164,  CharacterCount=4,       Text=gift
SpeakProgress: AudioPosition=00:00:14.0959375,  CharacterPosition=169,  CharacterCount=7,       Text=baskets
SpeakProgress: AudioPosition=00:00:14.7848125,  CharacterPosition=177,  CharacterCount=3,       Text=and
SpeakProgress: AudioPosition=00:00:15.0507500,  CharacterPosition=181,  CharacterCount=9,       Text=ornaments
SpeakProgress: AudioPosition=00:00:15.7195000,  CharacterPosition=191,  CharacterCount=3,       Text=are
SpeakProgress: AudioPosition=00:00:15.9872500,  CharacterPosition=195,  CharacterCount=3,       Text=the
SpeakProgress: AudioPosition=00:00:16.1488750,  CharacterPosition=199,  CharacterCount=7,       Text=perfect
SpeakProgress: AudioPosition=00:00:16.7275000,  CharacterPosition=207,  CharacterCount=7,       Text=holiday
SpeakProgress: AudioPosition=00:00:17.3336875,  CharacterPosition=215,  CharacterCount=5,       Text=gifts
SpeakProgress: AudioPosition=00:00:17.9813125,  CharacterPosition=221,  CharacterCount=3,       Text=for
SpeakProgress: AudioPosition=00:00:18.2216875,  CharacterPosition=225,  CharacterCount=6,       Text=family
SpeakProgress: AudioPosition=00:00:19.0973750,  CharacterPosition=233,  CharacterCount=7,       Text=friends
SpeakProgress: AudioPosition=00:00:19.7726250,  CharacterPosition=241,  CharacterCount=3,       Text=and
SpeakProgress: AudioPosition=00:00:19.9655625,  CharacterPosition=245,  CharacterCount=10,      Text=co-workers
SpeakProgress: AudioPosition=00:00:20.2518750,  CharacterPosition=245,  CharacterCount=10,      Text=co-workers

However, the duration of the .wav file produced is 15.69 seconds. The same behavior occurs if you output to a Stream or to null.

The documentation for the property describes it as "A TimeSpan object that represents the time position of the event in the audio output stream".

Should it be an accurate time indicating when the word starts or finishes being spoken in the output file, or am I misinterpreting it?

Recommended answer

The AudioPosition depends on the voice selected for the speech synthesizer. For some Microsoft voices, such as Anna, Zira, David, and Hazel, the supported audio format in my experience is 16000 Hz PCM. The following change corrects the audio position:

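// Requires: using System.Speech.AudioFormat;
// Arguments: encoding, samples/sec, bits/sample, channels, avg bytes/sec, block align, format-specific data
var format = new SpeechAudioFormatInfo(EncodingFormat.Pcm,
                                       16000, 16, 1, 32000, 2, null);
synthesizer.SetOutputToWaveFile("Test.wav", format);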

Note that the default sample rate of SetOutputToWaveFile is 22050 Hz, and the ratio of the correct time (15.69 s) to the time reported by AudioPosition (20.25 s) is about 0.77. Multiplying 22050 by this ratio gives roughly 17000, which is close to 16000, the correct sample rate. The estimate runs slightly high because 20.25 s is the position of the last progress event, not the end of the audio.
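
The ratio argument above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the class name AudioPositionScaling and its two helpers are hypothetical, and it assumes the mismatch is a pure sample-rate scaling between a 16000 Hz voice and a 22050 Hz output stream, which the answer suggests but the documentation does not confirm. Setting the output format as shown earlier remains the actual fix; rescaling would only matter if the format cannot be changed.

```csharp
using System;

static class AudioPositionScaling
{
    // Rates taken from the answer. Both are assumptions about this setup:
    // the default output stream runs at 22050 Hz, the voice at 16000 Hz.
    public const double AssumedOutputRate = 22050.0;
    public const double AssumedVoiceRate = 16000.0;

    // Hypothetical helper: sample rate implied by the real file duration
    // and the duration suggested by the reported AudioPosition values.
    public static double ImpliedRate(double actualSeconds, double reportedSeconds)
    {
        return AssumedOutputRate * actualSeconds / reportedSeconds;
    }

    // Hypothetical helper: estimates where a reported AudioPosition falls
    // in the real output file, assuming a pure sample-rate scaling.
    public static TimeSpan Rescale(TimeSpan reported)
    {
        double scaledTicks = reported.Ticks * AssumedVoiceRate / AssumedOutputRate;
        return TimeSpan.FromTicks((long)Math.Round(scaledTicks));
    }

    static void Main()
    {
        // Durations quoted in the question and answer.
        Console.WriteLine(ImpliedRate(15.69, 20.25));

        // Last AudioPosition value from the output listing.
        Console.WriteLine(Rescale(TimeSpan.Parse("00:00:20.2518750")));
    }
}
```

Under these assumptions the last progress event lands a little before the 15.69 second end of the file, which is consistent with the final word still having to be spoken.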
