How to extract audio in blocks and convert it to text?

Problem description

I have code that converts small audio files to text, but it uses a StringBuilder to collect the transcript of the complete file and only displays the text once the whole file has been processed. Instead, I want to extract the audio in blocks and display the text of each block, one after another, as it is recognized, so that the pause before any text appears is reduced. Please help me resolve this.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.Speech;
using System.Speech.Recognition;

namespace spechToText
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            OpenFileDialog open = new OpenFileDialog();
            // open.Filter = "AUDIO file|* .wav |file audio|* .wma";
            if(open.ShowDialog() == DialogResult.OK)
            {
                SpeechRecognitionEngine sre = new SpeechRecognitionEngine();
                Grammar gr = new DictationGrammar();
                sre.LoadGrammar(gr);
                sre.SetInputToWaveFile(open.FileName);
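                // TimeSpan(long) takes ticks of 100 ns: Int32.MaxValue ticks is about 214.7 s.
                sre.BabbleTimeout = new TimeSpan(Int32.MaxValue);
                sre.InitialSilenceTimeout = new TimeSpan(Int32.MaxValue);
                // 100000000 ticks = 10 s of trailing silence before an utterance is finalized.
                sre.EndSilenceTimeout = new TimeSpan(100000000);
                sre.EndSilenceTimeoutAmbiguous = new TimeSpan(100000000);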

                StringBuilder sb = new StringBuilder();
                while (true)
                {
                    try
                    {
                        var recText = sre.Recognize();
                        if (recText == null)
                        {
                            break;
                        }

                        sb.Append(recText.Text);
                    }
                    catch (Exception ex)
                    {
                        //handle exception      
                        //...

                        break;
                    }
                }
                richTextBox1.AppendText(sb.ToString());
            }
        }

        private void Form1_Load(object sender, EventArgs e)
        {

        }

        private void richTextBox1_TextChanged(object sender, EventArgs e)
        {

        }
    }
}

Recommended answer

What you need is a concordance of text and audio position. For each recognized block of text (an "utterance"), there is also a start position and an end position in the audio. Look at the documentation for the speech recognition engine you use.

Then, of course, do not merely save the text; save an ordered list or array of these concordance objects.
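
As a rough illustration of that idea with System.Speech, the sketch below records one concordance entry per recognized utterance and prints each block as soon as it arrives, instead of waiting for the whole file. It is a minimal console sketch, not a definitive implementation: the `Utterance` class and the `BlockTranscriber` name are hypothetical, and it assumes the engine raises `RecognizeCompleted` once the wave file has been fully read. The start position comes from `RecognizedAudio.AudioPosition` and the end position is computed as start plus `RecognizedAudio.Duration`.

```csharp
using System;
using System.Collections.Generic;
using System.Speech.Recognition;
using System.Threading;

// Hypothetical concordance entry: one recognized block of text
// together with its start and end position in the audio.
public sealed class Utterance
{
    public string Text { get; private set; }
    public TimeSpan Start { get; private set; }
    public TimeSpan End { get; private set; }

    public Utterance(string text, TimeSpan start, TimeSpan end)
    {
        Text = text;
        Start = start;
        End = end;
    }
}

public static class BlockTranscriber
{
    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: BlockTranscriber <file.wav>");
            return;
        }

        // Ordered list of concordance objects, one per utterance.
        var concordance = new List<Utterance>();

        using (var done = new ManualResetEventSlim(false))
        using (var sre = new SpeechRecognitionEngine())
        {
            sre.LoadGrammar(new DictationGrammar());
            sre.SetInputToWaveFile(args[0]);

            // Raised once per recognized block, so each block can be shown immediately.
            sre.SpeechRecognized += (sender, e) =>
            {
                RecognizedAudio audio = e.Result.Audio; // treated as optional here
                TimeSpan start = audio != null ? audio.AudioPosition : TimeSpan.Zero;
                TimeSpan end = audio != null ? start + audio.Duration : start;

                var utterance = new Utterance(e.Result.Text, start, end);
                concordance.Add(utterance);
                Console.WriteLine("[{0} - {1}] {2}", utterance.Start, utterance.End, utterance.Text);
            };

            // Assumption: raised when the input ends or recognition fails.
            sre.RecognizeCompleted += (sender, e) =>
            {
                if (e.Error != null)
                {
                    Console.Error.WriteLine(e.Error.Message);
                }
                done.Set();
            };

            // Multiple mode keeps recognizing utterance after utterance
            // without blocking the calling thread.
            sre.RecognizeAsync(RecognizeMode.Multiple);
            done.Wait();
        }

        Console.WriteLine("{0} block(s) recognized.", concordance.Count);
    }
}
```

In the Windows Forms code from the question, the same handler could call `richTextBox1.AppendText` in place of `Console.WriteLine`. If the event turns out to arrive on a non-UI thread, the call would need to be marshalled with `BeginInvoke`. Because `RecognizeAsync` returns immediately, the form stays responsive while the remaining audio is processed.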
