Speech-to-text conversion using Sphinx in Java does not produce the correct result


Question

I have been trying to convert speech to text using the Sphinx package in Java, but I cannot understand why it is not producing the tokens correctly.

Below is the code.

.java file

package speechtotext;

import edu.cmu.sphinx.frontend.util.Microphone;
import edu.cmu.sphinx.recognizer.Recognizer;
import edu.cmu.sphinx.result.Result;
import edu.cmu.sphinx.util.props.ConfigurationManager;
import java.awt.*;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.*;


public class HelloWorld extends JApplet  implements ActionListener{
 
  private JButton b1 = new JButton("SPEAK"), b2 = new JButton("STOP");
  JTextArea textArea = new JTextArea(7,30);
  Result result;
  ConfigurationManager cm;
  Recognizer recognizer;
  Microphone microphone;
  String resultText ;
  
  public void init() {
    Container cp = getContentPane();
    cp.setLayout(new FlowLayout());
    Image image = Toolkit.getDefaultToolkit().createImage("C:\\Users\\arsa\\Desktop\\1.png");
    Image scaled = image.getScaledInstance(300, 550, Image.SCALE_SMOOTH);
    JLabel label = new JLabel(new ImageIcon(scaled));
    cp.add(label,BorderLayout.CENTER);
    textArea.setText("");
    textArea.setLineWrap(true);
    textArea.setEditable(false);
    add(textArea,"Center");
    cp.add(b2,FlowLayout.LEFT);
    cp.add(b1,FlowLayout.LEFT);
    cp.add(textArea);
    b1.addActionListener(this);
    b2.addActionListener(this);
    cm = new ConfigurationManager(HelloWorld.class.getResource("helloworld.config.xml"));
    recognizer = (Recognizer) cm.lookup("recognizer");
    System.out.println("Successful1 allocation");
    recognizer.allocate();
    System.out.println("Successful1 allocation1");
    microphone = (Microphone) cm.lookup("microphone");
     if (!microphone.startRecording()) {
            System.out.println("Cannot start microphone.");
            recognizer.deallocate();
            System.exit(1);
        }
  }
@Override
    public void actionPerformed(ActionEvent e) {
    String str=e.getActionCommand();
     if (e.getSource() == b1)
     {
         result = recognizer.recognize();                
     }   
     else  if (e.getSource() == b2)
     {
         if (result != null) {
             resultText = result.getBestPronunciationResult();
             if(resultText!=null)
                textArea.setText("You said: " + resultText + '\n');
            else if(resultText==null)
                textArea.setText("I couldn't hear what you said.\n");
         }
         else if(result==null)
             textArea.setText("Cheater!! Cheater!! you didn't say anything....\n");
     }
   }
}








.gram file

#JSGF V1.0;

/**
 * JSGF Grammar for Hello World example
 */

grammar hello;

public <greet> = (Good morning | Hello) ( Bhiksha | Evandro | Paul | Philip | Rita | Will );

Recommended answer

The problem might be your accent. You can address this by modifying the pronunciation dictionary that ships with the default acoustic model (the list of phonemes for each word).

In Sphinx, this dictionary is a plain text file. It contains lines like the following:

HELLO	HH AH L OW
HELLO(2)	HH EH L OW
THANKS	TH AE NG K S
YOUR	Y AO R
YOUR(2)	Y UH R








TH AE NG K S is the set of phonemes for the word "THANKS". You can modify these phonemes to suit your pronunciation.
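
To make the entry format concrete, here is a minimal sketch that parses lines like the ones above into a map from each word to its pronunciation variants. The class name `PronunciationDictionary` is hypothetical and not part of Sphinx, and the sketch assumes that the head word is separated from its phonemes by whitespace and that alternate pronunciations carry a numeric suffix such as `(2)`, as in the sample entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; this is not a Sphinx class.
public class PronunciationDictionary {

    /**
     * Parses dictionary lines such as "HELLO(2)\tHH EH L OW" into a map
     * from the base word to all of its phoneme sequences.
     */
    public static Map<String, List<List<String>>> parse(List<String> lines) {
        Map<String, List<List<String>>> entries = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+");
            // Assumption: a variant marker like "(2)" only appears at the end of the head word.
            String word = parts[0].replaceAll("\\(\\d+\\)$", "");
            List<String> phonemes =
                    new ArrayList<>(Arrays.asList(parts).subList(1, parts.length));
            entries.computeIfAbsent(word, k -> new ArrayList<>()).add(phonemes);
        }
        return entries;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "HELLO\tHH AH L OW",
                "HELLO(2)\tHH EH L OW",
                "THANKS\tTH AE NG K S",
                "YOUR\tY AO R",
                "YOUR(2)\tY UH R");
        // Print each word followed by every pronunciation variant found for it.
        parse(sample).forEach((word, variants) ->
                System.out.println(word + " -> " + variants));
    }
}
```

Grouping variants under the base word makes it easy to see which pronunciations the recognizer can already match for a word before you edit the file.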

1. Find the WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar file and extract it.
2. Go to the edu\cmu\sphinx\model\acoustic\WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz\dict folder and open the "cmudict.0.6d" file in that folder.
3. Modify the entries to match your pronunciation and save the file.
4. Zip the extracted hierarchy back exactly as it was. The ZIP file should have the same base name as the JAR file.
5. Remove "WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar" from the project's CLASSPATH and add "WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.zip" in its place.
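
The edit in step 3 can be done by hand, but a small sketch may help show what a new entry looks like. The code below adds an alternate pronunciation for a word, numbering it after any existing variants. The class `DictionaryEditor`, the method `addVariant`, and the phoneme string `HH AA L OW` are hypothetical examples, not part of Sphinx; the sketch also assumes a tab between the word and its phonemes, as in the sample entries, and that variant numbers should be consecutive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; this is not a Sphinx class.
public class DictionaryEditor {

    /**
     * Returns a copy of the dictionary lines with one more pronunciation
     * for the given word, placed right after its existing entries.
     */
    public static List<String> addVariant(List<String> lines, String word, String phonemes) {
        List<String> out = new ArrayList<>(lines);
        int variants = 0;   // how many entries the word already has
        int lastIndex = -1; // position of the last existing entry for the word
        for (int i = 0; i < out.size(); i++) {
            String line = out.get(i).trim();
            if (line.isEmpty()) {
                continue;
            }
            String head = line.split("\\s+")[0];
            String base = head.replaceAll("\\(\\d+\\)$", "");
            if (base.equals(word)) {
                variants++;
                lastIndex = i;
            }
        }
        // First entry has no suffix; later ones are WORD(2), WORD(3), ...
        String head = (variants == 0) ? word : word + "(" + (variants + 1) + ")";
        String entry = head + "\t" + phonemes;
        if (lastIndex >= 0) {
            out.add(lastIndex + 1, entry);
        } else {
            out.add(entry);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> dict = Arrays.asList(
                "HELLO\tHH AH L OW",
                "HELLO(2)\tHH EH L OW",
                "THANKS\tTH AE NG K S");
        // "HH AA L OW" is a made-up pronunciation used only as an example.
        addVariant(dict, "HELLO", "HH AA L OW").forEach(System.out::println);
    }
}
```

To apply this to the real file, you would read the lines of cmudict.0.6d from the extracted folder, pass them through `addVariant`, and write the result back before re-zipping in step 4.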

You can add more words using the following tool:
http://www.speech.cs.cmu.edu/tools/lmtool-new.html

