Make Sphinx4 recognize all numbers using a custom .gram file
Problem description
Description
A speech-recognition calculator written in Java using the Sphinx4 library.
The full code is on GitHub: here
The gram file I am using is below (also on GitHub):
#JSGF V1.0;
/**
* JSGF Grammar
*/
grammar grammar;
public <syntax> = (one | two | three| four| five | six | seven | eight | nine | ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen | twenty)
(plus | minus | multiply | division)
(one | two | three| four| five | six | seven | eight | nine | ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen | twenty);
Question:
I want the program to be able to recognize numbers from 0 to 1 million in English.
In its current state, as you can see, it can recognize the numbers (one | two | three | four | five | six | seven | eight | nine | ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen | twenty), because I have written them manually into the gram file.
To cover the whole range I would have to write every number into the gram file by hand (I could write a program to generate the file), but even then it seems impossible (some pattern may exist); the file would run to gigabytes.
Finally:
Is there any smart solution? Thanks for the effort :)
The new grammar after Nikolay's solution is:
public <number> = (one | two | three | four | five | six | seven | eight | nine | ten
| eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen | twenty
| thirty | forty | fifty | sixty | seventy | eighty | ninety | hundred | thousand | million | billion)+;
public <syntax> = <number>{1} (plus | minus | multiply | division){1} <number>{1};
Recommended answer
The smartest solution is to recognize a text string first. The grammar should not be complex; it should just list the words used in numbers:
grammar number;
public <number> = (one | two | three | four | five | six | seven | eight |
nine | ten | eleven | twelve | thirteen | fourteen | fifteen |
sixteen | seventeen | eighteen | nineteen | twenty | thirty | forty |
fifty | sixty | seventy | eighty | ninety | hundred | thousand |
million | and )*;
Once the text is recognized, convert it to a number. See "How to convert words to a number?" for details.
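The conversion step could look roughly like the sketch below. It is not taken from the linked question or from Sphinx4; the class name WordsToNumber and the method parse are hypothetical, and it assumes the recognizer returns a well-formed English number as a space-separated, lower-case-compatible string. It also accepts "zero", which the grammar above does not list, and it stops at "million".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: converts recognized number words to a long value.
final class WordsToNumber {
    // Words that add a fixed value to the group being built (0..90).
    private static final Map<String, Integer> SMALL = new HashMap<>();
    // Words that close a group by multiplying it (thousand, million).
    private static final Map<String, Long> SCALE = new HashMap<>();

    static {
        String[] units = {"zero", "one", "two", "three", "four", "five", "six",
            "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen",
            "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
        for (int i = 0; i < units.length; i++) {
            SMALL.put(units[i], i);
        }
        String[] tens = {"twenty", "thirty", "forty", "fifty", "sixty",
            "seventy", "eighty", "ninety"};
        for (int i = 0; i < tens.length; i++) {
            SMALL.put(tens[i], (i + 2) * 10);
        }
        SCALE.put("thousand", 1_000L);
        SCALE.put("million", 1_000_000L);
    }

    static long parse(String text) {
        long total = 0;   // sum of groups already closed by a scale word
        long current = 0; // group currently being built (below 1000)
        for (String word : text.trim().toLowerCase().split("\\s+")) {
            if (word.isEmpty() || word.equals("and")) {
                continue; // "and" carries no value: "one hundred and five"
            }
            if (SMALL.containsKey(word)) {
                current += SMALL.get(word);
            } else if (word.equals("hundred")) {
                // A bare "hundred" is treated as "one hundred".
                current = (current == 0 ? 1 : current) * 100;
            } else if (SCALE.containsKey(word)) {
                // A bare "thousand" is treated as "one thousand".
                total += (current == 0 ? 1 : current) * SCALE.get(word);
                current = 0;
            } else {
                throw new IllegalArgumentException("Unknown number word: " + word);
            }
        }
        return total + current;
    }
}
```

For example, WordsToNumber.parse("two thousand five") yields 2005. Because the grammar accepts any sequence of these words, the recognizer can also return strings that are not valid numbers (such as "hundred hundred"); this sketch does not reject them, so stricter validation would be needed in a real calculator.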