How do speech recognition algorithms recognize homophones?

Question

I was pondering this question earlier. What clues do modern algorithms (specifically those that convert voice to text) use to determine which homophone was said (e.g. "to", "too", or "two")?

Do they use contextual clues? Sentence structure? Perhaps there are slight differences in the way each word is usually pronounced (for example, I usually hold the o sound longer in two than in to). A combination of the first two seems most plausible.

Recommended answer

Do they use contextual clues?

Yes, ASR systems use cross-word context. For example, if the previous word is "going", the next word is likely to be "to", not "two". ASR systems assign probabilities to the candidate word sequences and select the most probable decoding.
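To make the "going to" example concrete, here is a minimal sketch of a bigram language model choosing among homophones based on the previous word. The toy corpus, the add-one smoothing, and the function names `bigram_prob` and `pick_homophone` are assumptions made for illustration; real ASR systems use far larger models trained on large text collections.

```python
from collections import Counter

# Toy training text (an assumption for illustration); a real system would
# estimate these counts from a large corpus.
CORPUS = (
    "i am going to the store . "
    "we are going to the park . "
    "she is going to school too . "
    "i have two cats . "
    "he ate two apples . "
    "that is too much ."
).split()

unigrams = Counter(CORPUS)
bigrams = Counter(zip(CORPUS, CORPUS[1:]))
VOCAB_SIZE = len(unigrams)

def bigram_prob(prev_word, word):
    """Estimate P(word | prev_word) with add-one (Laplace) smoothing."""
    return (bigrams[(prev_word, word)] + 1) / (unigrams[prev_word] + VOCAB_SIZE)

def pick_homophone(prev_word, candidates):
    """Return the candidate the bigram model rates most likely after prev_word."""
    return max(candidates, key=lambda w: bigram_prob(prev_word, w))

HOMOPHONES = ["to", "too", "two"]
print(pick_homophone("going", HOMOPHONES))
print(pick_homophone("have", HOMOPHONES))
```

With this toy corpus, "going" is followed by "to" three times and never by "too" or "two", so the model prefers "to" after "going" and "two" after "have".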

Sentence structure?

Yes, ASR systems also use more advanced language models to predict probable words given the context.

Perhaps there are slight differences in the way each word is usually pronounced (for example, I usually hold the o sound longer in two than in to).

That too. In fact, "too" and "to" are often pronounced quite differently: in running speech the vowel in "to" is frequently reduced to a schwa.
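Acoustic and contextual evidence are usually combined rather than used separately. The sketch below adds a log acoustic score to a weighted log language-model score for each candidate and picks the highest total. All probabilities here are made-up numbers for illustration, and the function name `decode` and the parameter `lm_weight` are hypothetical, not taken from any particular toolkit.

```python
import math

def decode(acoustic_logp, lm_logp, lm_weight=1.0):
    """Pick the word maximizing log P(audio | word) + lm_weight * log P(word | context)."""
    scores = {w: acoustic_logp[w] + lm_weight * lm_logp[w] for w in acoustic_logp}
    return max(scores, key=scores.get)

# Hypothetical acoustic scores: the audio alone slightly favors "too".
acoustic = {
    "to": math.log(0.30),
    "too": math.log(0.40),
    "two": math.log(0.30),
}

# Hypothetical language-model scores for the word following "going".
lm_after_going = {
    "to": math.log(0.90),
    "too": math.log(0.05),
    "two": math.log(0.05),
}

print(decode(acoustic, lm_after_going))                 # context included
print(decode(acoustic, lm_after_going, lm_weight=0.0))  # acoustics only
```

With these numbers the acoustics alone pick "too", but once the context after "going" is taken into account the combined score favors "to".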

If you are interested in speech recognition algorithms, it may make sense to read a book on ASR or take an online course. For details, see:

https://sourceforge.net/p/cmusphinx/discussion/speech-recognition/thread/3ea89abf/
