Python:如何将字符串“ub"添加到字符串中的每个发音元音之前? [英] Python: How to prepend the string 'ub' to every pronounced vowel in a string?

查看:22
本文介绍了Python:如何将字符串“ub"添加到字符串中的每个发音元音之前?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

示例:说 -> Spubeak,更多信息在这里

Example: Speak -> Spubeak, more info here

不要给我一个解决方案,而是给我指明正确的方向或告诉我可以使用哪个 python 库?我正在考虑使用正则表达式,因为我必须找到一个元音,但是我可以使用哪种方法在元音前插入 'ub'?

Don't give me a solution, but point me in the right direction or tell which which python library I could use? I am thinking of regex since I have to find a vowel, but then which method could I use to insert 'ub' in front of a vowel?

推荐答案

它比简单的正则表达式更复杂 例如,

It is more complex then just a simple regex e.g.,

"Hi, how are you?" → "Hubi, hubow ubare yubou?"

简单的正则表达式不会发现 eare 中不发音.

Simple regex won't catch that e is not pronounced in are.

您需要一个提供发音词典的库,例如nltk.corpus.cmudict:

You need a library that provides a pronunciation dictionary such as nltk.corpus.cmudict:

from nltk.corpus import cmudict # $ pip install nltk
# $ python -c "import nltk; nltk.download('cmudict')"

def spubeak(word, pronunciations=cmudict.dict()):
    istitle = word.istitle() # remember, to preserve titlecase
    w = word.lower() #note: ignore Unicode case-folding
    for syllables in pronunciations.get(w, []):
        parts = []
        for syl in syllables:
            if syl[:1] == syl[1:2]:
                syl = syl[1:] # remove duplicate
            isvowel = syl[-1].isdigit()
            # pronounce the word
            parts.append('ub'+syl[:-1] if isvowel else syl)
        result = ''.join(map(str.lower, parts))
        return result.title() if istitle else result
    return word # word not found in the dictionary

示例:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

sent = "Hi, how are you?"
subent = " ".join(["".join(map(spubeak, re.split("(W+)", nonblank)))
                   for nonblank in sent.split()])
print('"{}" → "{}"'.format(sent, subent))

输出

"Hi, how are you?" → "Hubay, hubaw ubar yubuw?"

注意:它与第一个示例不同:每个单词都替换为其音节.

Note: It is different from the first example: each word is replaced with its syllables.

这篇关于Python:如何将字符串“ub"添加到字符串中的每个发音元音之前?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆