How to create custom voice text to speech using a deployed endpoint in Python


Problem description


Hi, I would like to create a system in Python that converts text to speech using my own voice. I have deployed a voice font and an endpoint for my voice. This is the code for text to speech in Python using the default voice.

'''
After you've set your subscription key, run this application from your working
directory with this command: python TTSSample.py
'''
import os, requests, time
from xml.etree import ElementTree

# This code is required for Python 2.7
try: input = raw_input
except NameError: pass

'''
If you prefer, you can hardcode your subscription key as a string and remove
the provided conditional statement. However, we do recommend using environment
variables to secure your subscription keys. The environment variable is
set to SPEECH_SERVICE_KEY in our sample.

For example:
'''
subscription_key = "f32eb9fbbd4748f19bf48226512bc0dd"

'''
if 'SPEECH_SERVICE_KEY' in os.environ:
    subscription_key = os.environ['SPEECH_SERVICE_KEY']
else:
    print('Environment variable for your subscription key is not set.')
    exit()
'''
class TextToSpeech(object):
    def __init__(self, subscription_key):
        self.subscription_key = subscription_key
        self.tts = input("What would you like to convert to speech: ")
        self.timestr = time.strftime("%Y%m%d-%H%M")
        self.access_token = None

    '''
    The TTS endpoint requires an access token. This method exchanges your
    subscription key for an access token that is valid for ten minutes.
    '''
    def get_token(self):
        fetch_token_url = "https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken"
        headers = {
            'Ocp-Apim-Subscription-Key': self.subscription_key
        }
        response = requests.post(fetch_token_url, headers=headers)
        self.access_token = str(response.text)

    def save_audio(self):
        base_url = 'https://westus.tts.speech.microsoft.com/'
        path = 'cognitiveservices/v1'
        constructed_url = base_url + path
        headers = {
            'Authorization': 'Bearer ' + self.access_token,
            'Content-Type': 'application/ssml+xml',
            'X-Microsoft-OutputFormat': 'riff-24khz-16bit-mono-pcm',
            'User-Agent': 'FYP2'
        }
        xml_body = ElementTree.Element('speak', version='1.0')
        xml_body.set('{http://www.w3.org/XML/1998/namespace}lang', 'en-us')
        voice = ElementTree.SubElement(xml_body, 'voice')
        voice.set('{http://www.w3.org/XML/1998/namespace}lang', 'en-US')
        voice.set('name', 'Microsoft Server Speech Text to Speech Voice (en-US, Guy24KRUS)')
        voice.text = self.tts
        body = ElementTree.tostring(xml_body)

        response = requests.post(constructed_url, headers=headers, data=body)
        '''
        If a success response is returned, then the binary audio is written
        to file in your working directory. It is prefaced by sample and
        includes the date.
        '''
        if response.status_code == 200:
            with open('sample-' + self.timestr + '.wav', 'wb') as audio:
                audio.write(response.content)
                print("\nStatus code: " + str(response.status_code) + "\nYour TTS is ready for playback.\n")
        else:
            print("\nStatus code: " + str(response.status_code) + "\nSomething went wrong. Check your subscription key and headers.\n")

if __name__ == "__main__":
    app = TextToSpeech(subscription_key)
    app.get_token()
    app.save_audio()

Where should I change or add code so that it uses my own voice to convert the text to speech?

Solution

Hi Icarros,

Have you already made the voice recordings? If not, there are some things you need to complete before you can use your own voice in a TTS implementation. Please take a look at the following tutorial: 

Create your own Voice based application using Python

I hope this helps. I am not sure where you are in the process or what functionality you are expecting, but having your own voice be the voice of the translator is not a simple configuration change.

Mike
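
For orientation, two values in the question's script decide which voice speaks: the URL passed to requests.post in save_audio, and the name attribute of the voice element in the SSML body. The following is a minimal sketch, assuming the custom voice is already trained and deployed as the question states, that isolates both as parameters. CUSTOM_ENDPOINT_URL, CUSTOM_VOICE_NAME, and build_ssml are hypothetical names introduced here for illustration. The real URL and voice name would have to be taken from the details of your own deployment, and whether swapping them in is sufficient is an assumption that this thread does not confirm.

```python
from xml.etree import ElementTree

# Qualified name of the xml:lang attribute, as used in the original script.
XML_LANG = '{http://www.w3.org/XML/1998/namespace}lang'

# Hypothetical placeholders: replace with the values shown for your own
# deployed endpoint. Neither value is a real endpoint or voice.
CUSTOM_ENDPOINT_URL = 'https://example.invalid/your-deployed-endpoint'
CUSTOM_VOICE_NAME = 'YourCustomVoiceName'


def build_ssml(text, voice_name, locale='en-US'):
    """Build the SSML request body (bytes) for the given voice name."""
    speak = ElementTree.Element('speak', version='1.0')
    speak.set(XML_LANG, locale)
    voice = ElementTree.SubElement(speak, 'voice')
    voice.set(XML_LANG, locale)
    voice.set('name', voice_name)  # the only line that selects the voice
    voice.text = text
    return ElementTree.tostring(speak)


body = build_ssml('Hello from my own voice.', CUSTOM_VOICE_NAME)
print(body.decode('utf-8'))
```

Inside save_audio, the returned bytes could then be sent with requests.post(CUSTOM_ENDPOINT_URL, headers=headers, data=body) in place of the hard-coded westus URL and the Guy24KRUS voice name. The token request and headers from the original script are left unchanged in this sketch.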

