Text to audio file on Node.js

Question

I'm looking for an optimized, legal way to create an audio file from text in Node.js.

So far I see five options:

1) A simple HTTP request to the Google Translate text-to-speech API. This variant isn't good, because each request requires a generated token, e.g. 'tk:729008.879154'; without it the request can fail. Besides that, this option is 'illegal'.

2) An HTTP request to the Google Translate text-to-speech API from a 'console browser' (Puppeteer).

Is there a way to generate the right token key to make this request 'legal'?

3) Use the Web Speech API in Puppeteer to get the binary data and save it to a file? Or is there a way to work with the Chromium/Chrome source code?

4) Use another technology or language library on the machine running Node.js, with JavaScript acting as the glue that calls commands in that technology/program. Any ideas?

5) Is there any free public API with support for different languages (a dream API)?

Any suggestions would be appreciated.

Recommended answer

One possible approach is to wrap the eSpeak command-line tool (Windows and Linux), http://espeak.sourceforge.net/, and call it from Node.js.

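const { execFile } = require('child_process');

// Usage: node espeak.js [outputFile] [voice] [text]
const outputFile = process.argv[2] || "output.wav";
const voice = process.argv[3] || "en-uk-north";
const text = process.argv[4] || "hello there buddy";

// Pass the arguments as an array so the text is never interpreted by a shell.
// "espeak.exe" is the Windows binary name; on Linux use "espeak".
execFile("espeak.exe", ["-v", voice, "-w", outputFile, text], (err) => {
  if (err) {
    console.log("Error occurred: ", err);
    return;
  }
  console.log("Saved output to file: ", outputFile);
});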

This gives fairly low-quality output.
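The eSpeak command line also accepts speed (`-s`, words per minute), pitch (`-p`, 0 to 99), and amplitude (`-a`, 0 to 200) flags, which may be worth experimenting with. The following is only a sketch, assuming an `espeak` binary is on the PATH; `buildEspeakArgs` and `speakToFile` are hypothetical helper names, not part of any library, and whether tuning these values improves the perceived quality has not been tested here.

```javascript
const { execFile } = require('child_process');

const DEFAULT_OUTPUT = 'output.wav';

// Hypothetical helper: build the eSpeak argument list from an options object.
function buildEspeakArgs(text, { voice = 'en-uk-north', outputFile = DEFAULT_OUTPUT, speed, pitch, amplitude } = {}) {
  const args = ['-v', voice, '-w', outputFile];
  if (speed !== undefined) args.push('-s', String(speed));         // words per minute
  if (pitch !== undefined) args.push('-p', String(pitch));         // 0..99
  if (amplitude !== undefined) args.push('-a', String(amplitude)); // 0..200
  args.push(text); // the text to speak goes last
  return args;
}

// Hypothetical helper: run eSpeak without a shell and resolve with the file name.
// The binary name is an assumption; on Windows it would be "espeak.exe".
function speakToFile(text, options = {}, binary = 'espeak') {
  const args = buildEspeakArgs(text, options);
  return new Promise((resolve, reject) => {
    execFile(binary, args, (err) => {
      if (err) reject(err);
      else resolve(options.outputFile || DEFAULT_OUTPUT);
    });
  });
}
```

For example, `speakToFile('hello there buddy', { speed: 140, pitch: 40 })` would write `output.wav` in the current directory if eSpeak is installed.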

I've also played with the Bing Speech API and the output is very good; I've created a Node.js example. You would need to sign up for an API key, but this is very easy (go to https://azure.microsoft.com/en-us/try/cognitive-services/ and select "Speech").

const key = 'your api key here';

function synthesizeSpeech(apiKey)
{
    const fs = require('fs');
    const request = require('request');
    const xmlbuilder = require('xmlbuilder');
    const text = process.argv[2] || "The fault, dear Brutus, is not in our stars, But in ourselves, that we are underlings.";
    const outputFile = process.argv[3] || "speech.wav";

    var ssml_doc = xmlbuilder.create('speak')
        .att('version', '1.0')
        .att('xml:lang', 'en-au')
        .ele('voice')
        .att('xml:lang', 'en-au')
        .att('xml:gender', 'Female')
        .att('name', 'Microsoft Server Speech Text to Speech Voice (en-AU, HayleyRUS)')
        .txt(text)
        .end();
    var post_speak_data = ssml_doc.toString();

    console.log('Synthesizing speech: ', text);
    request.post({
        url: 'https://api.cognitive.microsoft.com/sts/v1.0/issueToken',
        headers: {
            'Ocp-Apim-Subscription-Key' : apiKey
        }
    }, function (err, resp, access_token) {
        if (err || resp.statusCode !== 200) {
            // resp is undefined when the request itself failed.
            console.log(err, resp && resp.body);
        } else {
            try {
                request.post({
                    url: 'https://speech.platform.bing.com/synthesize',
                    body: post_speak_data,
                    headers: {
                        'content-type' : 'application/ssml+xml',
                        'X-Microsoft-OutputFormat' : 'riff-16khz-16bit-mono-pcm',
                        'Authorization': 'Bearer ' + access_token,
                        'X-Search-AppId': '9FCF779F0EFB4E8E8D293EEC544221E9',
                        'X-Search-ClientID': '0A13B7717D0349E683C00A6AEA9E8B6D',
                        'User-Agent': 'Node.js-Demo'
                    },
                    encoding: null
                }, function (err, resp, data) {
                    if (err || resp.statusCode !== 200) {
                        // resp is undefined when the request itself failed.
                        console.log(err, resp && resp.body);
                    } else {
                        try {
                            console.log('Saving output to file: ', outputFile);
                            fs.writeFileSync(outputFile, data);
                        } catch (e) {
                            console.log(e.message);
                        }
                    }
                });
            } catch (e) {
                console.log(e.message);
            }
        }
    });
}

synthesizeSpeech(key);
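The body posted to the synthesize endpoint is a small SSML document. As an illustration of what the xmlbuilder calls above assemble, here is a sketch that builds roughly the same markup by hand, without the XML declaration. `escapeXml` and `buildSsml` are hypothetical helper names; the default voice attributes are copied from the example above.

```javascript
// Hypothetical helper: escape the five XML special characters in user text.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: build the <speak><voice>...</voice></speak> document.
function buildSsml(text, {
  lang = 'en-au',
  gender = 'Female',
  voiceName = 'Microsoft Server Speech Text to Speech Voice (en-AU, HayleyRUS)'
} = {}) {
  return (
    `<speak version="1.0" xml:lang="${escapeXml(lang)}">` +
    `<voice xml:lang="${escapeXml(lang)}" xml:gender="${escapeXml(gender)}" name="${escapeXml(voiceName)}">` +
    escapeXml(text) +
    '</voice></speak>'
  );
}
```

Escaping matters because the text comes from the command line: an unescaped `&` or `<` would produce invalid XML.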

Also check out the MARY project here: http://mary.dfki.de/. It is an open-source server that you can install; the voice output is very good, and you can call the server from Node.js.

If you install the MARY speech engine (quite easy):

"use strict";

const fs = require('fs');
const request = require('request');
const text = process.argv[2] || "The fault, dear Brutus, is not in our stars, But in ourselves, that we are underlings.";
const outputFile = process.argv[3] || "speech_mary_output.wav";

const options = {
    // URL-encode the text so spaces, commas, and '&' survive the query string.
    url: `http://localhost:59125/process?INPUT_TEXT=${encodeURIComponent(text)}&INPUT_TYPE=TEXT&OUTPUT_TYPE=AUDIO&AUDIO=WAVE_FILE&LOCALE=en_US&VOICE=cmu-slt-hsmm`,
    encoding: null // Binary data.
};

console.log('Synthesizing speech (using Mary engine): ', text);
console.log('Calling: ', options.url);
request.get(options, function (err, resp, data) {
    if (err || resp.statusCode !== 200) {
        // resp is undefined when the request itself failed.
        console.log(err, resp && resp.body);
    } else {
        try {
            console.log(`Saving output to file: ${outputFile}, length: ${data.length} byte(s)`);
            fs.writeFileSync(outputFile, data, { encoding: 'binary'});
        } catch (e) {
            console.log(e.message);
        }
    }
});

This will synthesize speech for you. No API key required!
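The same call to the MARY server can also be made with Node's built-in `http` module. The sketch below assumes the server is listening on localhost:59125 with the `cmu-slt-hsmm` voice installed, as in the script above; `buildMaryUrl` and `synthesizeWithMary` are hypothetical helper names.

```javascript
const fs = require('fs');
const http = require('http');

// Hypothetical helper: build the /process URL with every value URL-encoded.
function buildMaryUrl(text, { host = 'localhost', port = 59125, locale = 'en_US', voice = 'cmu-slt-hsmm' } = {}) {
  const params = {
    INPUT_TEXT: text,
    INPUT_TYPE: 'TEXT',
    OUTPUT_TYPE: 'AUDIO',
    AUDIO: 'WAVE_FILE',
    LOCALE: locale,
    VOICE: voice
  };
  const query = Object.entries(params)
    .map(([name, value]) => `${name}=${encodeURIComponent(value)}`)
    .join('&');
  return `http://${host}:${port}/process?${query}`;
}

// Hypothetical helper: stream the WAV response straight to a file.
function synthesizeWithMary(text, outputFile, options) {
  return new Promise((resolve, reject) => {
    http.get(buildMaryUrl(text, options), (res) => {
      if (res.statusCode !== 200) {
        res.resume(); // Discard the body so the socket is released.
        reject(new Error(`MARY server returned HTTP ${res.statusCode}`));
        return;
      }
      const file = fs.createWriteStream(outputFile);
      res.pipe(file);
      file.on('finish', () => resolve(outputFile));
      file.on('error', reject);
    }).on('error', reject);
  });
}
```

Calling `synthesizeWithMary('hello there buddy', 'speech_mary_output.wav')` returns a promise that resolves with the file name once the audio has been written.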
