How to create or convert text to audio in the Chromium browser?

Question

While trying to work out a solution for "How to use Web Speech API at chromium?", I found that

var voices = window.speechSynthesis.getVoices();

returns an empty array for the voices identifier.
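One thing worth ruling out before looking for workarounds: in Chrome-family browsers the voice list is usually populated asynchronously, so an immediate call to getVoices() can return an empty array even when voices exist. The sketch below waits for the voiceschanged event with a timeout. The helper name loadVoices and the timeout value are illustrative assumptions, not part of the original question, and on a Chromium build with no speech backend the result may still be empty.

```javascript
// Resolves with the voice list once it is non-empty, or with whatever is
// available (possibly an empty array) after timeoutMs.
// `synth` is expected to be window.speechSynthesis; it is a parameter so the
// helper can be exercised with a stand-in object. `loadVoices` is a
// hypothetical helper name.
function loadVoices(synth, timeoutMs = 2000) {
  return new Promise(resolve => {
    const initial = synth.getVoices();
    if (initial.length > 0) {
      resolve(initial);
      return;
    }
    const canListen = typeof synth.addEventListener === "function";
    let timer = null;
    const finish = () => {
      clearTimeout(timer);
      if (canListen) synth.removeEventListener("voiceschanged", finish);
      resolve(synth.getVoices());
    };
    if (canListen) synth.addEventListener("voiceschanged", finish);
    // Fall back to the current list if the event never fires
    timer = setTimeout(finish, timeoutMs);
  });
}

// Usage in a page (browser only):
// loadVoices(window.speechSynthesis).then(voices => console.log(voices.length));
```

If the promise still resolves with an empty array, the browser most likely has no speech engine available and one of the workarounds in the answer is needed.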

Not certain whether the lack of support in the Chromium browser is related to the issue described in "Not OK, Google: Chromium voice extension pulled after spying concerns".

Questions:

1) Are there any workarounds that can implement the requirement of creating or converting audio from text in the Chromium browser?

2) How can we, the developer community, create an open source database of audio files reflecting both common and uncommon words, served with appropriate CORS headers?

Recommended answer

Several possible workarounds were found that provide the ability to create audio from text. Two of them require requesting an external resource; the third uses meSpeak.js by @masswerk.

The first uses the approach described at "Download the Audio Pronunciation of Words from Google". Its drawback is that there is no way to determine in advance which words actually exist as a file at the resource without writing a shell script or performing a HEAD request to check whether a network error occurs. For example, the word "do" is not available at the resource used below.

window.addEventListener("load", () => {
  const textarea = document.querySelector("textarea");
  const audio = document.createElement("audio");
  // The fetched files are MP3, so the combined Blob is labelled audio/mpeg
  const mimetype = "audio/mpeg";
  audio.controls = true;
  document.body.appendChild(audio);
  audio.addEventListener("canplay", () => {
    // play() returns a Promise that rejects if autoplay is blocked
    audio.play().catch(err => console.log(err));
  });
  // match() returns null when the text contains no words
  const words = textarea.value.trim().match(/\w+/g) || [];
  const url = "https://ssl.gstatic.com/dictionary/static/sounds/de/0/";
  const mediatype = ".mp3";
  Promise.all(
    words.map(word => {
      // The YQL data.uri table returns the remote file as a data URI.
      // Note: Yahoo retired the public YQL service in January 2019, so this
      // endpoint is kept for reference and the request is expected to fail today.
      const yql = `select * from data.uri where url="${url}${word}${mediatype}"`;
      return fetch(`https://query.yahooapis.com/v1/public/yql?q=${encodeURIComponent(yql)}&format=json&callback=`)
        .then(response => response.json())
        .then(({query: {results: {url}}}) => fetch(url))
        .then(response => response.blob());
    })
  )
  .then(blobs => {
    // const a = document.createElement("a");
    audio.src = URL.createObjectURL(new Blob(blobs, {type: mimetype}));
    // a.download = words.join("-") + ".mp3";
    // a.click()
  })
  .catch(err => console.log(err));
});

<textarea>what it does my ninja?</textarea>
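The HEAD request mentioned before the example could be sketched roughly as follows. This is only an illustration under assumptions: the helper name filterAvailableWords and the injectable fetchImpl parameter are hypothetical, and the check only works where the resource can be requested at all (for example, from a same-origin proxy or a non-browser runtime), since a cross-origin HEAD request from a page is subject to the same CORS restrictions as a GET.

```javascript
// Returns the subset of `words` for which a HEAD request to
// `${baseUrl}${word}${extension}` succeeds. `fetchImpl` defaults to the
// global fetch and is a parameter only so it can be replaced with a stand-in.
// `filterAvailableWords` is a hypothetical helper name.
async function filterAvailableWords(words, baseUrl, extension = ".mp3", fetchImpl = fetch) {
  const checks = await Promise.all(
    words.map(async word => {
      try {
        const response = await fetchImpl(
          `${baseUrl}${encodeURIComponent(word)}${extension}`,
          { method: "HEAD" }
        );
        // response.ok is true for 2xx status codes
        return response.ok;
      } catch (err) {
        // Network and CORS failures reject; treat them as "not available"
        return false;
      }
    })
  );
  return words.filter((_, i) => checks[i]);
}

// Usage (the base URL is the one from the example above):
// filterAvailableWords(["what", "it", "do"], "https://ssl.gstatic.com/dictionary/static/sounds/de/0/")
//   .then(available => console.log(available));
```

Filtering first means a single missing word, such as "do", no longer causes the whole Promise.all() chain to reject.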

Resources at Wikimedia Commons Category:Public domain are not necessarily served from the same directory; see "How to retrieve Wiktionary word content?" and "Wiktionary API - meaning of words".

If the precise location of the resource is known, the audio can be requested, though the URL may include prefixes other than the word itself.

fetch("https://upload.wikimedia.org/wikipedia/commons/c/c5/En-uk-hello-1.ogg")
.then(response => response.blob())
.then(blob => new Audio(URL.createObjectURL(blob)).play());
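When only the file name is known and not its directory prefix (c/c5 in the URL above), one option is to let Wikimedia Commons resolve the location through its Special:FilePath page, which redirects to the stored file. The sketch below only builds that URL. The helper name commonsFilePathUrl is hypothetical, and the file name itself (including prefixes such as "En-uk-" and suffixes such as "-1") still has to be known in advance.

```javascript
// Builds a Wikimedia Commons URL that redirects to the stored file, so the
// hashed directory prefix does not have to be known in advance.
// `commonsFilePathUrl` is a hypothetical helper name.
function commonsFilePathUrl(fileName) {
  return "https://commons.wikimedia.org/wiki/Special:FilePath/" +
    encodeURIComponent(fileName);
}

// Usage in a page (browser only):
// new Audio(commonsFilePathUrl("En-uk-hello-1.ogg")).play();
```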

I am not entirely sure how to use the Wikipedia API (see "How to get Wikipedia content using Wikipedia's API?" and "Is there a clean wikipedia API just for retrieve content summary?") to get only the audio file. The JSON response would need to be parsed for text ending in .ogg, and a second request would then need to be made for the resource itself.

fetch("https://en.wiktionary.org/w/api.php?action=parse&format=json&prop=text&callback=?&page=hello")
.then(response => response.text())
.then(data => {
  new Audio(location.protocol + data.match(/\/\/upload\.wikimedia\.org\/wikipedia\/commons\/[\d-/]+[\w-]+\.ogg/).pop()).play()
})
// "//upload.wikimedia.org/wikipedia/commons/5/52/En-us-hello.ogg\"

logs

Fetch API cannot load https://en.wiktionary.org/w/api.php?action=parse&format=json&prop=text&callback=?&page=hello. No 'Access-Control-Allow-Origin' header is present on the requested resource

when not requested from the same origin. We would need to try YQL again, though I am not certain how to formulate the query to avoid errors.
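A possible alternative to YQL here: the MediaWiki API accepts an origin=* query parameter for anonymous cross-origin requests, which makes it send the Access-Control-Allow-Origin header that the error message complains about. The sketch below is a variation on the request above under that assumption. The helper names findOggUrl and fetchPronunciationUrl are hypothetical, and the regular expression assumes the usual one-character/two-character directory layout seen in the example URLs.

```javascript
// Extracts the first protocol-relative upload.wikimedia.org .ogg URL from a
// string of HTML, skipping "transcoded" derivative paths.
// `findOggUrl` is a hypothetical helper name.
function findOggUrl(html) {
  const match = html.match(
    /\/\/upload\.wikimedia\.org\/wikipedia\/commons\/[0-9a-f]\/[0-9a-f]{2}\/[^"'\s\\\/]+\.ogg/
  );
  return match ? "https:" + match[0] : null;
}

// Requests the rendered Wiktionary page for `word` and returns the first
// .ogg URL found, or null. `fetchImpl` is injectable for testing only.
async function fetchPronunciationUrl(word, fetchImpl = fetch) {
  const params = new URLSearchParams({
    action: "parse",
    format: "json",
    prop: "text",
    page: word,
    origin: "*" // asks the MediaWiki API to allow anonymous CORS requests
  });
  const response = await fetchImpl(`https://en.wiktionary.org/w/api.php?${params}`);
  const data = await response.json();
  const text = data && data.parse ? data.parse.text : null;
  // The default JSON format nests the HTML under "*"; formatversion=2 returns a string
  const html = typeof text === "string" ? text : (text && text["*"]) || "";
  return findOggUrl(html);
}

// Usage in a page (browser only):
// fetchPronunciationUrl("hello").then(url => url && new Audio(url).play());
```

Parsing the JSON and searching only the rendered HTML avoids matching against escaped characters in the raw response text.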

The third approach uses a slightly modified version of meSpeak.js to generate the audio in the browser, without requesting audio from an external service. The modification was to create a proper callback for the .loadConfig() method.

fetch("https://gist.githubusercontent.com/guest271314/f48ee0658bc9b948766c67126ba9104c/raw/958dd72d317a6087df6b7297d4fee91173e0844d/mespeak.js")
  .then(response => response.text())
  .then(text => {
    const script = document.createElement("script");
    script.textContent = text;
    document.body.appendChild(script);

    return Promise.all([
      new Promise(resolve => {
        meSpeak.loadConfig("https://gist.githubusercontent.com/guest271314/8421b50dfa0e5e7e5012da132567776a/raw/501fece4fd1fbb4e73f3f0dc133b64be86dae068/mespeak_config.json", resolve)
      }),
      new Promise(resolve => {
        meSpeak.loadVoice("https://gist.githubusercontent.com/guest271314/fa0650d0e0159ac96b21beaf60766bcc/raw/82414d646a7a7ef11bb04ddffe4091f78ef121d3/en.json", resolve)
      })
    ])
  })
  .then(() => {
    // takes approximately 14 seconds to get here
    console.log(meSpeak.isConfigLoaded());
    meSpeak.speak("what it do my ninja", {
      amplitude: 100,
      pitch: 5,
      speed: 150,
      wordgap: 1,
      variant: "m7"
    });
  })
  .catch(err => console.log(err));

One caveat of the above approach is that the three files take approximately 14.5 seconds to load before the audio is played back. However, it avoids external requests for the audio itself.

It would be a positive step to do either or both of the following: 1) create a FOSS, developer-maintained database or directory of sounds for both common and uncommon words; 2) further develop meSpeak.js to reduce the load time of the three necessary files, and use Promise-based approaches to provide notifications of file loading progress and application readiness.
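As a rough illustration of the Promise-based progress notification suggested in point 2, the sketch below wraps a list of loading promises and reports how many have completed. It is not part of meSpeak.js; the helper name withProgress and the shape of the progress object are assumptions made for this example.

```javascript
// Wraps an array of promises (or plain values) and calls onProgress with
// { done, total } each time one of them settles successfully. Resolves with
// the same array of values Promise.all() would produce.
// `withProgress` is a hypothetical helper name.
function withProgress(tasks, onProgress) {
  let done = 0;
  const total = tasks.length;
  return Promise.all(
    tasks.map(task =>
      Promise.resolve(task).then(value => {
        done += 1;
        onProgress({ done, total });
        return value;
      })
    )
  );
}

// Usage with the modified meSpeak.js from the answer, where loadConfig() and
// loadVoice() accept a callback (configUrl and voiceUrl are placeholders):
// withProgress([
//   new Promise(resolve => meSpeak.loadConfig(configUrl, resolve)),
//   new Promise(resolve => meSpeak.loadVoice(voiceUrl, resolve))
// ], ({ done, total }) => console.log(`loaded ${done} of ${total}`))
//   .then(() => meSpeak.speak("ready"));
```

The returned promise doubles as the readiness signal, so callers do not need a separate flag such as isConfigLoaded().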

In this user's estimation, it would be a useful resource if developers themselves created and contributed to an online database that responded with an audio file for a specific word. I am not entirely sure whether GitHub is the appropriate venue for hosting audio files, and will have to consider the possible options if interest in such a project is shown.
