How to create or convert text to audio at chromium browser?


Problem description


While trying to determine a solution to How to use Web Speech API at chromium?, it was found that

var voices = window.speechSynthesis.getVoices();

returns an empty array for the voices identifier.

Not certain whether the lack of support in the Chromium browser is related to this issue: Not OK, Google: Chromium voice extension pulled after spying concerns?
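Independent of the Chromium issue above, one common reason getVoices() returns an empty array is that the voice list is populated asynchronously. A minimal sketch of waiting for the voiceschanged event is below; getVoicesAsync is a hypothetical helper name, and the synthesis object is passed in as a parameter so the helper can be exercised with a stand-in. This does not help when no voices are installed at all.

```javascript
// Hypothetical helper: resolve with the voice list, waiting for the
// asynchronous "voiceschanged" event if the list is initially empty.
function getVoicesAsync(synth) {
  return new Promise(resolve => {
    const voices = synth.getVoices();
    if (voices.length) {
      resolve(voices);
      return;
    }
    synth.addEventListener("voiceschanged", () => {
      resolve(synth.getVoices());
    }, { once: true });
  });
}

// In a browser:
// getVoicesAsync(window.speechSynthesis).then(voices => console.log(voices));
```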

Questions:

1) Are there any workarounds which can implement the requirement of creating or converting audio from text at chromium browser?

2) How can we, the developer community, create an open source database of audio files reflecting both common and uncommon words; served with appropriate CORS headers?

Solution

Several possible workarounds have been found which provide the ability to create audio from text; two of them require requesting an external resource, while the other uses meSpeak.js by @masswerk.

The first uses the approach described at Download the Audio Pronunciation of Words from Google. It suffers from there being no way to determine in advance which words actually exist as files at the resource without writing a shell script or performing a HEAD request to check whether a network error occurs. For example, the word "do" is not available at the resource used below.
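The word-extraction and URL-construction step used in the snippet below can be isolated as a pure function. buildWordUrls is a hypothetical helper name; the base URL and .mp3 suffix are taken from the code that follows. Whether a given file actually exists at the resource still has to be checked separately, e.g. with a HEAD request.

```javascript
// Base URL of the gstatic dictionary pronunciation files used below.
const GSTATIC_BASE = "https://ssl.gstatic.com/dictionary/static/sounds/de/0/";

// Hypothetical helper: split input text into words and map each word
// to its expected pronunciation URL at the resource.
function buildWordUrls(text, base = GSTATIC_BASE, mediatype = ".mp3") {
  const words = text.trim().match(/\w+/g) || [];
  return words.map(word => base + word + mediatype);
}
```

For example, buildWordUrls("what it does my ninja?") yields five URLs, ending in what.mp3 through ninja.mp3.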

window.addEventListener("load", () => {

  const textarea = document.querySelector("textarea");

  const audio = document.createElement("audio");

  const mimecodec = "audio/webm; codecs=opus";

  audio.controls = "controls";

  document.body.appendChild(audio);

  audio.addEventListener("canplay", e => {
    audio.play();
  });

  let words = textarea.value.trim().match(/\w+/g);

  const url = "https://ssl.gstatic.com/dictionary/static/sounds/de/0/";

  const mediatype = ".mp3";

  Promise.all(
    words.map(word =>
      fetch(`https://query.yahooapis.com/v1/public/yql?q=select * from data.uri where url="${url}${word}${mediatype}"&format=json&callback=`)
      .then(response => response.json())
      .then(({query: {results: {url}}}) =>
        fetch(url).then(response => response.blob())
        .then(blob => blob)
      )
    )
  )
  .then(blobs => {
    // const a = document.createElement("a");
    audio.src = URL.createObjectURL(new Blob(blobs, {
                  type: mimecodec
                }));
    // a.download = words.join("-") + ".webm";
    // a.click()
  })
  .catch(err => console.log(err));
});

<textarea>what it does my ninja?</textarea>

Resources at Wikimedia Commons Category:Public domain are not necessarily served from the same directory; see How to retrieve Wiktionary word content?, wikionary API - meaning of words.

If the precise location of the resource is known, the audio can be requested, though the URL may include prefixes other than the word itself.

fetch("https://upload.wikimedia.org/wikipedia/commons/c/c5/En-uk-hello-1.ogg")
.then(response => response.blob())
.then(blob => new Audio(URL.createObjectURL(blob)).play());

Not entirely sure how to use the Wikipedia API (How to get Wikipedia content using Wikipedia's API?, Is there a clean wikipedia API just for retrieve content summary?) to get only the audio file. The JSON response would need to be parsed for text ending in .ogg, then a second request would need to be made for the resource itself.
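The parsing step can be sketched as a small pure helper; extractOggPath is a hypothetical name, and the regular expression mirrors the one used in the snippet below, so it can be tested against a stand-in response string.

```javascript
// Hypothetical helper: scan a chunk of API response text for the last
// Wikimedia Commons .ogg path it contains, or return null if none is found.
function extractOggPath(text) {
  const matches = text.match(
    /\/\/upload\.wikimedia\.org\/wikipedia\/commons\/[\d-\/]+[\w-]+\.ogg/g
  );
  return matches ? matches.pop() : null;
}
```

A second request for location.protocol plus the returned path would then fetch the audio itself.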

fetch("https://en.wiktionary.org/w/api.php?action=parse&format=json&prop=text&callback=?&page=hello")
.then(response => response.text())
.then(data => {
  new Audio(location.protocol + data.match(/\/\/upload\.wikimedia\.org\/wikipedia\/commons\/[\d-\/]+[\w-]+\.ogg/g).pop()).play()
})
// "//upload.wikimedia.org/wikipedia/commons/5/52/En-us-hello.ogg"

which logs

Fetch API cannot load https://en.wiktionary.org/w/api.php?action=parse&format=json&prop=text&callback=?&page=hello. No 'Access-Control-Allow-Origin' header is present on the requested resource

when the request is not made from the same origin. We would need to try to use YQL again, though it is not certain how to formulate the query to avoid errors.
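One way to formulate the query, following the same data.uri pattern used earlier for the gstatic resource, is sketched below. buildYqlUrl is a hypothetical helper, and whether the YQL service actually accepts the Wiktionary URL, or is still available at all, has not been verified here.

```javascript
// Hypothetical helper: wrap a target URL in a YQL data.uri query,
// mirroring the pattern used earlier for the gstatic pronunciation files.
// The YQL endpoint itself may reject or rate-limit the request.
function buildYqlUrl(targetUrl) {
  const query = `select * from data.uri where url="${targetUrl}"`;
  return "https://query.yahooapis.com/v1/public/yql?q=" +
    encodeURIComponent(query) + "&format=json&callback=";
}
```

The resulting URL could then be passed to fetch() in place of the direct cross-origin request.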

The third approach uses a slightly modified version of meSpeak.js to generate the audio without making an external request. The modification was to create a proper callback for the .loadConfig() method.

fetch("https://gist.githubusercontent.com/guest271314/f48ee0658bc9b948766c67126ba9104c/raw/958dd72d317a6087df6b7297d4fee91173e0844d/mespeak.js")
  .then(response => response.text())
  .then(text => {
    const script = document.createElement("script");
    script.textContent = text;
    document.body.appendChild(script);

    return Promise.all([
      new Promise(resolve => {
        meSpeak.loadConfig("https://gist.githubusercontent.com/guest271314/8421b50dfa0e5e7e5012da132567776a/raw/501fece4fd1fbb4e73f3f0dc133b64be86dae068/mespeak_config.json", resolve)
      }),
      new Promise(resolve => {
        meSpeak.loadVoice("https://gist.githubusercontent.com/guest271314/fa0650d0e0159ac96b21beaf60766bcc/raw/82414d646a7a7ef11bb04ddffe4091f78ef121d3/en.json", resolve)
      })
    ])
  })
  .then(() => {
    // takes approximately 14 seconds to get here
    console.log(meSpeak.isConfigLoaded());
    meSpeak.speak("what it do my ninja", {
      amplitude: 100,
      pitch: 5,
      speed: 150,
      wordgap: 1,
      variant: "m7"
    });
})
.catch(err => console.log(err));

One caveat of the above approach is that it takes approximately fourteen and a half seconds for the three files to load before the audio is played back. However, it avoids external requests.

It would be a positive to do either or both of the following: 1) create a FOSS, developer-maintained database or directory of sounds for both common and uncommon words; 2) develop meSpeak.js further to reduce the load time of the three necessary files, and use Promise-based approaches to provide notifications of the progress of file loading and the readiness of the application.
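The Promise-based progress notification idea in (2) can be sketched generically. loadAllWithProgress and its loaders argument are hypothetical; each loader is assumed to follow the callback convention of the modified .loadConfig() and .loadVoice() shown above.

```javascript
// Hypothetical sketch: run several callback-style loaders in parallel,
// invoking onProgress as each one completes and resolving when all are done.
function loadAllWithProgress(loaders, onProgress) {
  let done = 0;
  return Promise.all(loaders.map(load =>
    new Promise(resolve => load(() => {
      done += 1;
      onProgress(done, loaders.length);
      resolve();
    }))
  ));
}

// Intended usage with meSpeak (not executed here):
// loadAllWithProgress([
//   cb => meSpeak.loadConfig(configUrl, cb),
//   cb => meSpeak.loadVoice(voiceUrl, cb)
// ], (done, total) => console.log(`${done}/${total} files ready`))
//   .then(() => meSpeak.speak("what it do my ninja"));
```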

In this user's estimation, it would be a useful resource if developers themselves created and contributed to an online database of files which responds with an audio file for a specific word. Not entirely sure whether GitHub is the appropriate venue to host audio files; the possible options will have to be considered if interest in such a project is shown.
