How to create audio from text, or convert text to audio, in the Chromium browser?
Question
While trying to determine a solution to "How to use Web Speech API at chromium?", found that
var voices = window.speechSynthesis.getVoices();
returns an empty array for the voices identifier.
Not certain whether the lack of support in the Chromium browser is related to this issue: "Not OK, Google: Chromium voice extension pulled after spying concerns".
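One thing worth ruling out before looking for workarounds: in Chrome the voice list is populated asynchronously, so a synchronous call to getVoices() right after page load can return an empty array even when voices exist. The sketch below waits for the voiceschanged event and gives up after a timeout. The helper name loadVoices, its synth parameter, and the 3000 ms default are assumptions made for illustration; if the array is still empty after the timeout, the browser build most likely has no speech engine and one of the workarounds is needed.

```javascript
// Resolve with the list of voices, waiting for "voiceschanged" if the list
// is still empty. `loadVoices`, `synth` and `timeoutMs` are hypothetical
// names; `synth` defaults to window.speechSynthesis in a browser.
function loadVoices(synth = window.speechSynthesis, timeoutMs = 3000) {
  return new Promise(resolve => {
    const initial = synth.getVoices();
    if (initial.length > 0) {
      resolve(initial);
      return;
    }
    let timer;
    const onChange = () => {
      clearTimeout(timer);
      synth.removeEventListener("voiceschanged", onChange);
      resolve(synth.getVoices());
    };
    synth.addEventListener("voiceschanged", onChange);
    // Give up after the timeout; the resolved array may still be empty
    timer = setTimeout(() => {
      synth.removeEventListener("voiceschanged", onChange);
      resolve(synth.getVoices());
    }, timeoutMs);
  });
}

// Browser usage:
// loadVoices().then(voices => console.log(voices.length));
```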
Questions:
1) Are there any workarounds that can implement the requirement of creating audio from text, or converting text to audio, in the Chromium browser?
2) How can we, the developer community, create an open source database of audio files reflecting both common and uncommon words, served with appropriate CORS headers?
Recommended answer
There are several possible workarounds that provide the ability to create audio from text; two of them require requesting an external resource, and the other uses meSpeak.js by @masswerk.
One option is the approach described at "Download the Audio Pronunciation of Words from Google", which suffers from not being able to pre-determine which words actually exist as a file at the resource without writing a shell script or performing a HEAD request to check whether a network error occurs. For example, the word "do" is not available at the resource used below.
window.addEventListener("load", () => {
  const textarea = document.querySelector("textarea");
  const audio = document.createElement("audio");
  const mimecodec = "audio/webm; codecs=opus";
  audio.controls = "controls";
  document.body.appendChild(audio);
  // start playback as soon as enough data has been buffered
  audio.addEventListener("canplay", e => {
    audio.play();
  });
  // split the text into words; fall back to an empty list if none match
  let words = textarea.value.trim().match(/\w+/g) || [];
  const url = "https://ssl.gstatic.com/dictionary/static/sounds/de/0/";
  const mediatype = ".mp3";
  // Each word is requested through YQL, which returns the file as a data URI.
  // Note: Yahoo's public YQL endpoint has since been retired, so this
  // request is expected to fail today.
  Promise.all(
    words.map(word =>
      fetch(`https://query.yahooapis.com/v1/public/yql?q=select * from data.uri where url="${url}${word}${mediatype}"&format=json&callback=`)
        .then(response => response.json())
        .then(({query: {results: {url}}}) =>
          // convert the data URI into a Blob
          fetch(url).then(response => response.blob())
        )
    )
  )
    .then(blobs => {
      // concatenate the per-word Blobs into a single audio resource
      // const a = document.createElement("a");
      audio.src = URL.createObjectURL(new Blob(blobs, {
        type: mimecodec
      }));
      // a.download = words.join("-") + ".webm";
      // a.click()
    })
    .catch(err => console.log(err));
});
<textarea>what it does my ninja?</textarea>
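The HEAD-request check mentioned above can be sketched roughly as follows. This is only an illustration: findAvailableWords, baseUrl, extension and the injectable fetchImpl parameter are hypothetical names, and a cross-origin HEAD request can itself be rejected by the browser when the server sends no CORS headers, in which case the word is simply reported as unavailable.

```javascript
// Report, for each word, whether an audio file appears to exist at the
// resource. All names here are illustrative assumptions; `fetchImpl` can be
// replaced with a stub for testing.
async function findAvailableWords(words, baseUrl, extension = ".mp3", fetchImpl = globalThis.fetch) {
  return Promise.all(
    words.map(async word => {
      const target = `${baseUrl}${encodeURIComponent(word)}${extension}`;
      try {
        const response = await fetchImpl(target, { method: "HEAD" });
        // response.ok is true for 2xx status codes only
        return { word, available: response.ok };
      } catch (err) {
        // Network or CORS failure: treat the word as unavailable
        return { word, available: false };
      }
    })
  );
}

// Browser usage:
// findAvailableWords(["what", "do"], "https://ssl.gstatic.com/dictionary/static/sounds/de/0/")
//   .then(result => console.log(result));
```

Filtering the word list with this result before building the Promise.all() chain would keep a single missing word from rejecting the whole request.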
Resources at Wikimedia Commons Category:Public domain are not necessarily served from the same directory; see "How to retrieve Wiktionary word content?" and "Wiktionary API - meaning of words".
If the precise location of the resource is known, the audio can be requested, though the URL may include prefixes other than the word itself.
fetch("https://upload.wikimedia.org/wikipedia/commons/c/c5/En-uk-hello-1.ogg")
  .then(response => response.blob())
  .then(blob => new Audio(URL.createObjectURL(blob)).play());
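When the file name is known but the directory prefix (the c/c5/ part above) is not, one possibility is the Special:FilePath page on Wikimedia Commons, which redirects to the stored file. The helper below is a minimal sketch; commonsFileUrl is a hypothetical name, and it still assumes the exact file name, including its language prefix and suffix, is known in advance.

```javascript
// Build a Wikimedia Commons URL that redirects to the file, so the hashed
// directory prefix does not need to be known. `commonsFileUrl` is a
// hypothetical helper name.
function commonsFileUrl(fileName) {
  return "https://commons.wikimedia.org/wiki/Special:FilePath/" + encodeURIComponent(fileName);
}

// Browser usage:
// new Audio(commonsFileUrl("En-uk-hello-1.ogg")).play();
```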
Not entirely sure how to use the Wikipedia API (see "How to get Wikipedia content using Wikipedia's API?" and "Is there a clean wikipedia API just for retrieve content summary?") to get only the audio file. The JSON response would need to be parsed for text ending in .ogg, then a second request would need to be made for the resource itself.
fetch("https://en.wiktionary.org/w/api.php?action=parse&format=json&prop=text&callback=?&page=hello")
  .then(response => response.text())
  .then(data => {
    new Audio(location.protocol + data.match(/\/\/upload\.wikimedia\.org\/wikipedia\/commons\/[\d-/]+[\w-]+\.ogg/).pop()).play()
  })
// "//upload.wikimedia.org/wikipedia/commons/5/52/En-us-hello.ogg\"
which logs
Fetch API cannot load https://en.wiktionary.org/w/api.php?action=parse&format=json&prop=text&callback=?&page=hello. No 'Access-Control-Allow-Origin' header is present on the requested resource
when not requested from the same origin. We would need to try YQL again, though not certain how to formulate the query to avoid errors.
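A possible alternative to YQL for the CORS error above: the MediaWiki API accepts an origin=* query parameter for anonymous requests, which makes it send the Access-Control-Allow-Origin header. The sketch below is untested against the live site and assumes the rendered page HTML still contains a direct link to the .ogg file; extractOggUrl and fetchPronunciationUrl are hypothetical helper names.

```javascript
// Matches a direct Commons file URL such as
// //upload.wikimedia.org/wikipedia/commons/5/52/En-us-hello.ogg
const OGG_PATTERN = /\/\/upload\.wikimedia\.org\/wikipedia\/commons\/[0-9a-f]\/[0-9a-f]{2}\/[^"\/\s]+\.ogg/;

// Return the first .ogg URL found in the text, or null if there is none
function extractOggUrl(text, protocol = "https:") {
  const match = text.match(OGG_PATTERN);
  return match ? protocol + match[0] : null;
}

// Request the rendered page for a word; origin=* enables anonymous CORS
async function fetchPronunciationUrl(word) {
  const api = "https://en.wiktionary.org/w/api.php?action=parse&format=json&prop=text&origin=*&page="
    + encodeURIComponent(word);
  const response = await fetch(api);
  const data = await response.json();
  // With format=json the rendered HTML is expected under parse.text["*"]
  return data.parse ? extractOggUrl(data.parse.text["*"]) : null;
}

// Browser usage:
// fetchPronunciationUrl("hello").then(src => src && new Audio(src).play());
```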
The third approach uses a slightly modified version of meSpeak.js to generate the audio without making an external request. The modification was to create a proper callback for the .loadConfig() method.
fetch("https://gist.githubusercontent.com/guest271314/f48ee0658bc9b948766c67126ba9104c/raw/958dd72d317a6087df6b7297d4fee91173e0844d/mespeak.js")
  .then(response => response.text())
  .then(text => {
    // inject the library source into the page
    const script = document.createElement("script");
    script.textContent = text;
    document.body.appendChild(script);
    // load the configuration and the voice in parallel
    return Promise.all([
      new Promise(resolve => {
        meSpeak.loadConfig("https://gist.githubusercontent.com/guest271314/8421b50dfa0e5e7e5012da132567776a/raw/501fece4fd1fbb4e73f3f0dc133b64be86dae068/mespeak_config.json", resolve)
      }),
      new Promise(resolve => {
        meSpeak.loadVoice("https://gist.githubusercontent.com/guest271314/fa0650d0e0159ac96b21beaf60766bcc/raw/82414d646a7a7ef11bb04ddffe4091f78ef121d3/en.json", resolve)
      })
    ])
  })
  .then(() => {
    // takes approximately 14 seconds to get here
    console.log(meSpeak.isConfigLoaded());
    meSpeak.speak("what it do my ninja", {
      amplitude: 100,
      pitch: 5,
      speed: 150,
      wordgap: 1,
      variant: "m7"
    });
  })
  .catch(err => console.log(err));
One caveat of the above approach is that it takes approximately 14 and a half seconds for the three files to load before the audio is played back. It does, however, avoid external requests.
It would be a positive to do either or both of the following: 1) create a FOSS, developer-maintained database or directory of sounds for both common and uncommon words; 2) further develop meSpeak.js to reduce the load time of the three necessary files, and use Promise-based approaches to provide notifications of the progress of the file loading and the readiness of the application.
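As a rough idea of what Promise-based progress notification could look like, the sketch below wraps a list of promises and calls a callback each time one of them settles successfully. trackProgress and onProgress are hypothetical names and are not part of meSpeak.js.

```javascript
// Wrap a list of promises (or plain values) and report how many have
// completed so far. `trackProgress` and `onProgress` are hypothetical names.
function trackProgress(promises, onProgress) {
  let done = 0;
  const total = promises.length;
  return Promise.all(
    promises.map(p =>
      Promise.resolve(p).then(value => {
        done += 1;
        onProgress(done, total);
        return value;
      })
    )
  );
}

// Browser usage with the loaders from the answer (configUrl and voiceUrl
// stand in for the gist URLs):
// trackProgress([
//   new Promise(resolve => meSpeak.loadConfig(configUrl, resolve)),
//   new Promise(resolve => meSpeak.loadVoice(voiceUrl, resolve))
// ], (done, total) => console.log(`${done}/${total} files loaded`))
//   .then(() => console.log("ready"));
```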
In this user's estimation, it would be a useful resource if developers themselves created and contributed to an online database of files which responded with an audio file of the specific word. Not entirely sure whether GitHub is the appropriate venue to host audio files. Will have to consider the possible options if interest in such a project is shown.