How do I connect the browser's microphone in my Flask app?

Question

I am using the speech_recognition module to identify a search query by voice and then open a Google Chrome page showing the results for the query. Basically, it's a replacement for Google voice search, but it's launched from the terminal. Now I want to turn it into a web app. I created the Flask app:

-search (directory)

-search.py (opens a tab using terminal directly/works independently)

-app.py (main flask app)

-static(directory)

-templates (directory)

But since the app is hosted on a server, search.py takes its input from the server's microphone (in this case my PC's mic, but on AWS it won't work). How do I take input from the client's browser and use it in search.py? Should I delete this file and put the logic directly in my main app? What is the most effective way to implement this functionality?

Here is my search.py script, in case it helps. It works through the terminal.

import subprocess

import speech_recognition as sr

browser_exe_path = "..."

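r = sr.Recognizer()
with sr.Microphone() as source:
    print("Listening!")
    audio = r.listen(source)  # records from the microphone of the machine running the script

    try:
        s_name = r.recognize_google(audio)  # sends the audio to Google's recognizer
        """
        Code to open browser and search the query
        """
    except (sr.UnknownValueError, sr.RequestError):
        # The speech was unintelligible or the recognition request failed.
        print("Error!")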

Recommended answer

These two would probably be the best ways:

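  • Make your own module/package out of your speech recognition tool and import it into your Flask app.
  • Integrate the functionality itself into the app.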

If you plan on reusing the speech recognition elsewhere, it is a good idea to keep it separate from the web app. On the other hand, you can customise it much more if you integrate it with, for example, the view functions of your application. Also, you should probably put all your search.py logic in one function or class, so that you can call it. Otherwise, if you import it as it is now, it will run immediately.
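
As a rough illustration of that last point, search.py could be reorganised along these lines. This is only a sketch: the function names transcribe and listen_and_transcribe are made up for the example, and the browser-opening code from the question is left out.

```python
"""A sketch of search.py with its logic wrapped in functions.

Importing this module defines the functions but records nothing.
The function names are hypothetical, chosen for this example only.
"""


def transcribe(recognizer, audio):
    """Return the recognized text, or None if recognition fails.

    `recognizer` is expected to behave like speech_recognition.Recognizer.
    The except clause is kept broad so this helper has no import-time
    dependency; in practice it would catch sr.UnknownValueError and
    sr.RequestError.
    """
    try:
        return recognizer.recognize_google(audio)
    except Exception:
        return None


def listen_and_transcribe():
    """Terminal use: record from the local microphone, then transcribe."""
    import speech_recognition as sr  # third-party, imported only when needed

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening!")
        audio = recognizer.listen(source)
    return transcribe(recognizer, audio)
```

With this layout, app.py can do `import search` and call `search.transcribe(...)` without triggering a recording. To keep the terminal behaviour, an `if __name__ == "__main__":` block at the bottom of the file can call listen_and_transcribe().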

Either way, you need a speech structure that looks something like this:

  1. The user submits some speech, either live, recorded, or as a file. We'll call this speech file speech.wav (or any other file type, your choice)
  2. speech.wav is read and parsed by your speech recognition tool. It might return a list of words, or maybe just a string. We'll call this output.
  3. output is returned to the webpage and rendered as something for the user to read.
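
On the Python side, steps 2 and 3 might look roughly like the sketch below. It assumes the speech_recognition package from the question is installed and that the file is in a format its AudioFile class can read (WAV, AIFF or FLAC). The names transcribe_file and render_output are invented for the example.

```python
def transcribe_file(path, recognizer=None):
    """Step 2: read an audio file such as speech.wav and return its text.

    Returns None when the speech cannot be recognized.
    """
    import speech_recognition as sr  # third-party, imported only when needed

    recognizer = recognizer or sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_google(audio)
    except (sr.UnknownValueError, sr.RequestError):
        return None


def render_output(text):
    """Step 3: turn the recognition result into a message for the page."""
    if text is None:
        return "Sorry, the audio could not be understood."
    return text
```

A view function could then return `render_output(transcribe_file("speech.wav"))` to the page, where the file name is just the placeholder used in the steps above.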

I suggest starting with a form submission; once that works, you can try live speech recognition with AJAX. Start basic and just ask the user to add an audio file or record one. The following markup will open the file browser on desktop, or prompt the user to record on iOS or Android.

  <input name="audio-recording" type="file" accept="audio/*" id="audio-recording" capture>
  <label for="audio-recording">Add Audio</label>

  <p id="output"></p>

So once they've chosen a file, you need to access it. You may want to customise it, but here is a basic script that picks up the audio file selected above. Credit for this script goes to Google Developers.

<script>
  const recorder = document.getElementById('audio-recording');

  recorder.addEventListener('change', function(e) {
    const file = e.target.files[0];
    const url = URL.createObjectURL(file);
    // Do something with the audio file.
    
  });
</script>

Where it says // Do something with the audio file, one option is to make an AJAX GET request that returns the recognized sentence. This is where it gets tricky, because a GET request passes information to Flask as arguments, not as an audio file. Since the script already stores the file's location in the constant url, we can pass that as the argument (note that this is a blob: URL that only the browser which created it can resolve, so the server ultimately needs the file itself uploaded), for example:

from flask import Flask, request, jsonify
import search  # this is your own search.py that you mentioned in your question.

app = Flask(__name__)  # in app.py this is your existing Flask app object

@app.route("/process_audio")
def process_audio():
    url = request.args.get("url")  # the "url" query-string argument sent by the client
    text = search.a_function(url)  # returns the text from the audio, which you've done, so I've omitted code
    if text is not None:
        return jsonify(result="success", text=text)
    else:
        return jsonify(result="fail")
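
The route above receives only a URL string. Since the server needs the audio bytes themselves, one variant is to have the browser POST the file (for instance as a FormData body) and read it from request.files. Below is a minimal sketch of that variant. It assumes the upload field is named audio-recording, like the input's name attribute; create_app and the transcribe callable are hypothetical names, not part of the original answer.

```python
import io


def create_app(transcribe):
    """Build a Flask app whose /process_audio route accepts an uploaded file.

    `transcribe` is any callable that takes a binary file-like object and
    returns the recognized text, or None on failure (hypothetical; it could
    be a thin wrapper around the logic in search.py).
    """
    from flask import Flask, jsonify, request  # third-party

    app = Flask(__name__)

    @app.route("/process_audio", methods=["POST"])
    def process_audio():
        # Field name assumed to match the <input name="audio-recording">.
        upload = request.files.get("audio-recording")
        if upload is None:
            return jsonify(result="fail", error="no file uploaded"), 400
        # Copy into a seekable in-memory buffer before handing it on.
        text = transcribe(io.BytesIO(upload.read()))
        if text is None:
            return jsonify(result="fail")
        return jsonify(result="success", text=text)

    return app
```

The JSON shape is the same as in the GET version, so the client-side code that displays the result would not need to change; only the request that sends the audio would.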

This'll return data in something called JSON format, which is like the bridge between client side js and server side python. It might look something like this:

{
 "result":"success",
 "text":"This is a test voice recording"
}

Then, you need to have some jQuery (or any other js library, but jQuery is nice and easy) to manage the AJAX call:

<script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<script type="text/javascript">
  const recorder = document.getElementById('audio-recording');

  recorder.addEventListener('change', function(e) {
    const file = e.target.files[0];
    const url = URL.createObjectURL(file);
    // Ask the Flask route for the transcription and show it in #output.
    $.getJSON('/process_audio', {
      url: url
    }, function(data) {
      $("#output").text(data.text);
    });
  });
</script>

So that should send a GET request for some JSON to the URL "/process_audio", which will return what we saw earlier, and then write the "text" field of the JSON into the element matched by the "#output" selector.

There may be some debugging needed, but that seems to do the trick.
