How do I connect browser's microphone in my flask app?

Question

I am using speech_recognition module to identify a search query through voice and then open a google chrome page showing the result for the query. Basically, it's a replacement of the google voice search but it's initiated through the terminal. But I want to make this into a web-app. I created the flask app:

-search (directory)

-search.py (opens a tab using terminal directly/works independently)

-app.py (main flask app)

-static(directory)

-templates (directory)

But since the app is hosted on a server, my search.py takes input from the server's microphone (right now that is my PC's mic, but on AWS it won't work). How do I take input from the client's browser and use it in search.py? Should I delete this file and put the logic directly in my main app? What is the most effective way to implement this functionality?

Here is my search.py script if anyone wants to know: It works through the terminal.

import subprocess

import speech_recognition as sr

browser_exe_path = "..."

r = sr.Recognizer()
with sr.Microphone() as source:
    print("Listening!")
    audio = r.listen(source)

try:
    s_name = r.recognize_google(audio)
    # Code to open browser and search the query
except (sr.UnknownValueError, sr.RequestError):
    # Speech was unintelligible, or the recognition service could not be reached
    print("Error!")

Recommended answer

These two would probably be the best ways:

  • Make your own module/package out of the speech recognition tool and import it into the flask app
  • Integrate the functionality into the app itself.

If you plan on using it again, it might be a good idea to keep the speech recognition separate from the web app, because then you can use it again. But you can customise it much more if you integrate it with, for example, the view functions for your application. Also, you should probably put all your search.py logic in one function or class, so that you can call it. Otherwise, if you import it as it is now, it will immediately run.
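
As a rough sketch of what "put all your search.py logic in one function" could look like: the function names below (build_search_url, transcribe_file) and the Google search URL pattern are illustrative assumptions, not part of the original script. The point is only that importing the module defines functions and runs nothing.

```python
# search.py, restructured so that importing it has no side effects.
import urllib.parse


def build_search_url(query):
    """Return a Google search URL for the recognised text (assumed URL pattern)."""
    return "https://www.google.com/search?q=" + urllib.parse.quote_plus(query)


def transcribe_file(path):
    """Return the text recognised in a WAV/AIFF/FLAC file at `path`."""
    # Imported inside the function so that merely importing search.py
    # stays cheap and does not touch any audio backend.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file
    return recognizer.recognize_google(audio)
```

With this shape, app.py can do `import search` and call `search.transcribe_file(...)` only when a request actually arrives.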

Either way, you need a speech structure that looks something like this:

  1. The user submits some speech, either live, recorded, or as a file. We'll call this speech file speech.wav (or any other file type, your choice)
  2. speech.wav is read and parsed by your speech recognition tool. It might return a list of words, or maybe just a string. We'll call this output.
  3. output is returned to the webpage and rendered as something for the user to read.
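
Step 2 only works if the recognition tool can actually parse what the user submitted. A minimal sketch, assuming you settle on WAV for speech.wav: the helper name wav_duration_seconds is hypothetical, and it uses only Python's standard wave module to check that uploaded bytes really are a WAV file before handing them on.

```python
import io
import wave


def wav_duration_seconds(data):
    """Return the duration of WAV bytes in seconds, or None if they are not valid WAV."""
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            return wav.getnframes() / wav.getframerate()
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, or the data is truncated
        return None
```

A None result is a reasonable point to reject the upload and ask the user for a different file.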

I suggest starting with a form submission, and if you can get that to work, you can try live speech recognition with AJAX. Start basic and just ask the user to add an audio file or record one. The following markup will open up the file browser if on desktop, or get the user to record if on iOS or Android.

  <input name="audio-recording" type="file" accept="audio/*" id="audio-recording" capture>
  <label for="audio-recording">Add Audio</label>

  <p id="output"></p>

So once they've got a file there you need to access it. You may want to customise it, but here is a basic script which will take control of the above audio. Credit for this script goes to google developers.

<script>
  const recorder = document.getElementById('audio-recording');

  recorder.addEventListener('change', function(e) {
    const file = e.target.files[0];
    const url = URL.createObjectURL(file);
    // Do something with the audio file.
    
  });
</script>

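Where it says // Do something with the audio file, it might be a cool idea to make an AJAX GET request, which will return the sentence. But this is where it gets really tricky, because a GET request gives the information to flask in query arguments, not as an audio file. Because we've stored the place where the file exists in the constant url in our script, we can pass that as the argument. Be aware that URL.createObjectURL produces a blob: URL that only resolves inside the user's browser, so the server can read the audio only if the file itself is also uploaded (for example in a POST request). The route could look like this: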

from flask import Flask, request, jsonify
import search  # this is your own search.py that you mentioned in your question.

app = Flask(__name__)

@app.route("/process_audio")
def process_audio():
    url = request.args.get("url")
    # a_function returns the text from the audio, which you've done, so I've omitted code
    text = search.a_function(url)
    if text is not None:
        return jsonify(result="success", text=text)
    else:
        return jsonify(result="fail")

This'll return data in something called JSON format, which is like the bridge between client side js and server side python. It might look something like this:

{
 "result":"success",
 "text":"This is a test voice recording"
}

Then, you need to have some jQuery (or any other js library, but jQuery is nice and easy) to manage the AJAX call:

<script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<script type="text/javascript">
  const recorder = document.getElementById('audio-recording');

  recorder.addEventListener('change', function(e) {
    const file = e.target.files[0];
    const url = URL.createObjectURL(file);
    // Ask the flask route for the recognised text and show it in #output.
    $.getJSON('/process_audio', {
      url: url
    }, function(data) {
      $("#output").text(data.text);
    });
  });
</script>

Apologies for any bracketing errors there. So that should send a GET request for some JSON to the URL "/process_audio", which will return what we saw earlier, and then it will write the "text" field of the JSON into the element matched by the "#output" selector.

There may be some debugging needed, but that seems to do the trick.
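
For the plain form submission suggested as the starting point, the server side could look roughly like the sketch below. It assumes the file input is wrapped in a form that posts multipart/form-data to a hypothetical /upload route. The names create_app, /upload, and the transcribe parameter are illustrative and not from the original; only the field name audio-recording comes from the markup above. Uploading the file itself means the server works on the actual audio rather than on a browser-local URL.

```python
from flask import Flask, jsonify, request


def create_app(transcribe):
    """Build the app; `transcribe` is any function mapping a file object to text."""
    app = Flask(__name__)

    @app.route("/upload", methods=["POST"])
    def upload():
        # The key matches the name attribute of the <input type="file"> element
        uploaded = request.files.get("audio-recording")
        if uploaded is None or uploaded.filename == "":
            return jsonify(result="fail"), 400
        text = transcribe(uploaded.stream)
        if text is None:
            return jsonify(result="fail")
        return jsonify(result="success", text=text)

    return app
```

Passing the recognition function in as a parameter keeps the route testable without a microphone or network access; in a real app you would pass whatever function wraps your speech recognition.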
