How can I automate the form filling process for a user on my webpage with voice via their microphone?


Problem Description


I have a webpage with a Flask web form. Currently, users need to manually enter their information into the webpage. It's then appended to a table that they are redirected to once they click submit. The setup is basically: a video autoplays and asks the user questions, the user fills out their answers manually, and once they click submit, they see their answers appended to a table.

I want to reduce the clutter of the page and make it so that the user can verbally give their responses to the video questions. I've read about getUserMedia, websockets, and WebRTC, but am getting confused by them. I've looked all over here, youtube, reddit, and the like. Specifically, here, here, here, and here without much luck.

I'm thinking of a simple for loop with a speech recognizer, with the different variables in a dict, and then passing the data as is, but I'm not sure how to connect that microphone action with the frontend in particular. Isn't the frontend where all of the data resides, so we'd need an HTTP request to obtain and analyze it? Here's my code:
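Roughly, what I'm picturing on the frontend: speech is recognized in the browser, and the transcript is POSTed to Flask just like the existing form submit. A sketch of that wiring (untested against my app; the field names match what my Flask routes read with `request.values.get()`, and the `/care` endpoint is the form's existing action — the rest is guesswork):

```javascript
// Build a URL-encoded body from the three answers, using the same
// field names the Flask routes already read with request.values.get().
function buildFormBody(answers) {
  return new URLSearchParams(answers).toString();
}

// Hypothetical wiring (browser-only): after recognition finishes,
// POST the transcripts to the existing /care route.
function sendAnswers(answers) {
  return fetch("/care", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: buildFormBody(answers),
  });
}
```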

main.py:

from flask import render_template, Flask, request
import os
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer as SIA
import nltk
import io
import os
from nltk.corpus import stopwords
import speech_recognition as sr

app = Flask(__name__, static_folder = 'static')

# # set the stopwords to be the english version
# stop_words = set(stopwords.words("english"))
# # create the recognizer
# r = sr.Recognizer()
# # define the microphone
# mic = sr.Microphone(device_index=0)
# r.energy_threshold = 300
# # vader sentiment analyzer for analyzing the sentiment of the text
# sid = SIA()
# user = []
# location = []
# state = []
# info = [user, location, state]
# # patient.name?


@app.route("/", methods=["GET", "POST"])
def home():
    user = request.values.get('name')
    location = request.values.get('location')
    state = request.values.get('state')
    # if request.method == "POST":
        # with mic as source:
        #     holder = []
        #     for x in info:
        #         audio_data = r.listen(source)
        #         r.adjust_for_ambient_noise(source)
        #         text = r.recognize_google(audio_data, language = 'en-IN')
        #         holder.append(text.lower())
        #         if x == "state":
        #             ss = sid.polarity_scores(holder)
        #             if ss == "neg":
        #                 x.append(str("sad"))
        #             else:
        #                 x.append(str("not sad"))
        #         else:
        #             filtered_words = [words for words in holder if not words in stop_words] # this filters out the stopwords
        #             x.append(filtered_words.lower())

        # return redirect(url_for('care', user = user))

    return render_template('index.html', user = user, location=location, state=state)

@app.route("/care", methods=["POST"])
def care():
    user = request.values.get('name')
    location = request.values.get('location')
    state = request.values.get('state')
    return render_template('list.html', user = user, location=location, state=state)


if __name__ == "__main__":
    #app.run(debug=True)    
    app.run(debug=True, threaded=True)
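One thing I know is wrong in the commented-out block above: `sid.polarity_scores()` returns a dict like `{'neg': …, 'neu': …, 'pos': …, 'compound': …}`, so `ss == "neg"` never matches; the check should threshold the `compound` score. If that decision ends up moving to the browser along with the speech capture, it's a one-liner (the -0.05 cutoff follows the common VADER convention; this is a sketch, not tested behavior of this app):

```javascript
// Classify the user's "state" answer from a VADER-style compound
// score in [-1, 1]; scores at or below -0.05 count as negative.
function classifyState(compound) {
  return compound <= -0.05 ? "sad" : "not sad";
}
```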

index.html:

{% extends "base.html" %}
{% block content %}

<!---------Therapist Section--------->
    <section id="therapist">
        <div class="container" id="therapist_container">
            <script>
              window.onload = function() {};
            </script>
            <div id="button">
              <button type="button" class="btn btn-primary" id="therapist-button" data-toggle="modal" data-target="#myModal">Talk with Delphi</button>
            </div>
            
            <!-- Modal -->
            <div class="modal fade" id="myModal" tabindex="-1" role="dialog" aria-labelledby="vid1Title" aria-hidden="true">
              <div class="modal-dialog modal-dialog-centered" role="document">
                <div class="modal-content">
                  <div class="modal-body">
                    <video width="100%" id="video1">
                      <source src="./static/movie.mp4" type="video/mp4">
                    </video>
                    <form action="/care" method="POST">
                      <input type="text" name="name" placeholder="what's your name?" id="name">
                      <input type="text" name="location" placeholder="Where are you?" id="location">
                      <input type="text" name="state" placeholder="how can I help?" id="state">
                      <input id="buttonInput" class="btn btn-success form-control" type="submit" value="Send">
                    </form>
                  </div>
                </div>
              </div>
            </div>
            <script>
              $('#myModal').on('shown.bs.modal', function () {
              $('#video1')[0].play();
              })
              $('#myModal').on('hidden.bs.modal', function () {
                $('#video1')[0].pause();
              })
              video = document.getElementById('video1');
              video.addEventListener('ended',function(){       
              window.location.pathname = '/care';})

              function callback(stream) {
                  var context = new webkitAudioContext();
                  var mediaStreamSource = context.createMediaStreamSource(stream);
              }

              $(document).ready(function() {
                  navigator.webkitGetUserMedia({audio:true}, callback);
              });

            </script>
        </div>
    </section>
{% endblock content %}

list.html:

{% extends "base.html" %}
{% block content %}

<!----LIST------>
<section id="care_list">
    <div class="container" id="care_list_container">
        <h1 class="jumbotron text-center" id="care_list_title">{{ user }} Care Record</h1>
        <div class="container">
            <table class="table table-hover"> 
                <thead>
                  <tr>
                    <th scope="col">Session #</th>
                    <th scope="col">Length</th>
                    <th scope="col">Location</th>
                    <th scope="col">State</th> 
                  </tr>
                </thead>
                <tbody>
                  <tr>
                    <th scope="row">1</th>
                    <td>{{ length }}</td>
                    <td>{{ location }}</td>
                    <td>{{ state }}</td>
                  </tr>
                  <tr>
                    <th scope="row">2</th>
                    <td></td>
                    <td></td>
                    <td></td>
                  </tr>
                  <tr>
                    <th scope="row">3</th>
                    <td colspan="2"></td>
                    <td></td>
                  </tr>
                </tbody>
              </table>
        <ul class="list-group list-group-flush" id="care_list">
            <li class="list-group-item">Please email tom@vrifyhealth.com for help.</li>
        </ul>
    </div> 
</section> 
{% endblock content %}

Solution

It's easier than we think: just as we create `new Array()`, there is `new SpeechRecognition()` to create a voice-to-text converter. No external library is needed to do this. Here is the code:

            /* JS comes here */
            function SpeechRecog() {
                var output = document.getElementById("output");
                var action = document.getElementById("action");
                var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
                var recognition = new SpeechRecognition();
            
                // This runs when the speech recognition service starts
                recognition.onstart = function() {
                    action.innerHTML = "<small>listening, please speak...</small>";
                };
                
                recognition.onspeechend = function() {
                    action.innerHTML = "<small>stopped listening, hope you are done...</small>";
                    recognition.stop();
                }
              
                // This runs when the speech recognition service returns result
                recognition.onresult = function(event) {
                    var transcript = event.results[0][0].transcript;
                    var confidence = event.results[0][0].confidence;
                    output.value=transcript;
                };
              
                 // start recognition
                 recognition.start();
            }

button{
  color:white;
  background:blue;
  border:none;
  padding:10px;margin:5px;
  border-radius:1em;
}
input{
  padding:.5em;margin:.5em;
}

<p>I'm Aakash1282.<br> Are you lazy? Here is a voice writer for you.</p>
<p><button type="button" onclick="SpeechRecog()">Write By Voice</button> &nbsp; <span id="action"></span></p>
        <input type="text" id="output">

This code has some problems running inside Stack Overflow snippets, but works perfectly in local files. Here is a working CodePen: https://codepen.io/aakash1282/pen/xxqeQyM

Take it as a reference and build your form however you want.
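To hook this up to your three inputs, one approach is to run one recognition per field and step through them in order. A rough sketch (the ids `name`, `location`, `state` come from your index.html; everything else here is assumed and untested against your app):

```javascript
// Order of the form fields from the question's index.html.
var FIELDS = ["name", "location", "state"];

// Return the id of the field to fill after `current`, or null when done.
function nextField(current) {
  var i = FIELDS.indexOf(current);
  return i >= 0 && i + 1 < FIELDS.length ? FIELDS[i + 1] : null;
}

// Browser-only sketch: recognize one utterance into the input with the
// given id, then chain to the next field in the sequence.
function fillField(id) {
  var Rec = window.SpeechRecognition || window.webkitSpeechRecognition;
  var recognition = new Rec();
  recognition.onresult = function (event) {
    document.getElementById(id).value = event.results[0][0].transcript;
    var next = nextField(id);
    if (next) fillField(next);
  };
  recognition.start();
}
```

Starting the chain with `fillField("name")` (e.g. from the existing button's onclick) would fill all three inputs before the user presses Send.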
