How to use the Google Speech Recognition API in Python?
Question
Stack Overflow might not be the best place to ask this question, but I need help. I have an MP3 file and I want to use Google's speech recognition to get the text out of that file. Any ideas on where I can find documentation or examples would be appreciated.
Solution
Take a look at the Google Cloud Speech API, which enables developers to convert audio to text [...] The API recognizes over 80 languages and variants [...]
You can create a free account to get a limited number of API requests.
HOW TO:
First, install the gcloud Python module and the google-api-python-client module:
pip install --upgrade gcloud
pip install --upgrade google-api-python-client
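The pip package names differ from the names the script imports (google-api-python-client is imported as googleapiclient, for example), so it can help to confirm the modules are visible before going further. The following is only a minimal sketch: the helper name missing_modules is hypothetical, and the list of required modules is taken from the import statements of the tutorial.py script in this answer.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names used by tutorial.py (not the pip package names).
    required = ["googleapiclient", "httplib2", "oauth2client"]
    for name in missing_modules(required):
        print("missing module:", name)
```

If the script prints nothing, all three modules were found.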
Then, in the Cloud Platform Console, go to the Projects page and select an existing project or create a new one. Next, enable billing for your project, then enable the Cloud Speech API.
After enabling the Google Cloud Speech API, click the Go to Credentials button to set up your Cloud Speech API credentials.
See "Set Up a Service Account" for information on how to authorize to the Cloud Speech API service from your code.
You should end up with both a service account key file (in JSON) and a GOOGLE_APPLICATION_CREDENTIALS environment variable, which together allow you to authenticate to the Speech API.
Once that is done, download the raw audio file from Google, and also the speech-discovery_google_rest_v1.json file from Google.
Modify the previously downloaded JSON file to set your credentials key, then make sure that you have set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the full path of the .json file:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account_file.json
Also make sure that you have set the GCLOUD_PROJECT environment variable to the ID of your Google Cloud project:
export GCLOUD_PROJECT=your-project-id
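A missing or mistyped environment variable usually only shows up later as an authentication error, so a quick check up front can save time. This is a minimal sketch, not part of the official setup: the function name check_speech_environment is hypothetical, and it only verifies that the two variables above are set and that the key file exists and parses as JSON. It does not validate the credentials themselves.

```python
import json
import os


def check_speech_environment(environ=None):
    """Return a list of problems found in the credential-related settings."""
    environ = os.environ if environ is None else environ
    problems = []

    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isfile(key_path):
        problems.append("key file not found: " + key_path)
    else:
        try:
            with open(key_path, "r", encoding="utf-8") as handle:
                json.load(handle)
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError.
            problems.append("key file is not valid JSON: " + key_path)

    if not environ.get("GCLOUD_PROJECT"):
        problems.append("GCLOUD_PROJECT is not set")

    return problems


if __name__ == "__main__":
    for problem in check_speech_environment():
        print("WARNING:", problem)
```

An empty list means both variables are set and the key file is readable JSON.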
Assuming all of that is done, you can create a tutorial.py file which contains:
import argparse
import base64
import json

from googleapiclient import discovery
import httplib2
from oauth2client.client import GoogleCredentials

DISCOVERY_URL = ('https://{api}.googleapis.com/$discovery/rest?'
                 'version={apiVersion}')


def get_speech_service():
    credentials = GoogleCredentials.get_application_default().create_scoped(
        ['https://www.googleapis.com/auth/cloud-platform'])
    http = httplib2.Http()
    credentials.authorize(http)

    return discovery.build(
        'speech', 'v1beta1', http=http, discoveryServiceUrl=DISCOVERY_URL)


def main(speech_file):
    """Transcribe the given audio file.

    Args:
        speech_file: the name of the audio file.
    """
    with open(speech_file, 'rb') as speech:
        speech_content = base64.b64encode(speech.read())

    service = get_speech_service()
    service_request = service.speech().syncrecognize(
        body={
            'config': {
                'encoding': 'LINEAR16',  # raw 16-bit signed LE samples
                'sampleRate': 16000,  # 16 khz
                'languageCode': 'en-US',  # a BCP-47 language tag
            },
            'audio': {
                'content': speech_content.decode('UTF-8')
            }
        })
    response = service_request.execute()
    print(json.dumps(response))


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        'speech_file', help='Full path of audio file to be recognized')
    args = parser.parse_args()
    main(args.speech_file)
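The script prints the whole JSON response, while the question only asks for the text. Here is a minimal sketch of pulling the text out. It assumes the response is a dictionary with a "results" list, where each result holds an "alternatives" list whose entries carry a "transcript" string. The helper name extract_transcripts and the sample data are hypothetical, so check the shape against the output you actually get.

```python
def extract_transcripts(response):
    """Return the first (top-ranked) transcript of each result in a response."""
    transcripts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            transcripts.append(alternatives[0].get("transcript", ""))
    return transcripts


# Made-up response used only to illustrate the assumed shape.
sample_response = {
    "results": [
        {"alternatives": [{"transcript": "hello world",
                           "confidence": 0.98}]}
    ]
}
print(" ".join(extract_transcripts(sample_response)))
```

In tutorial.py, the same helper could be applied to the value returned by service_request.execute() instead of dumping the whole response.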
Then run:
python tutorial.py audio.raw
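The question starts from an MP3 file, while the request config above declares LINEAR16 at 16000 Hz, so the MP3 has to be converted to raw PCM first. One possible way is sketched below. It assumes the ffmpeg command-line tool is installed, which the original answer does not mention. The function names and the file names speech.mp3 and audio.raw are placeholders. Downmixing to mono is a conservative choice rather than a requirement stated in this answer.

```python
import subprocess


def build_ffmpeg_command(source_path, target_path, sample_rate=16000):
    """Build an ffmpeg command that writes headerless 16-bit LE mono PCM."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the target file if it exists
        "-i", source_path,        # input file, e.g. an MP3
        "-f", "s16le",            # raw output: signed 16-bit little-endian
        "-acodec", "pcm_s16le",   # uncompressed PCM samples
        "-ar", str(sample_rate),  # resample to match 'sampleRate' in the config
        "-ac", "1",               # downmix to a single channel
        target_path,
    ]


def convert_to_raw(source_path, target_path, sample_rate=16000):
    """Run ffmpeg; raises subprocess.CalledProcessError if conversion fails."""
    subprocess.run(
        build_ffmpeg_command(source_path, target_path, sample_rate),
        check=True)


if __name__ == "__main__":
    # Only prints the command; call convert_to_raw() to actually run ffmpeg.
    print(" ".join(build_ffmpeg_command("speech.mp3", "audio.raw")))
```

The resulting audio.raw can then be passed to tutorial.py as shown above.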