Speech to text with an AWS service using the Java API

Question

I would like to convert speech to text using an AWS service and the AWS Java SDK, but I am unable to find any API for this in the SDK. Is there a service that does this? I have used the AWS Polly service to convert text to speech with the AWS Java SDK, but not the reverse (speech to text). How could this be done?

Recommended answer

I recently managed to build a Java client. Before you invest time in this, it is important to say that, as of the date of this post, it takes approximately 1 minute to obtain the text of an audio clip that contains only a "Yes". Given that performance, I opted for the Google service.

That said, I am sharing the code. It can certainly be improved, since it was only intended as a feasibility test.

This service requires the audio to be stored in an S3 bucket. You pass the URI of the object to Transcribe and start a job; the result is then retrieved in a similar way, from a URI, in JSON format.

In this example, we wait for the job to finish and then fetch the result.

The main dependencies are:

<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-transcribe -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-transcribe</artifactId>
    <version>1.11.313</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-s3 -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-s3</artifactId>
    <version>1.11.313</version>
</dependency>

For credentials, I chose to set system properties:

static{
    System.setProperty("aws.accessKeyId", "yourAccessK");
    System.setProperty("aws.secretKey"  , "shhhhhhhhhh");
}

In the source, we create the S3 and Transcribe clients. Replace the region with the one that corresponds to your bucket.

private AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withRegion("us-east-1")
        .withClientConfiguration(new ClientConfiguration())
        .withCredentials(new DefaultAWSCredentialsProviderChain())
        .build();
private AmazonTranscribe client = AmazonTranscribeClient.builder().withRegion("us-east-1").build();

Then we upload the audio file to the bucket:

s3.putObject(BUCKET_NAME, fileName, new File(fullFileName));

BUCKET_NAME is a constant holding the name of the bucket. fileName is the object key; it does not have to be the name of the file and can be any identifier we want to use.

Once the audio is uploaded to the bucket, we create the transcription job.

    StartTranscriptionJobRequest request = new StartTranscriptionJobRequest();

    request.withLanguageCode(LanguageCode.EsUS);

    Media media = new Media();

    media.setMediaFileUri(s3.getUrl(BUCKET_NAME, fileName).toString());

    request.withMedia(media).withMediaSampleRateHertz(8000);

Review the available language options and set MediaSampleRateHertz to match your audio; the example above uses LanguageCode.EsUS and 8000 Hz.
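If you are unsure of the sample rate, it can be read from the file header instead of being hard-coded. The following is a minimal sketch using only the JDK's javax.sound.sampled API. WavSampleRate and sampleRateHertz are hypothetical names of mine, not part of the AWS SDK, and the sketch assumes a format the JDK can parse, such as the WAV used in this answer.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

/** Hypothetical helper (not part of the AWS SDK): reads an audio file's sample rate. */
final class WavSampleRate {

    private WavSampleRate() { }

    /** Returns the sample rate in Hz declared in the audio file's header. */
    static int sampleRateHertz(File audioFile)
            throws IOException, UnsupportedAudioFileException {
        // getAudioFileFormat only parses the header; it does not decode the audio data.
        AudioFormat format = AudioSystem.getAudioFileFormat(audioFile).getFormat();
        return Math.round(format.getSampleRate());
    }
}
```

The returned value could then be passed on, for example as request.withMediaSampleRateHertz(WavSampleRate.sampleRateHertz(new File(fullFileName))).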

Create a name for the job.

String transcriptionJobName = "myJob"; // consider a unique name as an id.
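Since the job name works as an id, one way to make it unique is to append a random UUID to a readable prefix. This is only a sketch: JobNames and uniqueJobName are hypothetical names, and the character set (letters, digits, '.', '_' and '-') and the length cap are my assumptions about what Transcribe accepts, so check them against the StartTranscriptionJob reference.

```java
import java.util.UUID;

/** Hypothetical helper: builds a unique transcription job name from a readable prefix. */
final class JobNames {

    // Assumed cap for the prefix so that prefix + '-' + UUID stays well below the name limit.
    private static final int MAX_PREFIX_LENGTH = 100;

    private JobNames() { }

    static String uniqueJobName(String prefix) {
        // Replace anything outside letters, digits, '.', '_' and '-' (assumed allowed set).
        String safePrefix = prefix.replaceAll("[^0-9A-Za-z._-]", "_");
        if (safePrefix.length() > MAX_PREFIX_LENGTH) {
            safePrefix = safePrefix.substring(0, MAX_PREFIX_LENGTH);
        }
        // A random UUID contains only hex digits and hyphens, so the result stays name-safe.
        return safePrefix + "-" + UUID.randomUUID();
    }
}
```

With this, the line above could become String transcriptionJobName = JobNames.uniqueJobName("myJob");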

Complete the request and start the job:

request.setTranscriptionJobName(transcriptionJobName);
request.withMediaFormat("wav");

client.startTranscriptionJob(request);

In this case we poll in a loop while waiting for the result; there are other, more efficient options.

GetTranscriptionJobRequest jobRequest = new GetTranscriptionJobRequest();
jobRequest.setTranscriptionJobName(transcriptionJobName);
TranscriptionJob transcriptionJob;
AmazonTranscription transcription = null; // stays null if the job fails

while( true ){
    transcriptionJob = client.getTranscriptionJob(jobRequest).getTranscriptionJob();
    if( transcriptionJob.getTranscriptionJobStatus().equals(TranscriptionJobStatus.COMPLETED.name()) ){
        // the job is done: fetch and parse the JSON result
        transcription = this.download( transcriptionJob.getTranscript().getTranscriptFileUri(), fileName);
        break;
    }else if( transcriptionJob.getTranscriptionJobStatus().equals(TranscriptionJobStatus.FAILED.name()) ){
        break;
    }
    // to not be so anxious: pause briefly before asking again
    synchronized ( this ) {
        try {
            this.wait(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag and stop polling
            break;
        }
    }
}
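The loop above asks for the status every 50 ms and never gives up. As a sketch of one gentler variant, the helper below polls with a delay that doubles up to a cap and stops after a timeout. StatusPoller and awaitTerminalStatus are hypothetical names, nothing here comes from the AWS SDK, and the status is supplied through a plain Supplier so the idea can be tried without an AWS account.

```java
import java.util.Set;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical polling helper; none of these names come from the AWS SDK. */
final class StatusPoller {

    private StatusPoller() { }

    /**
     * Calls statusSupplier until it returns one of terminalStatuses.
     * The pause between calls doubles after each attempt, capped at maxDelayMillis.
     */
    static String awaitTerminalStatus(Supplier<String> statusSupplier,
                                      Set<String> terminalStatuses,
                                      long initialDelayMillis,
                                      long maxDelayMillis,
                                      long timeoutMillis)
            throws InterruptedException, TimeoutException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        long delay = initialDelayMillis;
        while (true) {
            String status = statusSupplier.get();
            if (terminalStatuses.contains(status)) {
                return status;
            }
            // Give up if the next pause would take us past the deadline.
            if (System.currentTimeMillis() + delay > deadline) {
                throw new TimeoutException(
                        "Job still " + status + " after " + timeoutMillis + " ms");
            }
            Thread.sleep(delay);
            delay = Math.min(delay * 2, maxDelayMillis);
        }
    }
}
```

With the client from this answer, the supplier could be () -> client.getTranscriptionJob(jobRequest).getTranscriptionJob().getTranscriptionJobStatus(), with "COMPLETED" and "FAILED" as the terminal statuses; the delays and the timeout are values to tune for your audio length.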

transcriptionJob.getTranscript().getTranscriptFileUri() returns a URI that can be fetched with any HTTP client, such as Apache HttpClient or, as in my case, Jodd HTTP (https://jodd.org/http/).

Download:

private AmazonTranscription download( String uri, String fileName ){
    HttpResponse response = HttpRequest.get(uri).send();
    String result = response.charset("UTF-8").bodyText();
    // result is a JSON document; gson is a Gson instance declared elsewhere
    return gson.fromJson(result, AmazonTranscription.class);
}

AmazonTranscription is a class that I created to hold the JSON. Below are the classes needed to parse it; I omit the getters and setters for brevity.

public class AmazonTranscription {

    private String jobName;
    private String accountId;
    private Result results;
    private String status;
}

public class Item {

    private String start_time;
    private String end_time;
    private List<Alternative> alternatives = new ArrayList<Alternative>();
    private String type;
}

public class Result {

    private List<Transcript> transcripts = new ArrayList<Transcript>();
    private List<Item>       items       = new ArrayList<Item>();
}

public class Transcript {

    private String transcript;
}
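Item refers to an Alternative class that is not listed above. The sketch below shows what it might look like, assuming each entry under "alternatives" in the result JSON carries a "confidence" and a "content" field, and that "type" is either "pronunciation" or "punctuation". It also includes a hypothetical helper, TranscriptText.fromItems, that rebuilds readable text from the items. Item is repeated here in trimmed form with package-private fields only so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** One recognition candidate; field names assumed to mirror the result JSON. */
class Alternative {
    String confidence;
    String content;
}

/** Trimmed copy of Item, with package-private fields to keep the sketch short. */
class Item {
    String type; // assumed values: "pronunciation" or "punctuation"
    List<Alternative> alternatives = new ArrayList<>();
}

/** Hypothetical helper that rebuilds readable text from the item list. */
final class TranscriptText {

    private TranscriptText() { }

    static String fromItems(List<Item> items) {
        StringBuilder text = new StringBuilder();
        for (Item item : items) {
            if (item.alternatives.isEmpty()) {
                continue; // nothing recognized for this item
            }
            String content = item.alternatives.get(0).content; // first alternative only
            boolean punctuation = "punctuation".equals(item.type);
            // Words are separated by a space; punctuation attaches to the previous word.
            if (text.length() > 0 && !punctuation) {
                text.append(' ');
            }
            text.append(content);
        }
        return text.toString();
    }
}
```

For the common case, the full text is already available in the transcript field of the first Transcript entry; walking the items is mainly useful when you also need the per-word times or confidence.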

Just add try/catch where required.

I hope I have not overlooked anything and that this is useful. It took me some time to understand this Amazon model, and I hope to save others that time.

Sorry if there are errors in the writing; English is not my native language.
