Dialogflow: detecting intent from audio


Problem description

I'm trying to send an audio file to the Dialogflow API for intent detection. I already have an agent that works quite well, but only with text. I'm trying to add audio support, with no luck.

I'm using the Java example provided on this page:

https://cloud.google.com/dialogflow-enterprise/docs/detect-intent-audio#detect-intent-text-java

Here is my code:

public  DetectIntentResponse detectIntentAudio(String projectId, byte [] bytes, String sessionId,
                                         String languageCode)
            throws Exception {


            // Set the session name using the sessionId (UUID) and projectID (my-project-id)
            SessionName session = SessionName.of(projectId, sessionId);
            System.out.println("Session Path: " + session.toString());

            // Note: hard coding audioEncoding and sampleRateHertz for simplicity.
            // Audio encoding of the audio content sent in the query request.
            AudioEncoding audioEncoding = AudioEncoding.AUDIO_ENCODING_LINEAR_16;
            int sampleRateHertz = 16000;

            // Instructs the speech recognizer how to process the audio content.
            InputAudioConfig inputAudioConfig = InputAudioConfig.newBuilder()
                    .setAudioEncoding(audioEncoding) // audioEncoding = AudioEncoding.AUDIO_ENCODING_LINEAR_16
                    .setLanguageCode(languageCode) // languageCode = "en-US"
                    .setSampleRateHertz(sampleRateHertz) // sampleRateHertz = 16000
                    .build();

            // Build the query with the InputAudioConfig
            QueryInput queryInput = QueryInput.newBuilder().setAudioConfig(inputAudioConfig).build();

            // Read the bytes from the audio file
            byte[] inputAudio = Files.readAllBytes(Paths.get("/home/rmg/Audio/book_a_room.wav"));

            byte[] encodedAudio = Base64.encodeBase64(inputAudio);
            // Build the DetectIntentRequest
            DetectIntentRequest request = DetectIntentRequest.newBuilder()
                    .setSession("projects/"+projectId+"/agent/sessions/" + sessionId)
                    .setQueryInput(queryInput)
                    .setInputAudio(ByteString.copyFrom(encodedAudio))
                    .build();

            // Performs the detect intent request
            DetectIntentResponse response = sessionsClient.detectIntent(request);

            // Display the query result
            QueryResult queryResult = response.getQueryResult();
            System.out.println("====================");
            System.out.format("Query Text: '%s'\n", queryResult.getQueryText());
            System.out.format("Detected Intent: %s (confidence: %f)\n",
                    queryResult.getIntent().getDisplayName(), queryResult.getIntentDetectionConfidence());
            System.out.format("Fulfillment Text: '%s'\n", queryResult.getFulfillmentText());

            return response;

    }

I have tried several formats, WAV (16-bit PCM at several sample rates) and FLAC, and I have also converted the bytes to base64 in two different ways (in code and from the console), as described here:

https://dialogflow.com/docs/reference/text-to-speech

I have even tested with the .wav provided in this example, creating a new intent in my agent called "book a room" with that training phrase. It works with both text and audio from the Dialogflow console, but from my code it only works with text, not audio... and I'm sending the same wav they provide! (code above)

I always receive the same response (QueryResult):

I need a clue or something, I'm totally stuck here. There are no logs and no errors in the response, but it does not work.

Thanks

Recommended answer

I wrote to Dialogflow support and they replied with a working piece of code. It is basically the same as what I posted above; the only difference is the base64 encoding, which is not necessary.

So I removed:

byte[] encodedAudio = Base64.encodeBase64(inputAudio);

(and passed inputAudio directly to ByteString.copyFrom)

Now it is working as expected.
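To illustrate why the extra encoding step gets in the way, here is a minimal sketch using only the JDK. It is not the code Dialogflow support sent. It assumes `java.util.Base64` in place of the Apache Commons `Base64.encodeBase64` call from the question, and the class `Base64AudioCheck`, its helpers `looksLikeWav` and `fakeWav`, and the stand-in header bytes are hypothetical names and data made up for the demonstration. As far as I understand, the Java client expects the raw audio bytes in the `ByteString`, and base64 is only relevant when audio is embedded in a JSON request body. Once the bytes are base64-encoded they are ASCII text, so they no longer begin with the `RIFF` tag of a WAV file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class, not part of the Dialogflow client library.
final class Base64AudioCheck {

    // True if the buffer starts with the ASCII tag "RIFF", as WAV files do.
    static boolean looksLikeWav(byte[] data) {
        byte[] magic = "RIFF".getBytes(StandardCharsets.US_ASCII);
        return data.length >= magic.length
                && Arrays.equals(Arrays.copyOf(data, magic.length), magic);
    }

    // Builds a stand-in for a WAV file: a 12-byte RIFF/WAVE preamble
    // followed by payloadBytes zero bytes. It is not playable audio.
    static byte[] fakeWav(int payloadBytes) {
        byte[] header = "RIFF\0\0\0\0WAVE".getBytes(StandardCharsets.US_ASCII);
        return Arrays.copyOf(header, header.length + payloadBytes);
    }

    public static void main(String[] args) {
        byte[] raw = fakeWav(36);
        // Same transformation as the removed line in the question,
        // done here with the JDK encoder instead of Apache Commons.
        byte[] encoded = Base64.getEncoder().encode(raw);

        System.out.println("raw looks like WAV:     " + looksLikeWav(raw));
        System.out.println("encoded looks like WAV: " + looksLikeWav(encoded));
        System.out.println("raw length = " + raw.length
                + ", encoded length = " + encoded.length);
    }
}
```

In the question's code, the equivalent of `raw` is `inputAudio`, so the request would be built with `.setInputAudio(ByteString.copyFrom(inputAudio))`.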
