How to mux (merge) video&audio, so that the audio will loop in the output video in case it's too short in duration?


Question


I'm required to merge a video file and an audio file to a single video file, so that:

  1. The output video file will have the same duration as the input video file.
  2. The audio in the output file will be only the audio of the input audio file. If it is too short, it will loop until the end (and be cut at the end if needed). This means that once the audio has finished playing while the video hasn't, I should play it again and again until the video ends (concatenating the audio).


The technical term of this merging operation is called "muxing", as I've read.


As an example, suppose we have an input video of 10 seconds and an audio file of 4 seconds. The output video would be 10 seconds long (always the same as the input video), and the audio would play 2.5 times (the first 2 full plays cover the first 8 seconds, and then the first 2 seconds of the audio cover the rest).
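
In numbers (a tiny sketch of that example; durations are in microseconds, as MediaExtractor reports them):

    val videoDurationUs = 10_000_000L // the 10-second input video
    val audioDurationUs = 4_000_000L  // the 4-second input audio
    val fullPlays = videoDurationUs / audioDurationUs   // 2 complete plays cover 8 seconds
    val remainderUs = videoDurationUs % audioDurationUs // plus 2 more seconds of a third play
    // in total, the audio plays 10 / 4 = 2.5 times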


While I have found a solution for how to mux a video and an audio file (here), I've come across multiple issues:


  1. I can't figure out how to loop the writing of the audio content when needed. It keeps giving me an error, no matter what I try.


  2. The input files must be of specific file formats. Otherwise it might throw an exception, or (in very rare cases) worse: create a video file that has black content. Even more: sometimes a '.mkv' file (for example) can be fine, and sometimes it won't be accepted (and both play fine in a video player app).


  3. The current code handles buffers and not real durations. This means that in many cases I might stop muxing the audio even though I shouldn't, and the output video file will have shorter audio content than the original, even though the video is long enough.



What I've tried

    • I tried to make the MediaExtractor of the audio go back to its beginning each time it reached the end, by using:

              if (audioBufferInfo.size < 0) {
                  Log.d("AppLog", "reached end of audio, looping...")
                  audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
                  audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, 0)
              }
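
      Presumably this fails because MediaMuxer requires each track's presentationTimeUs to keep monotonically increasing, while seeking back to 0 makes the extractor's sampleTime jump backwards. A minimal sketch of the loop-with-offset idea (the class and names below are illustrative, not from the sample project):

          import android.media.MediaCodec
          import android.media.MediaExtractor
          import android.media.MediaFormat
          import java.nio.ByteBuffer

          // Illustrative helper: reads audio samples in a loop, rewinding at the end of the
          // track and shifting timestamps so they keep increasing for MediaMuxer.
          class LoopingAudioReader(private val extractor: MediaExtractor, trackIndex: Int) {
              // KEY_DURATION is the track length in microseconds (may be absent for some containers)
              private val trackDurationUs =
                  extractor.getTrackFormat(trackIndex).getLong(MediaFormat.KEY_DURATION)
              private var offsetUs = 0L // total length of all fully played passes so far

              /** Fills [buffer]/[info] with the next sample; returns false once [maxDurationUs] is reached. */
              fun readNext(buffer: ByteBuffer, info: MediaCodec.BufferInfo, maxDurationUs: Long): Boolean {
                  var size = extractor.readSampleData(buffer, 0)
                  if (size < 0) {
                      // one full pass finished: push the offset forward by a whole track length and rewind
                      offsetUs += trackDurationUs
                      extractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
                      size = extractor.readSampleData(buffer, 0)
                  }
                  val ptsUs = offsetUs + extractor.sampleTime
                  if (size < 0 || ptsUs >= maxDurationUs) return false // cut when the video ends
                  info.size = size
                  info.offset = 0
                  info.presentationTimeUs = ptsUs
                  info.flags = extractor.sampleFlags
                  extractor.advance()
                  return true
              }
          }

      The muxing loop would then call readNext(audioBuf, audioBufferInfo, videoDurationUs) and keep writing samples with muxer.writeSampleData(audioTrack, audioBuf, audioBufferInfo) while it returns true.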
      


    • For checking the types of the files, I tried using MediaMetadataRetriever and then checking the mime type. I think the supported ones are listed in the docs (here) as those marked with "Encoder", but I'm not sure about this. I also don't know which mime type corresponds to which format mentioned there.
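
      For reference, this is roughly how that mime probe looks (a minimal sketch; it only reports the container's mime type, and doesn't guarantee that the device can actually demux the file):

          import android.media.MediaMetadataRetriever

          // Returns the container mime type (e.g. "video/mp4"), or null if it can't be read.
          fun containerMimeType(path: String): String? {
              val retriever = MediaMetadataRetriever()
              return try {
                  retriever.setDataSource(path)
                  retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_MIMETYPE)
              } catch (e: Exception) {
                  null
              } finally {
                  retriever.release()
              }
          }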


      I also tried to re-initialize all that's related to the audio, but it didn't work either.


      Here's my current code for the muxing itself (full sample project available here):

      import android.media.MediaCodec
      import android.media.MediaExtractor
      import android.media.MediaMuxer
      import androidx.annotation.WorkerThread
      import java.io.File
      import java.nio.ByteBuffer

      object VideoAndAudioMuxer {
          // based on: https://stackoverflow.com/a/31591485/878126
          @WorkerThread
          fun joinVideoAndAudio(videoFile: File, audioFile: File, outputFile: File): Boolean {
              try {
                  // val videoMediaMetadataRetriever = MediaMetadataRetriever()
                  // videoMediaMetadataRetriever.setDataSource(videoFile.absolutePath)
                  // val videoDurationInMs =
                  //     videoMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION).toLong()
                  // val videoMimeType =
                  //     videoMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_MIMETYPE)
                  // val audioMediaMetadataRetriever = MediaMetadataRetriever()
                  // audioMediaMetadataRetriever.setDataSource(audioFile.absolutePath)
                  // val audioDurationInMs =
                  //     audioMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION).toLong()
                  // val audioMimeType =
                  //     audioMediaMetadataRetriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_MIMETYPE)
                  // Log.d(
                  //     "AppLog",
                  //     "videoDuration:$videoDurationInMs audioDuration:$audioDurationInMs videoMimeType:$videoMimeType audioMimeType:$audioMimeType"
                  // )
                  // videoMediaMetadataRetriever.release()
                  // audioMediaMetadataRetriever.release()
                  outputFile.delete()
                  outputFile.createNewFile()
                  val muxer = MediaMuxer(outputFile.absolutePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)
                  val sampleSize = 256 * 1024
                  // video
                  val videoExtractor = MediaExtractor()
                  videoExtractor.setDataSource(videoFile.absolutePath)
                  videoExtractor.selectTrack(0)
                  videoExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
                  val videoFormat = videoExtractor.getTrackFormat(0)
                  val videoTrack = muxer.addTrack(videoFormat)
                  val videoBuf = ByteBuffer.allocate(sampleSize)
                  val videoBufferInfo = MediaCodec.BufferInfo()
                  // Log.d("AppLog", "Video Format $videoFormat")
                  // audio
                  val audioExtractor = MediaExtractor()
                  audioExtractor.setDataSource(audioFile.absolutePath)
                  audioExtractor.selectTrack(0)
                  audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
                  val audioFormat = audioExtractor.getTrackFormat(0)
                  val audioTrack = muxer.addTrack(audioFormat)
                  val audioBuf = ByteBuffer.allocate(sampleSize)
                  val audioBufferInfo = MediaCodec.BufferInfo()
                  // Log.d("AppLog", "Audio Format $audioFormat")
                  muxer.start()
                  // Log.d("AppLog", "muxing video&audio...")
                  // val minimalDurationInMs = Math.min(videoDurationInMs, audioDurationInMs)
                  while (true) {
                      videoBufferInfo.size = videoExtractor.readSampleData(videoBuf, 0)
                      audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, 0)
                      if (audioBufferInfo.size < 0) {
                          // Log.d("AppLog", "reached end of audio, looping...")
                          // TODO somehow start from beginning of the audio again, for looping till the video ends
                          // audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
                          // audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, 0)
                      }
                      if (videoBufferInfo.size < 0 || audioBufferInfo.size < 0) {
                          // Log.d("AppLog", "reached end of video")
                          videoBufferInfo.size = 0
                          audioBufferInfo.size = 0
                          break
                      } else {
                          // val donePercentage = videoExtractor.sampleTime / minimalDurationInMs / 10L
                          // Log.d("AppLog", "$donePercentage")
                          // video muxing
                          videoBufferInfo.presentationTimeUs = videoExtractor.sampleTime
                          videoBufferInfo.flags = videoExtractor.sampleFlags
                          muxer.writeSampleData(videoTrack, videoBuf, videoBufferInfo)
                          videoExtractor.advance()
                          // audio muxing
                          audioBufferInfo.presentationTimeUs = audioExtractor.sampleTime
                          audioBufferInfo.flags = audioExtractor.sampleFlags
                          muxer.writeSampleData(audioTrack, audioBuf, audioBufferInfo)
                          audioExtractor.advance()
                      }
                  }
                  muxer.stop()
                  muxer.release()
                  // Log.d("AppLog", "success")
                  return true
              } catch (e: Exception) {
                  e.printStackTrace()
                  // Log.d("AppLog", "Error " + e.message)
              }
              return false
          }
      }
      

    • I also tried looking at how FFMPEG libraries (here, here) do this. It works, but it has some possible issues: the library seems to take up a lot of space, it has annoying licensing terms, and for some reason the sample couldn't play the output file I created unless I removed a part of the command, which made the conversion much slower. I would really prefer to use the built-in API rather than this library, even though it is a very powerful one... Also, for some input files it didn't seem to loop...

The questions

        1. How can I mux the video & audio files so that the audio will loop in case it is shorter (in duration) than the video?


        2. How can I do it so that the audio will get cut precisely when the video ends (no remainder of either video or audio)?


        3. How can I check, before calling this function, whether the current device can handle the given input files and actually mux them? Is there a way to check at runtime which formats are supported for this kind of operation, instead of relying on a list in the docs that might change in the future?
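
        On question 3, one possible runtime probe (a minimal sketch, and only a proxy: it checks that MediaExtractor can open the file and that a decoder exists for its first track, which doesn't strictly guarantee that muxing will succeed):

            import android.media.MediaCodecList
            import android.media.MediaExtractor
            import android.media.MediaFormat

            // Best-effort check: can this device open the file and find a decoder for track 0?
            fun isTrackLikelySupported(path: String): Boolean = try {
                val extractor = MediaExtractor()
                val format = try {
                    extractor.setDataSource(path)
                    extractor.getTrackFormat(0)
                } finally {
                    extractor.release()
                }
                // per the MediaCodecList docs, the format must not contain a frame rate here
                format.setString(MediaFormat.KEY_FRAME_RATE, null)
                MediaCodecList(MediaCodecList.REGULAR_CODECS).findDecoderForFormat(format) != null
            } catch (e: Exception) {
                false // setDataSource failed: the extractor can't handle this file at all
            }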

Answer

        I have the same scenario.


        • 1: When audioBufferInfo.size < 0, seek back to the start. But remember, you need to accumulate presentationTimeUs.


        • 2: Get the video duration, and when the audio has looped up to that duration (again tracked via presentationTimeUs), cut it off.


        • 3: The audio file needs to be MediaFormat.MIMETYPE_AUDIO_AMR_NB, MediaFormat.MIMETYPE_AUDIO_AMR_WB, or MediaFormat.MIMETYPE_AUDIO_AAC. On my test machines, it worked fine.
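
        For instance, a quick up-front check along those lines (a minimal sketch, assuming track 0 of the file is the audio track):

            import android.media.MediaExtractor
            import android.media.MediaFormat

            private val MUXABLE_AUDIO_MIMES = setOf(
                MediaFormat.MIMETYPE_AUDIO_AMR_NB, // "audio/3gpp"
                MediaFormat.MIMETYPE_AUDIO_AMR_WB, // "audio/amr-wb"
                MediaFormat.MIMETYPE_AUDIO_AAC     // "audio/mp4a-latm"
            )

            fun isAudioMuxable(path: String): Boolean {
                val extractor = MediaExtractor()
                return try {
                    extractor.setDataSource(path)
                    val mime = extractor.getTrackFormat(0).getString(MediaFormat.KEY_MIME)
                    mime != null && mime in MUXABLE_AUDIO_MIMES
                } catch (e: Exception) {
                    false
                } finally {
                    extractor.release()
                }
            }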

        Here is the code:

        private fun muxing(musicName: String) {
            val saveFile = File(DirUtils.getPublicMediaPath(), "$saveName.mp4")
            if (saveFile.exists()) {
                saveFile.delete()
                PhotoHelper.sendMediaScannerBroadcast(saveFile)
            }
            try {
                // get the video file duration in microseconds
                val duration = getVideoDuration(mSaveFile!!.absolutePath)
        
                saveFile.createNewFile()
        
                val videoExtractor = MediaExtractor()
                videoExtractor.setDataSource(mSaveFile!!.absolutePath)
        
                val audioExtractor = MediaExtractor()
                val afdd = MucangConfig.getContext().assets.openFd(musicName)
                audioExtractor.setDataSource(afdd.fileDescriptor, afdd.startOffset, afdd.length)
        
                val muxer = MediaMuxer(saveFile.absolutePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)
        
                videoExtractor.selectTrack(0)
                val videoFormat = videoExtractor.getTrackFormat(0)
                val videoTrack = muxer.addTrack(videoFormat)
        
                audioExtractor.selectTrack(0)
                val audioFormat = audioExtractor.getTrackFormat(0)
                val audioTrack = muxer.addTrack(audioFormat)
        
                var sawEOS = false
                val offset = 100
                val sampleSize = 1000 * 1024
                val videoBuf = ByteBuffer.allocate(sampleSize)
                val audioBuf = ByteBuffer.allocate(sampleSize)
                val videoBufferInfo = MediaCodec.BufferInfo()
                val audioBufferInfo = MediaCodec.BufferInfo()
        
                videoExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
                audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
        
                muxer.start()
        
                val frameRate = videoFormat.getInteger(MediaFormat.KEY_FRAME_RATE)
                val videoSampleTime = 1000 * 1000 / frameRate // nominal frame duration in microseconds
        
                while (!sawEOS) {
                    videoBufferInfo.offset = offset
                    videoBufferInfo.size = videoExtractor.readSampleData(videoBuf, offset)
        
                    if (videoBufferInfo.size < 0) {
                        sawEOS = true
                        videoBufferInfo.size = 0
        
                    } else {
                        videoBufferInfo.presentationTimeUs += videoSampleTime
                        videoBufferInfo.flags = videoExtractor.sampleFlags
                        muxer.writeSampleData(videoTrack, videoBuf, videoBufferInfo)
                        videoExtractor.advance()
                    }
                }
        
                var sawEOS2 = false
                var sampleTime = 0L
                while (!sawEOS2) {

                    audioBufferInfo.offset = offset
                    audioBufferInfo.size = audioExtractor.readSampleData(audioBuf, offset)

                    if (audioBufferInfo.presentationTimeUs >= duration) {
                        // the accumulated audio timestamps reached the video's duration: cut here
                        sawEOS2 = true
                        audioBufferInfo.size = 0
                    } else {
                        if (audioBufferInfo.size < 0) {
                            // end of the audio file: remember how far we got, rewind and loop again
                            sampleTime = audioBufferInfo.presentationTimeUs
                            audioExtractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)
                            continue
                        }
                    }
                    // shift this pass's timestamps by the accumulated offset, keeping them increasing
                    audioBufferInfo.presentationTimeUs = audioExtractor.sampleTime + sampleTime
                    audioBufferInfo.flags = audioExtractor.sampleFlags
                    muxer.writeSampleData(audioTrack, audioBuf, audioBufferInfo)
                    audioExtractor.advance()
                }
        
                muxer.stop()
                muxer.release()
                videoExtractor.release()
                audioExtractor.release()
                afdd.close()
            } catch (e: Exception) {
                LogUtils.e(TAG, "Mixer Error:" + e.message)
            }
        }
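
        Note that getVideoDuration, DirUtils, PhotoHelper, MucangConfig and LogUtils are presumably helpers from the answerer's own project; you would replace them with your own file handling, logging, and duration lookup (the duration must be in microseconds, since it is compared against presentationTimeUs).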
        
