Django / Python:如何读取文件并验证它是一个音频文件? [英] Django/Python: How to read a file and validate that it is an audio file?

查看:343
本文介绍了Django / Python:如何读取文件并验证它是一个音频文件?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述


可能重复:

Django(音频)文件验证

我正在构建一个网页应用程序,用户可以上传媒体内容,包括音频文件。

I'm building a web app, where users are able to upload media content, including audio files.

我的AudioFileUploadForm 中有一个干净的方法来验证以下:

I've got a clean method in my AudioFileUploadForm that validates the following:


  • 音频文件不是太大。

  • 音频文件有一个有效的content_type(MIME类型)。

  • 音频文件有一个有效的扩展名。

然而,我担心安全性。用户可以上传带有恶意代码的文件,轻松通过上述验证。我接下来要做的是确认音频文件确实是一个音频文件(在写入磁盘之前)。

However, I'm worried about security. A user could upload a file with malicious code, and easily pass the above validations. What I want to do next is validate that the audio file is, indeed, an audio file (before it writes to disk).

我该怎么做? p>

How should I do this?

class UploadAudioForm(forms.ModelForm):
    audio_file = forms.FileField()

    def clean_audio_file(self):
        file = self.cleaned_data.get('audio_file',False):
            if file:
                if file._size > 12*1024*1024:
                    raise ValidationError("Audio file too large ( > 12mb )")
                if not file.content_type in ['audio/mpeg','audio/mp4', 'audio/basic', 'audio/x-midi', 'audio/vorbis', 'audio/x-pn-realaudio', 'audio/vnd.rn-realaudio', 'audio/x-pn-realaudio', 'audio/vnd.rn-realaudio', 'audio/wav', 'audio/x-wav']:
                    raise ValidationError("Sorry, we do not support that audio MIME type. Please try uploading an mp3 file, or other common audio type.")
                if not os.path.splitext(file.name)[1] in ['.mp3', '.au', '.midi', '.ogg', '.ra', '.ram', '.wav']:
                    raise ValidationError("Sorry, your audio file doesn't have a proper extension.")
                # Next, I want to read the file and make sure it is 
                # a valid audio file. How should I do this? Use a library?
                # Read a portion of the file? ...?
                if not ???.is_audio(file.content):
                    raise ValidationError("Not a valid audio file.")
                return file
            else:
                raise ValidationError("Couldn't read uploaded file")

编辑:通过验证音频文件是,确实是一个音频文件,我的意思是:

By "validate that the audio file is, indeed, an audio file", I mean the following:

一个包含音频文件典型数据的文件。我担心用户可以使用适当的标题上传文件,并将恶意脚本代替音频数据。例如... mp3文件是mp3文件?还是包含mp3文件的不寻常的东西?

A file that contains data typical of an audio file. I'm worried that a user could upload files with appropriate headers, and malicious script in the place of audio data. For example... is the mp3 file an mp3 file? Or does it contain something uncharacteristic of an mp3 file?

推荐答案

另一个发布答案的替代方案是解析。这意味着有人仍然可以在有效标题后面包含其他数据。

An alternative to the other posted answer which does header parsing. This means someone could still include other data behind a valid header.

是要验证整个文件的成本更高的CPU,但也有更严格的策略。可以这样做的图书馆是 python audiotools ,相关的API方法是 AudioFile.verify

Is to verify the whole file which costs more CPU but also has a stricter policy. A library that can do this is python audiotools and the relevant API method is AudioFile.verify.

像这样使用:

import audiotools

f = audiotools.open(filename)
try:
    result = f.verify()
except audiotools.InvalidFile:
    # Invalid file.
    print("Invalid File")
else:
    # Valid file.
    print("Valid File")

警告是这个验证方法是非常严格的,实际上可能将错误编码的文件标记为无效。您必须自己决定这是否适合您的用例。

A warning is that this verify method is quite strict and can actually flag badly encoded files as invalid. You have to decide yourself if this is a proper method or not for your use case.

这篇关于Django / Python:如何读取文件并验证它是一个音频文件?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆