ISO 8859-1 filename not decoding

Question

I'm extracting files from MIME messages in a python milter and am running across issues with files named as such:

=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?=

I can't seem to decode this name into UTF. In order to solve a prior ISO-8859-1 issue, I started passing all filenames to this function:

def unicodeConvert(self, fname):
    normalized = False

    while normalized == False:
        try:
            fname  = unicodedata.normalize('NFKD', unicode(fname, 'utf-8')).encode('ascii', 'ignore')
            normalized = True
        except UnicodeDecodeError:
            fname = fname.decode('iso-8859-1')#.encode('utf-8')
            normalized = True
        except UnicodeError:
            fname = unicode(fname.content.strip(codecs.BOM_UTF8), 'utf-8')
            normalized = True
        except TypeError:
            fname = fname.encode('utf-8')

    return fname

which was working until I got to this filename.

Any thoughts are appreciated, as always.

Recommended answer

Your string is an RFC 2047 encoded-word: a MIME header value that declares its charset (ISO-8859-1) and uses the "Q" encoding, a variant of Quoted-printable for headers. The email.header module handles this for you:

>>> from email.header import decode_header
>>> try:
...     string_type = unicode  # Python 2
... except NameError:
...     string_type = str      # Python 3
...
>>> for part in decode_header('=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?='):
...     decoded = string_type(*part)
...     print(decoded)
...
Certificado_Zonificación_2010.pdf
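The snippet above works for this filename, but on Python 3 decode_header can also return a plain str with a charset of None when the header contains no encoded-words, and passing that pair to str() raises a TypeError. A minimal sketch of a more defensive helper for Python 3 follows. The function name decode_mime_filename is hypothetical, and falling back to ASCII with replacement characters is an assumption about what is acceptable for your filenames:

```python
from email.header import decode_header


def decode_mime_filename(raw):
    """Decode an RFC 2047 encoded header value into text (Python 3 only).

    Hypothetical helper for illustration; not part of the email package.
    """
    pieces = []
    for value, charset in decode_header(raw):
        if isinstance(value, bytes):
            # Parts of a header that contains encoded-words come back as
            # bytes. Unencoded runs have charset None, so assume ASCII there.
            pieces.append(value.decode(charset or "ascii", errors="replace"))
        else:
            # A header with no encoded-words is returned unchanged as str.
            pieces.append(value)
    return "".join(pieces)


name = decode_mime_filename(
    "=?ISO-8859-1?Q?Certificado=5FZonificaci=F3n=5F2010=2Epdf?="
)
print(name)
```

Note that a header declaring a charset Python does not know would still raise LookupError in this sketch, so you may want to catch that and fall back to a charset of your choosing.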
