Regex and Unicode

Views: 58. Published: 2021/7/6 19:14:13. Tags: python, regex, unicode, character-properties


Problem description

I have a script that parses the filenames of TV episodes (show.name.s01e02.avi, for example), grabs the episode name (from the www.thetvdb.com API) and automatically renames them into something nicer (Show Name - [01x02].avi).
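For illustration, the parse-and-rename step described above could look roughly like this. It is a minimal sketch, not the actual script: `EPISODE_RE` and `nicer_name` are hypothetical names, the pattern only handles the sNNeNN form, and the www.thetvdb.com lookup is left out entirely.

```python
import re

# Hypothetical pattern for names such as "show.name.s01e02.avi".
# It is far simpler than the patterns used by the real script.
EPISODE_RE = re.compile(
    r"^(?P<show>.+?)[ ._-][Ss](?P<season>\d+)[Ee](?P<episode>\d+)\.(?P<ext>\w+)$"
)


def nicer_name(filename):
    """Return a name like 'Show Name - [01x02].avi', or None if nothing matches."""
    match = EPISODE_RE.match(filename)
    if match is None:
        return None
    # Turn separators into spaces and capitalise each word of the show name.
    show = match.group("show").replace(".", " ").replace("_", " ").title()
    season = int(match.group("season"))
    episode = int(match.group("episode"))
    return "%s - [%02dx%02d].%s" % (show, season, episode, match.group("ext"))


print(nicer_name("show.name.s01e02.avi"))  # Show Name - [01x02].avi
```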

The script works fine until you try to use it on files that have Unicode show names (something I never really thought about, since all the files I have are English, so pretty much all of them fall within [a-zA-Z0-9'\-]).
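The limitation is easy to reproduce in isolation. The sketch below assumes Python 3 and uses a hypothetical helper, `is_ascii_name`, built around the character class quoted above; "Pokémon" is just an example of an accented show name.

```python
import re

# The ASCII-only class quoted above, anchored so the whole name must match.
ASCII_NAME = re.compile(r"^[a-zA-Z0-9'\-]+$")


def is_ascii_name(name):
    """Return True if every character of name falls inside the ASCII class."""
    return ASCII_NAME.match(name) is not None


print(is_ascii_name("Heroes"))   # True
print(is_ascii_name("Pokémon"))  # False, because of the accented letter
```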

How can I allow the regular expressions to match accented characters and the like? Currently the regex config section looks like this:

config['valid_filename_chars'] = """0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!@£$%^&*()_+=-[]{}"'.,<>`~? """
config['valid_filename_chars_regex'] = re.escape(config['valid_filename_chars'])

config['name_parse'] = [
    # foo_[s01]_[e01]
    re.compile('''^([%s]+?)[ \._\-]\[[Ss]([0-9]+?)\]_\[[Ee]([0-9]+?)\]?[^\\/]*$''' % (config['valid_filename_chars_regex'])),
    # foo.1x09*
    re.compile('''^([%s]+?)[ \._\-]\[?([0-9]+)x([0-9]+)[^\\/]*$''' % (config['valid_filename_chars_regex'])),
    # foo.s01.e01, foo.s01_e01
    re.compile('''^([%s]+?)[ \._\-][Ss]([0-9]+)[\._\- ]?[Ee]([0-9]+)[^\\/]*$''' % (config['valid_filename_chars_regex'])),
    # foo.103*
    re.compile('''^([%s]+)[ \._\-]([0-9]{1})([0-9]{2})[\._ -][^\\/]*$''' % (config['valid_filename_chars_regex'])),
    # foo.0103*
    re.compile('''^([%s]+)[ \._\-]([0-9]{2})([0-9]{2,3})[\._ -][^\\/]*$''' % (config['valid_filename_chars_regex'])),
]
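One possible direction, sketched under assumptions rather than taken from the script: in Python 3, `\w` in a `str` pattern matches Unicode letters and digits, so it can stand in for the explicit alphanumeric whitelist. The names `NAME_CHARS`, `SEASON_EPISODE` and `parse` are hypothetical, the punctuation kept in `NAME_CHARS` is only an illustrative subset of the original list, and only the "foo.s01.e01" pattern is reworked. Filenames are normalised to NFC first, because a decomposed accent (a base letter followed by a combining mark) is not matched by `\w`.

```python
import re
import unicodedata

# Assumption: Python 3, where \w in a str pattern covers Unicode letters/digits.
# The punctuation here is an illustrative subset, not the original whitelist.
NAME_CHARS = r"\w '.,!&()\-"

# Counterpart of the "foo.s01.e01" pattern in the config above.
SEASON_EPISODE = re.compile(
    r"^([%s]+?)[ ._\-][Ss]([0-9]+)[.\- ]?[Ee]([0-9]+)[^\\/]*$" % NAME_CHARS,
    re.UNICODE,  # default for str patterns in Python 3; needed explicitly in Python 2
)


def parse(filename):
    """Return (show name, season, episode), or None if the name does not match."""
    # Compose base letter + combining mark into a single code point.
    filename = unicodedata.normalize("NFC", filename)
    match = SEASON_EPISODE.match(filename)
    if match is None:
        return None
    name, season, episode = match.groups()
    return name, int(season), int(episode)


print(parse("Pokémon.s01e02.avi"))
```

Using `\w` is broader than the original whitelist: it accepts letters and digits from any script, which may or may not be what the script wants.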
