Python string.replace() not replacing characters


Question

Some background information: where I work we have an ancient web-based document database system, consisting almost entirely of MS Office documents with the "normal" extensions (.doc, .xls, .ppt). They are all named after some sort of arbitrary ID number (e.g. 1245.doc). We're switching to SharePoint, and I need to rename all of these files and sort them into folders. I have a CSV file with all sorts of information (such as which ID number corresponds to which document's title), so I'm using it to rename the files. I've written a short Python script that replaces the ID number in each file name with the document's title.

However, some of the document titles contain slashes and other characters that may be invalid in a file name, so I want to replace them with underscores:

bad_characters = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]
for letter in bad_characters:
    filename = line[2].replace(letter, "_")
    foldername = line[5].replace(letter, "_")

  • Example of line[2]: "Blah blah boring - meeting 2/19/2008.doc"
  • Example of line[5]: "Business meetings 2/2008"

When I add print letter inside the for loop, it prints the character it is supposed to be replacing, but the character is not actually replaced with an underscore the way I want.

Is there anything I'm doing wrong here?

Recommended answer

That's because filename and foldername get thrown away with each iteration of the loop. The .replace() method returns a new string, but every pass starts again from the original line[2] and line[5], so the result of the previous replacement is never carried forward. Only the replacement for the last character in the list survives.
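To make the failure concrete, here is a minimal sketch that runs the loop from the question on the sample title. The names title and buggy are introduced only for this illustration; they are not in the original script.

```python
bad_characters = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]
title = "Blah blah boring - meeting 2/19/2008.doc"  # sample value from the question

# Each pass starts again from the untouched `title`, so earlier results are lost.
for letter in bad_characters:
    buggy = title.replace(letter, "_")

# The last character tried is "*", which does not occur in the title,
# so the final value is the original string, slash and all.
print(buggy == title)  # True
```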

You should use:

filename = line[2]
foldername = line[5]

for letter in bad_characters:
    filename = filename.replace(letter, "_")
    foldername = foldername.replace(letter, "_")
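Since the same loop is applied to both fields, it could also be wrapped in a small helper. This is only a sketch: the function name sanitize and its replacement parameter are assumptions made for illustration, not part of the original script.

```python
BAD_CHARACTERS = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]

def sanitize(name, replacement="_"):
    """Return `name` with every character in BAD_CHARACTERS replaced."""
    for letter in BAD_CHARACTERS:
        # Feed the result of each pass into the next one.
        name = name.replace(letter, replacement)
    return name

print(sanitize("Blah blah boring - meeting 2/19/2008.doc"))
# Blah blah boring - meeting 2_19_2008.doc
print(sanitize("Business meetings 2/2008"))
# Business meetings 2_2008
```

With a helper like this, the two assignments would become filename = sanitize(line[2]) and foldername = sanitize(line[5]).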
      

But I would do it using a regex. It's cleaner and (likely) faster:

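import re

# One character class covering every bad character; the raw string keeps
# the backslash escape readable and avoids an invalid "\)" string escape.
p = re.compile(r'[/\\:()<>|?*]')
filename = p.sub('_', line[2])
foldername = p.sub('_', line[5])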
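If the bad_characters list from the question is kept around, one option is to build the pattern from it so the list and the regex cannot drift apart. The following is a sketch under that assumption; pattern and sanitize_re are hypothetical names, and re.escape is used so each character is safe inside the character class.

```python
import re

bad_characters = ["/", "\\", ":", "(", ")", "<", ">", "|", "?", "*"]

# re.escape makes each character safe to place inside the [...] class.
pattern = re.compile("[" + "".join(re.escape(c) for c in bad_characters) + "]")

def sanitize_re(name):
    """Replace every bad character in `name` with an underscore in one pass."""
    return pattern.sub("_", name)

print(sanitize_re("Business meetings 2/2008"))
# Business meetings 2_2008
```

Whether this is actually faster than chained .replace() calls depends on the input, so it is worth timing on real data before relying on that claim.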
      
