What is the best way to do a find and replace of multiple queries on multiple files?

Question

I have a file that has over 200 lines in this format:

name old_id new_id

The name is useless for what I'm trying to do currently, but I still want it there because it may become useful for debugging later.

Now I need to go through every file in a folder and find all the instances of old_id and replace them with new_id. The files I'm scanning are code files that could be thousands of lines long. I need to scan every file with each of the 200+ ids that I have, because some may be used in more than one file, and multiple times per file.

What is the best way to go about doing this? So far I've been creating Python scripts to figure out the list of old ids and new ids and which ones match up with each other, but I've been doing it very inefficiently: I scanned the first file line by line and got the id on the current line, then scanned the second file line by line until I found a match. I repeated this for each line in the first file, which meant reading the second file many times over. I didn't mind doing this inefficiently because they were small files.
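The repeated rescanning described above can usually be avoided by loading one list into a dictionary once and looking names up in it. The sketch below is only an illustration: the question does not show the two intermediate lists, so the `name id` line format, the choice of the name as the matching key, and the function names `load_ids_by_name` and `match_ids` are all assumptions.

```python
def load_ids_by_name(lines):
    """Map each name to its id, assuming lines of the form 'name id'."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:  # skip blank or malformed lines
            table[parts[0]] = parts[1]
    return table


def match_ids(old_lines, new_lines):
    """Return 'name old_id new_id' rows for names present in both lists."""
    new_by_name = load_ids_by_name(new_lines)  # built once, looked up many times
    rows = []
    for name, old_id in load_ids_by_name(old_lines).items():
        if name in new_by_name:
            rows.append(f"{name} {old_id} {new_by_name[name]}")
    return rows
```

Each list is read exactly once this way, so the work grows with the total number of lines rather than with the product of the two file lengths.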

Now that I'm searching somewhere around 30-50 files that can each have thousands of lines of code, I want it to be a little more efficient. This is just a hobbyist project, so it doesn't need to be super good. I just don't want it to take more than 5 minutes to find and replace everything, only to look at the result, see that I made a little mistake, and need to do it all over again. Taking a few minutes is fine (although I'm sure computers nowadays can still do this almost instantly), but I don't want it to be ridiculous.

So what's the best way to go about doing this? So far I've been using python but it doesn't need to be a python script. I don't care about elegance in the code or way I do it or anything, I just want an easy way to replace all of my old ids with my new ids using whatever tool is easiest to use or implement.

Example:

Here is a line from the list of ids. The first part is the name and can be ignored, the second part is the old id, and the third part is the new id that needs to replace the old id.

unlock_music_play_grid_thumb_01 0x108043c 0x10804f0

Here is an example line in one of the files to be replaced:

const v1, 0x108043c

I need to be able to replace that id with the new id so it looks like this:

const v1, 0x10804f0
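The before and after lines above can be reproduced with a single compiled regular expression that handles every id in one pass. This is a minimal sketch, not a definitive implementation: `build_replacer` and `replace_ids` are hypothetical names, and the `\b` word boundaries are an added assumption that ids appear as whole tokens, so that `0x108043c` is not rewritten inside a longer value such as `0x108043cf`.

```python
import re


def build_replacer(id_lines):
    """Return a function that swaps every old id for its new id in one pass."""
    mapping = {}
    for line in id_lines:
        fields = line.split()
        if len(fields) == 3:  # name old_id new_id
            _name, old_id, new_id = fields
            mapping[old_id] = new_id
    if not mapping:
        return lambda text: text  # nothing to replace
    # \b keeps an id from matching inside a longer token (an assumption, see above)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, mapping)) + r")\b")
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)


replace_ids = build_replacer(["unlock_music_play_grid_thumb_01 0x108043c 0x10804f0"])
print(replace_ids("const v1, 0x108043c"))  # prints: const v1, 0x10804f0
```

Because each match is replaced exactly once, an id that is both an old id on one line and a new id on another is not rewritten a second time, which chained calls to `str.replace` would not guarantee.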

Recommended answer

Use something like multiwordReplace (I've edited it for your situation) together with mmap.

import os
import os.path
import re
from mmap import mmap, ACCESS_READ


id_filename = 'path/to/id/file'
directory_name = 'directory/to/replace/in'

# read the ids into a dictionary mapping old to new
# (opened in binary mode because an mmap exposes the file contents as bytes)
with open(id_filename, 'rb') as id_file:
    ids = dict(line.split()[1:3] for line in id_file if line.strip())

# compile a regex to do the replacement; longer ids come first so that a
# shorter id never shadows a longer one that starts with it
id_regex = re.compile(b'|'.join(map(re.escape, sorted(ids, key=len, reverse=True))))

def translate(match):
    return ids[match.group(0)]

def multiword_replace(text):
    return id_regex.sub(translate, text)

for code_filename in os.listdir(directory_name):
    code_path = os.path.join(directory_name, code_filename)
    # skip subdirectories and empty files (a zero-length file cannot be mapped)
    if not os.path.isfile(code_path) or os.path.getsize(code_path) == 0:
        continue
    with open(code_path, 'rb') as code_file:
        with mmap(code_file.fileno(), 0, access=ACCESS_READ) as code_map:
            new_contents = multiword_replace(code_map)
    with open(code_path, 'wb') as code_file:
        code_file.write(new_contents)
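Since the question mentions the cost of spotting a mistake only after everything has been rewritten, a dry run that reports how many ids would change in each file, without writing anything, may be worth running first. The sketch below is an assumption-laden illustration: `count_replacements` is a hypothetical helper, it reads files as UTF-8 text rather than through mmap, and `mapping` is an old-id to new-id dictionary like the `ids` dictionary in the answer, but with `str` keys and values.

```python
import os
import re


def count_replacements(directory, mapping):
    """Return {filename: number of ids that would be replaced}, writing nothing."""
    if not mapping:
        return {}
    pattern = re.compile("|".join(map(re.escape, mapping)))
    counts = {}
    for filename in sorted(os.listdir(directory)):
        path = os.path.join(directory, filename)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        with open(path, encoding="utf-8", errors="replace") as code_file:
            text = code_file.read()
        # subn returns the new text and the number of substitutions made
        _new_text, hits = pattern.subn(lambda m: mapping[m.group(0)], text)
        if hits:
            counts[filename] = hits
    return counts
```

Comparing these counts against what you expect (for example, a file you know uses an id twice) is a quick sanity check before overwriting the files.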
