Find and replace strings in Excel (.xlsx) using Python


Problem description

I am trying to replace a bunch of strings in an .xlsx sheet (~70k rows, 38 columns). I have a file listing the strings to search for and their replacements, formatted as below:

bird produk - bird product
pig - pork
ayam - chicken
...
kuda - horse

The term to search for is on the left and its replacement is on the right (for example, find 'bird produk' and replace it with 'bird product'). My .xlsx sheet looks something like this:

name     type of animal     ID
ali      pig                3483
abu      kuda               3940
ahmad    bird produk        0399
...
ahchong  pig                2311

I am looking for the fastest solution, since the list contains around 200 terms and the .xlsx file is quite large. I need to use Python for this, but I am open to any other faster solution.
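Before any replacing can happen, the list file has to be turned into a search-to-replacement mapping. The following is a minimal sketch of that step, assuming each line uses the literal separator " - " (space, hyphen, space) as in the sample above and that no search term itself contains that separator. The function name `parse_replacements` is hypothetical, not part of the original post.

```python
def parse_replacements(lines):
    """Parse 'search - replacement' lines into a dict (assumed format)."""
    mapping = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '...' placeholder used in the sample.
        if not line or line == "...":
            continue
        # Split on the first ' - ' only, so hyphens inside words survive.
        search, sep, replacement = line.partition(" - ")
        if not sep:
            continue  # no separator found: ignore the line
        mapping[search.strip()] = replacement.strip()
    return mapping


sample = [
    "bird produk - bird product",
    "pig - pork",
    "ayam - chicken",
    "...",
    "kuda - horse",
]
reps = parse_replacements(sample)
print(reps)
```

With a real file, the same function could be fed an open file object, since iterating over a file yields its lines.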

Edit:- Added a worksheet example.

Edit2:- I tried some Python code to read the cells, but it took quite a long time to read. Any pointers?

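# Note: xlrd 2.0 and later no longer read .xlsx files; this needs xlrd < 2.0.
from xlrd import open_workbook

wb = open_workbook('test.xlsx')

# Print every cell of every sheet, row by row.
for s in wb.sheets():
    print('Sheet:', s.name)
    for row in range(s.nrows):
        for col in range(s.ncols):
            print(s.cell(row, col).value)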

Thanks!

Edit3:- I finally figured it out. Both the VBA module and the Python code work. I switched to .csv to make things easier. Thanks! Here is my version of the Python code:

# Replacement dictionary: search string -> replacement string.
reps = {
    'JUALAN (PRODUK SHJ)': 'SALE( PRODUCT)',
    'PAMERAN': 'EXHIBITION',
    'PEMBIAKAN': 'BREEDING',
    'UNGGAS': 'POULTRY',
}


def replace_all(text, dic):
    # Apply every replacement in turn to the whole text.
    for i, j in dic.items():
        text = text.replace(i, j)
    return text


# Read the whole CSV as plain text and apply the replacements.
with open('test.csv', 'r') as f:
    text = f.read()

text = replace_all(text, reps)

with open('file2.csv', 'w') as w:
    w.write(text)
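The replace_all function above scans the full text once per dictionary entry, so roughly 200 entries mean roughly 200 passes. One possible alternative, sketched here as an assumption rather than a measured improvement, is to compile all search strings into a single regular expression and replace them in one pass per cell. The names `build_replacer` and `replace_in_csv` are hypothetical. Unlike chained `str.replace` calls, text produced by one replacement is not scanned again for other keys.

```python
import csv
import io
import re


def build_replacer(reps):
    """Return a function that applies all replacements in a single pass."""
    if not reps:
        return lambda text: text
    # Longest keys first, so a longer phrase wins over a shorter key
    # that happens to be its prefix.
    keys = sorted(reps, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: reps[m.group(0)], text)


def replace_in_csv(src, dst, reps):
    """Copy CSV rows from src to dst, replacing inside every cell."""
    replace = build_replacer(reps)
    writer = csv.writer(dst)
    for row in csv.reader(src):
        writer.writerow([replace(cell) for cell in row])


# In-memory demo; real files would be opened with newline=''.
reps = {"PAMERAN": "EXHIBITION", "UNGGAS": "POULTRY"}
src = io.StringIO("name,type\nali,PAMERAN\nabu,UNGGAS\n")
dst = io.StringIO()
replace_in_csv(src, dst, reps)
print(dst.getvalue())
```

Matching here is case-sensitive, like `str.replace`. Going through the `csv` module rather than raw text keeps quoted fields and embedded commas intact.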

Recommended answer

I would copy the contents of your text file into a new worksheet in the Excel file and name that sheet "Lookup". Then use Text to Columns to split the data into the first two columns of this new sheet, starting in the first row.

Paste the following code into a module in Excel and run it:

Sub Replacer()
    Dim w1 As Worksheet
    Dim w2 As Worksheet
    Dim i As Long

    'The sheet with the words from the text file:
    Set w1 = ThisWorkbook.Sheets("Lookup")
    'The sheet with all of the data:
    Set w2 = ThisWorkbook.Sheets("Data")

    'One Replace call per lookup row: column A is the search term, column B the replacement.
    For i = 1 To w1.Range("A1").CurrentRegion.Rows.Count
        w2.Cells.Replace What:=w1.Cells(i, 1), Replacement:=w1.Cells(i, 2), _
            LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False, _
            SearchFormat:=False, ReplaceFormat:=False
    Next i

End Sub
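For readers who want to stay in Python, the macro's approach can be approximated with the third-party openpyxl package. This is a rough sketch under several assumptions: the sheet names "Lookup" and "Data" match the macro, the function names `make_cell_replacer` and `replace_in_workbook` are hypothetical, and matching is case-insensitive on partial cell contents to mirror MatchCase:=False and LookAt:=xlPart. It has not been benchmarked against the macro.

```python
import re


def make_cell_replacer(pairs):
    """Apply (search, replacement) pairs in order, ignoring case."""
    compiled = [
        (re.compile(re.escape(search), re.IGNORECASE), repl)
        for search, repl in pairs
        if search
    ]

    def replace(value):
        if not isinstance(value, str):
            return value  # leave numbers, dates, and empty cells alone
        for pattern, repl in compiled:
            # A function replacement keeps any backslashes in repl literal.
            value = pattern.sub(lambda m, repl=repl: repl, value)
        return value

    return replace


def replace_in_workbook(path, out_path, lookup_sheet="Lookup", data_sheet="Data"):
    # Imported lazily: openpyxl is third-party and only needed here.
    from openpyxl import load_workbook

    wb = load_workbook(path)
    # Columns A and B of the lookup sheet hold search and replacement.
    pairs = [
        (str(a), "" if b is None else str(b))
        for a, b in wb[lookup_sheet].iter_rows(min_col=1, max_col=2, values_only=True)
        if a is not None
    ]
    replace = make_cell_replacer(pairs)
    # Note: formula cells are strings too (starting with '=') and
    # would be rewritten as well.
    for row in wb[data_sheet].iter_rows():
        for cell in row:
            if isinstance(cell.value, str):
                cell.value = replace(cell.value)
    wb.save(out_path)


# Demo of the pure helper; no workbook is needed for this part.
replace = make_cell_replacer([("bird produk", "bird product"), ("pig", "pork")])
print(replace("Bird Produk"))
print(replace(3483))
```

A call such as `replace_in_workbook("test.xlsx", "test_replaced.xlsx")` would write the result to a new file and leave the original untouched; both file names are placeholders.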
