Find and replace strings in Excel (.xlsx) using Python


Question

I am trying to replace a bunch of strings in an .xlsx sheet (~70k rows, 38 columns). The strings to search for and their replacements are listed in a file, formatted as below:

bird produk - bird product
pig - pork
ayam - chicken
...
kuda - horse

The string to search for is on the left and its replacement is on the right (find 'bird produk', replace with 'bird product'). My .xlsx sheet looks something like this:

name     type of animal     ID
ali      pig                3483
abu      kuda               3940
ahmad    bird produk        0399
...
ahchong  pig                2311

I am looking for the fastest solution, since the list contains around 200 entries and the .xlsx file is quite large. I need to use Python for this, but I am open to other, faster solutions.
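As a starting point, the list format described above can be loaded into a dictionary before any replacing is done. This is a minimal sketch, assuming one "search - replacement" pair per line and " - " as the separator; the function name `parse_replacements` is made up for illustration.

```python
def parse_replacements(lines, sep=" - "):
    """Build a {search: replacement} dict from 'search - replacement' lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or sep not in line:
            continue  # skip blank lines and lines without the separator
        # Split only on the first separator so replacements may contain " - ".
        search, replacement = line.split(sep, 1)
        mapping[search.strip()] = replacement.strip()
    return mapping


# Sample lines taken from the list shown in the question.
sample = ["bird produk - bird product", "pig - pork", "ayam - chicken", "kuda - horse"]
reps = parse_replacements(sample)
```

The function accepts any iterable of strings, so an open text file can be passed in directly in place of the sample list.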

Edit: added a sheet example.

Edit 2: I tried some Python code to read the cells, but it took quite a long time to run. Any pointers?

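from xlrd import open_workbook

# Note: xlrd 2.0 and later read only .xls files; opening .xlsx needs xlrd < 2.0.
wb = open_workbook('test.xlsx')

for s in wb.sheets():
    print('Sheet:', s.name)
    for row in range(s.nrows):
        # Print every cell value in the row, one per line.
        for col in range(s.ncols):
            print(s.cell(row, col).value)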

Thanks!

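Edit 3: I finally figured it out. Both the VBA module and the Python code work. I resorted to .csv instead to make things easier. Thanks! Here is my version of the Python code: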

import csv

###### our dictionary with our key:values. ######
reps = {
    'JUALAN (PRODUK SHJ)' : 'SALE( PRODUCT)',
    'PAMERAN' : 'EXHIBITION',
    'PEMBIAKAN' : 'BREEDING',
    'UNGGAS' : 'POULTRY'}


def replace_all(text, dic):
    # Apply each replacement in turn to the whole text.
    for i, j in dic.items():
        text = text.replace(i, j)
    return text

with open('test.csv','r') as f:
    text=f.read()
    text=replace_all(text,reps)

with open('file2.csv','w') as w:
    w.write(text)
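The code above treats the whole CSV file as one string and scans it once per dictionary entry. A possible variation, sketched here under assumptions, parses the file with the `csv` module and applies all replacements to each cell in a single regex pass. The helper names `build_replacer` and `replace_in_csv` are hypothetical, and the sample data is invented for the demonstration. Unlike chained `str.replace` calls, a single pass never re-replaces text that an earlier replacement produced.

```python
import csv
import io
import re


def build_replacer(mapping):
    """Return a function that applies all replacements in a single pass."""
    if not mapping:
        return lambda text: text  # nothing to replace
    # Longest keys first, so a longer key wins over a shorter overlapping one.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)


def replace_in_csv(src, dst, mapping):
    """Copy CSV rows from src to dst, replacing text inside each cell."""
    replace = build_replacer(mapping)
    writer = csv.writer(dst)
    for row in csv.reader(src):
        writer.writerow([replace(cell) for cell in row])


# In-memory demonstration; real files would be opened with newline="".
reps = {"PAMERAN": "EXHIBITION", "UNGGAS": "POULTRY"}
src = io.StringIO("name,activity\nali,PAMERAN UNGGAS\n")
dst = io.StringIO()
replace_in_csv(src, dst, reps)
result = dst.getvalue()
```

Going through `csv.reader` and `csv.writer` keeps quoting intact when a cell or a replacement contains a comma, which plain string replacement on the raw file does not guarantee.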


Recommended answer

I would copy the contents of your text file into a new worksheet in the Excel file and name that sheet "Lookup". Then use Text to Columns to split the data into the first two columns of this new sheet, starting in the first row.

Paste the following code into a module in Excel and run it:

Sub Replacer()
    Dim w1 As Worksheet
    Dim w2 As Worksheet
    Dim i As Long

    'The sheet with the words from the text file:
    Set w1 = ThisWorkbook.Sheets("Lookup")
    'The sheet with all of the data:
    Set w2 = ThisWorkbook.Sheets("Data")

    For i = 1 To w1.Range("A1").CurrentRegion.Rows.Count
        w2.Cells.Replace What:=w1.Cells(i, 1), Replacement:=w1.Cells(i, 2), LookAt:=xlPart, _
        SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
        ReplaceFormat:=False
    Next i

End Sub
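For readers who want to stay in Python, the macro above could be approximated roughly as follows. This is only a sketch: it assumes the third-party openpyxl package is installed and that the data sits in a sheet named "Data", and the function names are hypothetical. It mimics the case-insensitive, partial-match behaviour of `MatchCase:=False` and `LookAt:=xlPart`, but it applies all replacements in one pass per cell, whereas the macro runs them one after another.

```python
import re


def make_replacer(mapping):
    """Case-insensitive, partial-match replacer built from a dict."""
    if not mapping:
        return lambda text: text  # nothing to replace
    lowered = {k.lower(): v for k, v in mapping.items()}
    # Longest keys first, so a longer key wins over a shorter overlapping one.
    keys = sorted(lowered, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys), re.IGNORECASE)
    # Fall back to the matched text if its lowercase form is not a known key.
    return lambda text: pattern.sub(
        lambda m: lowered.get(m.group(0).lower(), m.group(0)), text
    )


def replace_in_workbook(path, out_path, mapping, sheet_name="Data"):
    """Rewrite the string cells of one sheet and save to out_path."""
    from openpyxl import load_workbook  # imported lazily: optional dependency

    replace = make_replacer(mapping)
    wb = load_workbook(path)
    ws = wb[sheet_name]
    for row in ws.iter_rows():
        for cell in row:
            if isinstance(cell.value, str):  # leave numbers and dates alone
                cell.value = replace(cell.value)
    wb.save(out_path)


# The replacer can be tried on plain strings without any workbook.
replace = make_replacer({"bird produk": "bird product", "kuda": "horse"})
```

Saving to a separate output path leaves the original workbook untouched, which makes it easy to compare the result before overwriting anything.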
