Manipulating Excel 2007 files using Python


Problem description

Using Python, I need to be able to perform the following operations on an Excel 2007 workbook:

  1. Delete rows
  2. Sort a worksheet
  3. Get distinct values from a column

I am looking into openpyxl; however, it seems to have limited capabilities.

Can anyone recommend a library that can do the above tasks?

Recommended answer

I want to preface this by letting you know that this is a Windows-only solution. If you are on Windows, I would recommend win32com (distributed as part of the pywin32 package). This module gives Python programmatic access to any Microsoft Office application (including Excel) and uses many of the same methods used in VBA. Usually you record a macro (or recall from memory) how to do something in VBA and then call the same functions from Python.

To start, we want to connect to Excel and get access to the first sheet as an example:

#First we need to access the module that lets us connect to Excel
import win32com.client 

# Next we want to create a variable that represents Excel
app = win32com.client.Dispatch("Excel.Application")   

# Lastly we will assume that the workbook is active and get the first sheet
wbk = app.ActiveWorkbook
sheet = wbk.Sheets(1)

At this point we have a variable named sheet that represents the Excel worksheet we will be working with. There are multiple ways to access the sheet; this is the way I usually demo win32com with Excel because it is very intuitive.

Now assume the first sheet holds the following values. I will go through each of the operations you asked about, one by one:

     A    
1   "d"
2   "c"
3   "b"
4   "a"
5   "c"

Delete Rows: Let's assume that you want to delete the first row in your active sheet.

sheet.Rows(1).Delete()

This results in:

    A
1   "c"
2   "b"
3   "a"
4   "c"

Next, let's sort the cells in ascending order (although I would recommend extracting the values to Python, sorting them in a list, and sending the values back):

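rang = sheet.Range("A1","A4")
# The Sort object needs at least one sort field; Order=1 is xlAscending
sheet.Sort.SortFields.Clear()
sheet.Sort.SortFields.Add(Key=rang, Order=1)
sheet.Sort.SetRange(rang)
sheet.Sort.Apply()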

This results in:

    A
1   "a"
2   "b"
3   "c"
4   "c"
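The parenthetical suggestion above (sort in Python, then send the values back) can be sketched as follows. This is a minimal sketch, not part of the original answer: sorted_column is a hypothetical helper name, and it assumes a multi-cell, single-column range whose Value comes back as a tuple of one-element tuples. The win32com lines are left as comments because they need a running Excel instance.

```python
def sorted_column(column_values):
    """Sort the values of a single-column range, keeping the row-per-tuple shape."""
    # Flatten (("c",), ("b",), ...) into ["c", "b", ...]
    flat = [row[0] for row in column_values]
    # Rebuild the one-value-per-row shape so it can be assigned back to a range
    return tuple((value,) for value in sorted(flat))


# Stand-in for sheet.Range("A1", "A4").Value after the first row was deleted
before = (("c",), ("b",), ("a",), ("c",))
after = sorted_column(before)

# With a live sheet this would be used roughly as (assumption, not run here):
#   rang = sheet.Range("A1", "A4")
#   rang.Value = sorted_column(rang.Value)
```

Sorting on the Python side keeps the logic independent of Excel's Sort object, at the cost of one read and one write across the COM boundary.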

Now we will get the distinct values from the column. The main thing to take away here is how to extract values from cells. You can either select many cells at once with sheet.Range("A1","A4"), or you can access the values cell by cell with sheet.Cells(row,col). Range is orders of magnitude faster, but Cells is slightly easier to debug.

#Get a list of all Values using Range
valLstRange = [val[0] for val in sheet.Range("A1","A4").Value]

#Get a list of all Values using Cells (range(1,5) covers rows 1 through 4)
valLstCells = [sheet.Cells(row,1).Value for row in range(1,5)]

#valLstCells and valLstRange both = ["a","b","c","c"]
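Once the values are in a Python list, reducing them to distinct values is a plain Python step. The following is a minimal sketch; distinct is a hypothetical helper name, and the list literal stands in for the values read from the sheet above. It relies on dict preserving insertion order, so the first occurrence of each value is kept in its original position.

```python
def distinct(values):
    """Return the unique values in order of first appearance."""
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(values))


# Stand-in for the list extracted from the sheet
valLstRange = ["a", "b", "c", "c"]
distinct_values = distinct(valLstRange)
```

If the order does not matter, set(valLstRange) does the same job.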

Lastly, you will want to save the workbook, which you can do with the following:

wbk.SaveAs("C:/savedWorkbook.xlsx")

You're done!

About COM

If you have used VBA, .NET, VBScript, or any other language to work with Excel, many of these Excel methods will look familiar. That is because they all use the same library provided by Microsoft. This library uses COM, which is Microsoft's way of providing language-agnostic APIs to programmers. COM itself is an older technology and can be tricky to debug. If you want more information on Python and COM, I highly recommend Python Programming on Win32 by Mark Hammond. He is the guy who gets a shout-out in the official .msi installer after you install Python on Windows.

Alternatives to win32com

I also need to point out that there are several fantastic open source alternatives that can be faster than COM in most situations and work on any OS (Mac, Linux, Windows, etc.). These tools all parse the zipped files that make up a .xlsx. If you did not know that a .xlsx file is a .zip, just change the extension to .zip and you can then explore the contents (kind of interesting to do at least once in your career). Of these I recommend openpyxl, which I have used for parsing and creating Excel files on a server where performance was critical. Never use win32com for server activities: each use opens an out-of-process instance of excel.exe, and those instances can leak.
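The point that a .xlsx file is a zip archive can also be checked from Python with the standard zipfile module, without renaming anything. This is a minimal sketch: list_package_parts is a hypothetical helper name, and the in-memory archive is only a stand-in with two illustrative entries, not a valid workbook. For a real file you would pass its path instead.

```python
import io
import zipfile


def list_package_parts(xlsx_file):
    """Return the names of the files stored inside a .xlsx (zip) archive."""
    # ZipFile accepts a path or a file-like object
    with zipfile.ZipFile(xlsx_file) as archive:
        return archive.namelist()


# Stand-in archive built in memory so the sketch runs without a real workbook
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr("xl/workbook.xml", "<workbook/>")
buffer.seek(0)

parts = list_package_parts(buffer)
```

Called on a workbook saved by Excel, the same function lists the XML parts that the open source libraries parse.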

Recommendation

I would recommend win32com for users who work closely with individual data sets and perform data discovery activities (analysts, financial services, researchers, accountants, business operations, etc.), as it works great with open workbooks. However, developers or users who need to run very large tasks with a small footprint, or who need extremely large manipulations or parallel processing, should use a package such as openpyxl.
