Library to extract data from open Excel workbooks

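Problem description

I am trying to extract data from workbooks that are already open.

I have found the xlrd library, but it appears you can only use this with workbooks you open through Python. The workbooks I will use in my project have already been opened, so this method is unusable.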

The second library I found, OpenPyxl, only returns an error for me, even though the workbook is open:

from openpyxl import load_workbook

wb = load_workbook(filename = 'Components V2.4.3.xlsm')

returns:

FileNotFoundError: [Errno 2] No such file or directory: 'Components V2.4.3.xlsm'

Lastly, I tried win32com.client's Dispatch, but I could not get cell values from it, which is why I am looking for an alternative.

Am I doing something wrong with openpyxl, or is there another method I can use?

Solution

Connect to a workbook test.xlsx that is currently open in Excel, and read the value in cell A1 of the first worksheet:

from win32com.client import GetObject
xl = GetObject(None, "Excel.Application")
wb = xl.Workbooks("test.xlsx")
ws = wb.Sheets(1)
ws.Cells(1, 1).Value
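
The same steps can be wrapped in two small functions. This is only a sketch: get_running_excel and read_cell are hypothetical helper names, not part of pywin32. It assumes Windows, pywin32 installed, and Excel already running with the workbook open. The import is done inside the function so that the module still loads on a machine without pywin32.

```python
def get_running_excel():
    """Attach to an Excel instance that is already running (Windows + pywin32)."""
    # Imported here so that merely loading this module does not require pywin32.
    from win32com.client import GetObject
    return GetObject(None, "Excel.Application")


def read_cell(xl, workbook_name, sheet_index, row, column):
    """Return the value of one cell; all indices are 1-based, as in VBA."""
    wb = xl.Workbooks(workbook_name)   # look the workbook up by its file name
    ws = wb.Sheets(sheet_index)        # 1 is the first sheet
    return ws.Cells(row, column).Value
```

With Excel running, read_cell(get_running_excel(), "test.xlsx", 1, 1, 1) would return the content of A1.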

Read a range as a tuple of tuples:

ws.Range("A1:D4").Value
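
Since the range comes back as a tuple of row tuples, plain Python is enough to reshape it. A minimal sketch, assuming the first row of the range holds column headers: rows_to_records is a hypothetical helper, and the sample data is made up for illustration. Note that a single-cell range returns a bare value rather than a nested tuple, which this helper does not handle.

```python
def rows_to_records(values):
    """Turn ((header...), (row...), ...) into a list of dicts keyed by header."""
    header, *rows = values
    return [dict(zip(header, row)) for row in rows]


# Made-up data shaped like the result of ws.Range("A1:B3").Value
sample = (("name", "qty"), ("bolt", 4.0), ("nut", 10.0))
records = rows_to_records(sample)
```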

Write back some values:

ws.Range("A1:D4").Value = [[16, 3, 2, 13], [5, 10, 11, 8], [9, 6, 7, 12], [4, 15, 14, 1]]
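
When writing, the nested list should have the same shape as the target range. One way to avoid a mismatch is to derive the address from the data itself. The following is a sketch under that assumption: column_letter and address_for are hypothetical helpers, not Excel or pywin32 functions.

```python
def column_letter(index):
    """Convert a 1-based column index to its letter name (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def address_for(data, top_row=1, left_column=1):
    """Return the A1-style address that exactly covers a rectangular 2D list."""
    if not data or not data[0]:
        raise ValueError("data must contain at least one value")
    n_rows, n_cols = len(data), len(data[0])
    if any(len(row) != n_cols for row in data):
        raise ValueError("all rows must have the same length")
    first = f"{column_letter(left_column)}{top_row}"
    last = f"{column_letter(left_column + n_cols - 1)}{top_row + n_rows - 1}"
    return f"{first}:{last}"


square = [[16, 3, 2, 13], [5, 10, 11, 8], [9, 6, 7, 12], [4, 15, 14, 1]]
address = address_for(square)
# With a worksheet object ws obtained as above:
# ws.Range(address).Value = square
```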


Answer to the comments: COM (Component Object Model), sometimes referred to as "Automation", allows a Windows application to expose part of its API through a "COM server", which a "COM client" can then call. Excel has such a server, and VBA has a client: you may use CreateObject and GetObject from VBA.

Other applications offer similar services through COM: for instance MATLAB, SAS, Stata, and all applications of Microsoft Office.

Python has a COM client in pywin32. You may also develop a COM server with pywin32; see for instance "Portable Python com server using pywin32".

Note that in the case of Excel, as you noticed, you may access most of the object hierarchy, and control very precisely the behavior of Excel. Basically, if you can do it in VBA, you can do it from any COM client.


Regarding the last row of a range, I'm not sure I understand what you want. Is it what this question covers: "Excel VBA Find last row in range"?


A few more points:

If Excel is not already open, you can still open a connection to Excel. In VBA the function to do this is CreateObject instead of GetObject, but in Python it's Dispatch:

from win32com.client import Dispatch
xl = Dispatch("Excel.Application")
xl.WorksheetFunction.Gamma(0.5)
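
The two cases can be combined: attach to a running Excel if there is one, otherwise start it. This is a hedged sketch. connect_to_excel is a hypothetical helper, and its get_object and dispatch parameters exist only so the function can be exercised without Excel. It assumes that GetObject raises an exception when no Excel instance is running, which is caught broadly here.

```python
def connect_to_excel(get_object=None, dispatch=None):
    """Return (application, was_running) for Excel."""
    if get_object is None or dispatch is None:
        # Lazy import: the real functions need Windows and pywin32.
        from win32com.client import Dispatch, GetObject
        get_object = get_object or GetObject
        dispatch = dispatch or Dispatch
    try:
        # Succeeds only if an Excel instance is already running.
        return get_object(None, "Excel.Application"), True
    except Exception:  # assumption: COM failures surface as Python exceptions
        # No running instance was found, so ask COM to start one.
        return dispatch("Excel.Application"), False
```

The second element of the result tells the caller whether Excel was already open, which matters when deciding whether it is safe to close it afterwards.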

In VBA you will often use Excel "constants", such as xlUp. They are available in Python too, after the connection with Excel has been started with GetObject or Dispatch:

from win32com.client import constants as const
const.xlUp
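
One caveat, stated as an assumption about pywin32's behaviour: win32com.client.constants appears to be populated only once wrappers for the Excel type library have been generated, for example through win32com.client.gencache.EnsureDispatch("Excel.Application"). With a purely dynamic connection, const.xlUp may raise AttributeError. The sketch below falls back to the numeric value of xlUp (-4162 in Excel's XlDirection enumeration) and uses it for the usual VBA idiom that finds the last used row of a column. The helper names are hypothetical.

```python
XL_UP = -4162  # numeric value of xlUp in Excel's XlDirection enumeration


def xl_up():
    """Return constants.xlUp if pywin32 has it loaded, else the numeric fallback."""
    try:
        from win32com.client import constants
        return constants.xlUp
    except (ImportError, AttributeError):
        return XL_UP


def last_used_row(ws, column=1, direction=XL_UP):
    """Python form of VBA's ws.Cells(ws.Rows.Count, column).End(xlUp).Row."""
    # Start from the bottom cell of the column and jump up to the last filled cell.
    return ws.Cells(ws.Rows.Count, column).End(direction).Row
```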

To connect to a COM server installed on your computer, you need the name of the object to request. Here are a few cases:

For Microsoft Office:

Often used in VBScript:

Specialized software:

  • Matlab.Application
  • SAS.Application
  • stata.StataOLEApp

Last remark: as explained here, you can find the documentation of Pywin32 either in the directory where it's installed ([Pythonpath]\Lib\site-packages\PyWin32.chm), or on the web here: http://timgolden.me.uk/pywin32-docs/contents.html
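
As for the openpyxl error in the question: FileNotFoundError with errno 2 means the path could not be found, which usually has to do with the working directory rather than with the workbook being open in Excel. openpyxl reads the file saved on disk, so it would not see unsaved changes either. Here is a small sketch for resolving the path first. find_workbook is a hypothetical helper, and the directory is whichever folder actually holds the file.

```python
from pathlib import Path


def find_workbook(filename, directory=None):
    """Return the absolute path of filename, or raise with a clearer message."""
    # A bare file name is resolved against the current working directory by default.
    base = Path(directory) if directory is not None else Path.cwd()
    path = (base / filename).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"{filename!r} not found in {base.resolve()}")
    return path
```

The returned path can then be passed to load_workbook(filename=...).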
