用xlwings排序(pywin32) [英] Sorting with xlwings (pywin32)

查看:103
本文介绍了用xlwings排序(pywin32)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要使用python按给定的行对excel电子表格进行排序.为了进行测试,我使用了这些数据(在名为xlwings sorting.xlsx的文件中):

I need to use python to sort an excel spreadsheet by a given row. For the testing, I'm using this data (in a file named xlwings sorting.xlsx):

Numbers Letters Letters_2
7   A   L
6   B   K
5   C   M
4   D   J
3   E   N
2   F   I
1   G   H

应归类为以下内容:

Numbers Letters Letters_2
1   G   H
2   F   I
3   E   N
4   D   J
5   C   M
6   B   K
7   A   L

人们会认为这是一项琐碎的任务,但是在任何 xlwings 文档或

One would think this to be a trivial task, but there seems to be nothing in the way of any documentation (if there is something, it's buried so deep that two days of reading hasn't uncovered it) in either the xlwings docs or the pywin32 ones regarding column sorting.

我可以在网上找到的最接近的东西是这个问题,它没有答案,只是重定向到没有解决方案的github bug线程.

The closest thing I could find anywhere online was this question, which has no answer and just redirects to a github bug thread that had no resolution.

尽管如此,我还是根据提问者的努力将以下代码拼凑在一起:

Still, I have managed to cobble together the following code based on the questioner's:

import xlwings as xw
from xlwings.constants import SortOrder

bk = xw.Book(r'C:\Users\username\Documents\Test Files\xlwings sorting.xlsx')

sht = bk.sheets['Sheet1']

def xl_col_sort(sht,col_num):
    sht.range('a2').api.Sort(sht.range((2,col_num)).api,SortOrder.xlAscending)
    return

xl_col_sort(sht,1)

这可以运行,但是我不知道语法是如何工作的.我什至不知道为什么要先进行 range('a2')调用,但是如果我尝试直接调用 sht.api.Sort ,它会引发异常.我试图直接用ipython的代码查看代码?功能,但它只为我提供< xlwings._xlwindows.COMRetryObjectWrapper对象,位于0x0000001375A459E8> ,没有文档字符串.然后,我尝试通过 Sort()函数的.py文件实际按Ctrl + F,但是在庞大的COM包装器列表中遇到了死胡同,无法找到包含以下内容的实际模块功能.

This runs, but I have no idea how the syntax is working. I can't even tell why the first range('a2') call is necessary, but it throws an exception if I try directly calling sht.api.Sort. I tried looking directly at the code with ipython's ?? feature but it just gives me <xlwings._xlwindows.COMRetryObjectWrapper object at 0x0000001375A459E8> with no docstring. I then tried to actually ctrl+F through the .py files for the Sort() function, but ran into a dead end in a huge list of COM wrappers and couldn't track down the actual module containing the function.

无论如何,即使我不知道如何运行,测试用例也是如此.因此,下一步是将该函数放入一个包含excel工作簿和表格的类中,以将该函数用作方法.我重写了代码,既可以用作方法,也可以使用字符串而不是列号(新列经常添加到工作表的中间,因此数目经常更改):

At any rate, even if I haven't a clue how, the test case works; so the next step is putting this function into a class that contains an excel workbook and sheet to use the function as a method. I rewrite the code both to be used as a method and to take strings instead of column numbers (new columns are added to the middle of the worksheet frequently, so the number would change often):

class Metrics:

    # self.sheet is a sheet object based on self.book opened with xlwings
    # a bunch of other methods and attributes

    def xl_col_sort(self,col):

        # +2 because excel starts at 1 (+1) and the dataframe self.df
        # uses a data column as the index (+1)
        col_num = np.where(self.df.columns == col)[0][0] + 2

        so = xw.constants.SortOrder

        self.sheet.range('a2').api.Sort(self.sheet.range((2,col_num)).api, so.xlAscending)
        return

在这里,我看不到任何功能发生了变化.即使它们经过了另外一个要创建的步骤,它仍然会收到相同的参数.但是尝试运行此命令会产生 MemoryError :

I can't see that anything has functionally changed, here. It's still receiving the same arguments, even if they go through an additional step to be created. Yet attempting to run this produces a MemoryError:

In[1]:    metrics.xl_col_sort('Exp. Date')
---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-3-f1de8b0e8e98> in <module>()
----> 1 metrics.xl_col_sort('Exp. Date')

C:\Users\username\Documents\Projects\PyBev\pyBev_0-3-1\pybev\metricsobj.py in xl_col_sort(self, col)
    146         so = xw.constants.SortOrder
    147 
--> 148         self.sheet.range('a2').api.Sort(self.sheet.range((2,col_num)).api, so.xlAscending)
    149         return
    150     # def monday_backup(self):
C:\Users\username\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\xlwings\main.py in range(self, cell1, cell2)
    818                 raise ValueError("Second range is not on this sheet")
    819             cell2 = cell2.impl
--> 820         return Range(impl=self.impl.range(cell1, cell2))
    821 
    822     @property
C:\Users\username\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\xlwings\_xlwindows.py in range(self, arg1, arg2)
    576             if 0 in arg1:
    577                 raise IndexError("Attempted to access 0-based Range. xlwings/Excel Ranges are 1-based.")
--> 578             xl1 = self.xl.Cells(arg1[0], arg1[1])
    579         elif isinstance(arg1, numbers.Number) and isinstance(arg2, numbers.Number):
    580             xl1 = self.xl.Cells(arg1, arg2)
C:\Users\username\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\xlwings\_xlwindows.py in __call__(self, *args, **kwargs)
    149         for i in range(N_COM_ATTEMPTS + 1):
    150             try:
--> 151                 v = self._inner(*args, **kwargs)
    152                 t = type(v)
    153                 if t is CDispatch:
C:\Users\username\AppData\Local\Enthought\Canopy\edm\envs\User\lib\site-packages\win32com\client\dynamic.py in __call__(self, *args)
    190                 if invkind is not None:
    191                         allArgs = (dispid,LCID,invkind,1) + args
--> 192                         return self._get_good_object_(self._oleobj_.Invoke(*allArgs),self._olerepr_.defaultDispatchName,None)
    193                 raise TypeError("This dispatch object does not define a default method")
    194 
MemoryError: CreatingSafeArray

有人知道这个东西的语法是如何工作的,或者为什么将它放入方法后会中断呢?

Does anyone know how the syntax of this thing works or why it's breaking when put inside the method?

推荐答案

事实证明这是一个非常微妙的错误,因此我认为我会发布答案,以防万一有人在一年中最终使用谷歌搜索以尝试做某事相似.

This turned out to be an incredibly subtle error, so I figured I'd post the answer in case someone ends up googling this in a year trying to do something similar.

简而言之, sheet.range()方法仅接受整数的坐标,以及表达式:

In short, the sheet.range() method only accepts coordinates that are integers, and the expression:

col_num = np.where(self.df.columns == col)[0][0] + 2

产生一个浮点数.为什么这会产生 MemoryError 而不是语法错误,这超出了我的范围,但可能是一个疏忽.尽管开发人员确实对此有所了解.

produces a floating point number. Why this produces a MemoryError instead of a syntax error is beyond me, but probably an oversight. The devs do seem to know about it, though.

此外,上述文档中未列出该语法,因为它实际上是VBA代码,如

Additionally, the syntax is not listed in the aforementioned docs because it is actually VBA code, as found here. The Sort() method only works on Range objects, hence the first sht.range() call requirement.

最后,万一有人想要一个简化的功能来封装所有这些废话:

And finally, in case anyone wants a simplified function to encapsulate all this nonsense:

import xlwings as xw


bk = xw.Book(file_path)
sheet = bk.sheets['Sheet1'] # or whatever the sheet is named

def xl_col_sort(sheet,col_num):
    sheet.range((2,col_num)).api.Sort(Key1=sheet.range((2,col_num)).api, Order1=1)
return

这篇关于用xlwings排序(pywin32)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆