Extract information from Excel into a Python 2D array


Question

I have an Excel sheet with dates, times, and temperatures that looks like this:

Using Python, I want to extract this info into Python arrays.

Each array would hold the date in position 0 and then store the temps in the following positions, like this:

temparray[0] = [20130102,34.75,34.66,34.6,34.6,....,34.86]
temparray[1] = [20130103,34.65,34.65,34.73,34.81,....,34.64]

Here is my attempt, but it sucks:

from xlrd import open_workbook

wb = open_workbook('temp.xlsx')
print(wb)

for s in wb.sheets():
    for row in range(s.nrows):
        # Collect every cell value of the current row
        values = []
        for col in range(s.ncols):
            values.append(s.cell(row, col).value)
        print(values[0])
        print("%.2f" % values[1])
        print()

I used xlrd, but I am open to using anything. Thank you for your help.

Recommended answer

From what I understand of your question, the problem is that you want the output to be a list of lists, and you're not getting one.

And that's because there's nothing in your code that even tries to build one. For each row, you build a list, print out the first value of that list, print out the second value of that list, and then forget the list.

To append each of those row lists to a big list of lists, all you have to do is exactly the same thing you're doing to append each column value to the row lists:

temparray = []
for row in range(s.nrows):
    values = []
    for col in range(s.ncols):
        values.append(s.cell(row,col).value)
    temparray.append(values)
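
The same list of lists can also be written as a nested list comprehension. The sketch below is only an illustration: `StubSheet` and its `Cell` are hypothetical stand-ins that mimic the `nrows`, `ncols` and `cell()` members used above, so the snippet runs without a workbook. With a real xlrd sheet bound to `s`, only the comprehension would be needed.

```python
from collections import namedtuple

# Hypothetical stand-in for an xlrd sheet; it exposes only nrows, ncols and cell().
Cell = namedtuple("Cell", "value")


class StubSheet:
    def __init__(self, grid):
        self._grid = grid
        self.nrows = len(grid)
        self.ncols = len(grid[0]) if grid else 0

    def cell(self, row, col):
        return Cell(self._grid[row][col])


s = StubSheet([
    [20130102, "00:00", 34.75],
    [20130102, "00:15", 34.66],
])

# One inner list per row, one entry per column: same result as the nested loops.
temparray = [[s.cell(row, col).value for col in range(s.ncols)]
             for row in range(s.nrows)]
print(temparray[1][2])  # 34.66
```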


From your comment, it looks like what you actually want is not only this, but also to group the temperatures by day and, for each day, to add only the temperature column rather than all of the values. That is not at all what you described in the question. In that case, you shouldn't be looping over the columns at all. What you want is something like this:

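days = []
current_day, current_date = [], None
for row in range(s.nrows):
    # cell() returns a Cell object; .value is the stored value
    date = s.cell(row, 0).value
    if date != current_date:
        # New date: start a fresh list and register it in days
        current_day, current_date = [], date
        days.append(current_day)
    # Column 2 holds the temperature (column 1 is the time)
    current_day.append(s.cell(row, 2).value)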

This code assumes that the dates are always in sorted order, as they are in your input screenshot.
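
The loop above collects only the temperatures. To get the layout from the question, with the date in position 0, each new list can be seeded with the date. Here is a minimal sketch of that idea. The `rows` list is made-up sample data standing in for the sheet (date, time, temp), and `group_with_date_first` is a hypothetical helper name. It relies on the same sorted-dates assumption.

```python
# Made-up (date, time, temp) rows standing in for the worksheet contents.
rows = [
    (20130102, "00:00", 34.75),
    (20130102, "00:15", 34.66),
    (20130103, "00:00", 34.65),
    (20130103, "00:15", 34.65),
    (20130103, "00:30", 34.73),
]


def group_with_date_first(rows):
    """Return one list per date, shaped [date, temp, temp, ...].

    Assumes rows with the same date are adjacent (dates are sorted).
    """
    temparray = []
    current_date = None
    for date, _time, temp in rows:
        if date != current_date:
            # First row of a new date: start a list with the date in position 0.
            current_date = date
            temparray.append([date])
        temparray[-1].append(temp)
    return temparray


temparray = group_with_date_first(rows)
print(temparray[0])  # [20130102, 34.75, 34.66]
```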

I would probably structure this differently, building a row iterator to pass to itertools.groupby, but I wanted to keep this as novice-friendly, and as close to your original code, as possible.
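
For comparison, here is a rough sketch of what the itertools.groupby variant might look like. It is one possible shape, not necessarily the structure the answer had in mind. The `rows` list is made-up sample data (date, time, temp) in place of an iterator over the sheet. Like the loop, groupby only merges adjacent rows, so it also needs the dates to be sorted.

```python
from itertools import groupby
from operator import itemgetter

# Made-up (date, time, temp) rows; a real version would yield these from the sheet.
rows = [
    (20130102, "00:00", 34.75),
    (20130102, "00:15", 34.66),
    (20130103, "00:00", 34.65),
    (20130103, "00:15", 34.65),
]

# groupby yields (key, group) pairs for each run of consecutive rows with the same date.
days = [
    [date] + [temp for _date, _time, temp in group]
    for date, group in groupby(rows, key=itemgetter(0))
]
print(days)  # [[20130102, 34.75, 34.66], [20130103, 34.65, 34.65]]
```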

Also, I suspect you don't really want this:

[[date1, temp1a, temp1b, temp1c], 
 [date2, temp2a, temp2b]]

... but rather something like this:

{date1: [temp1a, temp1b, temp1c], 
 date2: [temp2a, temp2b]}

But without knowing what you're intending to do with this info, I can't tell you how best to store it.
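
If the dictionary form does turn out to fit, one way to build it is with collections.defaultdict. This is a sketch on made-up sample rows (date, time, temp), not on the real sheet. Unlike the grouping loop, it does not depend on the dates being sorted, because every temperature is filed under its own date key.

```python
from collections import defaultdict

# Made-up (date, time, temp) rows, deliberately not sorted by date.
rows = [
    (20130102, "00:00", 34.75),
    (20130103, "00:00", 34.65),
    (20130102, "00:15", 34.66),
    (20130103, "00:15", 34.73),
]

temps_by_date = defaultdict(list)
for date, _time, temp in rows:
    # A missing key is created with an empty list on first access.
    temps_by_date[date].append(temp)

print(temps_by_date[20130102])  # [34.75, 34.66]
```

Looking up one day's temperatures is then a single key access, which is harder to do with the list-of-lists layout.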
