Averaging down a column of averaged data


Problem description

I am writing Python code for a project that has to accomplish a few things:
1) read in data from an xls file, column by column
2) average each row of the columns in groups of three
3) then average the resulting columns

I have accomplished 1 and 2 but can't quite seem to get 3. I think a lot of the trouble I'm having stems from the fact that I am using float, but I need the numbers to 6 decimal places. Any help and patience is appreciated; I'm very new to Python.

v = open("Pt_2_Test_Data.xls", 'wb') #created file to write output to
w = open("test2.xls")

count = 0

for row in w: #read in file
    for line in w:
        columns = line.split("\t") #split up into columns
        date = columns[0]
        time = columns[1]
        a = columns[2]
        b = columns[3]
        c = columns[4]
        d = columns[5]
        e = columns[6]
        f = columns[7]
        g = columns[8]
        h = columns[9]
        i = columns[10]
        j = columns[11]
        k = columns[12]
        l = columns[13]
        m = columns[14]
        n = columns[15]
        o = columns[16]
        p = columns[17]
        q = columns[18]
        r = columns[19]
        s = columns[20]
        t = columns[21]
        u = columns[22]
        LZA = columns[23]
        SZA = columns[24]
        LAM = columns[25]

        count += 1

        A = 0
        if count != 0:  # gets rid of column titles
            filter1 = ((float(a) + float(b) + float(c))/3)
            filter1 = ("%.6f" %filter1)
            filter2 =  (float(d) + float(e) + float(f))/3
            filter2 = ("%.6f" %filter2)
            filter3 =  (float(g) + float(h) + float(i))/3
            filter3 = ("%.6f" %filter3)
            filter4 =  (float(j) + float(k) + float(l))/3
            filter4 = ("%.6f" %filter4)
            filter5 =  (float(m) + float(n) + float(o))/3
            filter5 = ("%.6f" %filter5)
            filter6 =  (float(p) + float(q) + float(r))/3
            filter6 = ("%.6f" %filter6)
            filter7 =  (float(s) + float(t) + float(u))/3
            filter7 = ("%.6f" %filter7)
            A = [filter1, filter2, filter3, filter4, filter5, filter6, filter7]
            A = ",".join(str(x) for x in A).join('[]')

            print A
            avg = [float(sum(col))/float(len(col)) for col in zip(*A)]
            print avg

I have also tried formatting the data like so:

            A = ('{0}    {1}    {2}     {3}    {4}    {5}    {6}    {7}    {8}'.format(date, time, float(filter1), float(filter2), float(filter3), float(filter4), float(filter5), float(filter6), float(filter7))+'\n') # average of triplets
            print A

I thought I could access the values of each column and perform the necessary math on them by indexing, as you would with a dictionary, but this was unsuccessful: the data seemed to be recognized either as a single row (so trying to access any column by [0] was out of bounds) or as individual characters, not as a list of numbers. Is this related to using the float function?
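
The "individual characters" symptom described above can be reproduced in isolation. The following is a minimal sketch in Python 3 syntax (the question's code is Python 2), using made-up values in place of filter1, filter2 and so on. It suggests the cause is the string formatting and joining rather than float itself: once the averages are joined into one string, zip(*A) unpacks that string character by character.

```python
# Hypothetical triplet averages standing in for filter1..filter3.
averages = [1.0, 2.5, 4.25]

# What the question's code does: format each value as text, then join
# everything into one bracketed string.
as_text = ",".join("%.6f" % x for x in averages).join("[]")
print(as_text)  # a single str, not a list of numbers

# zip(*some_string) passes every character as a separate argument,
# so the result is one tuple of single characters.
columns = list(zip(*as_text))
print(columns[0][:3])

# Keeping the floats for the arithmetic and formatting only when
# displaying the result avoids the problem.
row_mean = sum(averages) / len(averages)
print("%.6f" % row_mean)
```

Formatting with "%.6f" only changes how a number is displayed, so it can be applied at the very end without affecting the calculation.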

Recommended answer

I'm not sure I understand which columns you want to average in 3), but maybe this does what you want:

with open("test2.xls") as w:
    w.next()  # skip over header row
    for row in w:
        (date, time, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t,
         u, LZA, SZA, LAM) = row.split("\t")  # split columns into fields

        A = [(float(a) + float(b) + float(c))/3,
             (float(d) + float(e) + float(f))/3,
             (float(g) + float(h) + float(i))/3,
             (float(j) + float(k) + float(l))/3,
             (float(m) + float(n) + float(o))/3,
             (float(p) + float(q) + float(r))/3,
             (float(s) + float(t) + float(u))/3]
        print ('['+ ', '.join(['{:.6f}']*len(A)) + ']').format(*A)
        avg = sum(A)/len(A)
        print avg

You could do the same thing a little more concisely with code like the following:

avg = lambda nums: sum(nums)/float(len(nums))

with open("test2.xls") as w:
    w.next()  # skip over header row
    for row in w:
        cols = row.split("\t")  # split into columns
        # then split that into fields
        date, time, values, LZA, SZA, LAM = (cols[0], cols[1],
                                             map(float, cols[2:23]), 
                                             cols[23], cols[24], cols[25])
        A = [avg(values[i:i+3]) for i in xrange(0, 21, 3)]
        print ('['+ ', '.join(['{:.6f}']*len(A)) + ']').format(*A)
        print avg(A)
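
If step 3 instead means averaging each of the seven triplet-average columns down all rows (one possible reading of the question, not confirmed by the asker), the per-row lists can be collected first and then transposed with zip. A minimal sketch in Python 3 syntax; the helper names triplet_averages and column_means and the sample data are made up for illustration, and the column layout is assumed to match the question's file:

```python
import io

def triplet_averages(values):
    """Average consecutive groups of three numbers."""
    return [sum(values[i:i + 3]) / 3.0 for i in range(0, len(values), 3)]

def column_means(lines):
    """Return the mean of each triplet-average column over all data rows."""
    rows = []
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        values = [float(x) for x in cols[2:23]]  # the 21 measurement columns a..u
        rows.append(triplet_averages(values))
    # zip(*rows) regroups the per-row lists into per-column tuples.
    return [sum(col) / len(col) for col in zip(*rows)]

# Tiny made-up sample in the tab-separated layout the question describes.
header = "\t".join(["date", "time"] + list("abcdefghijklmnopqrstu")
                   + ["LZA", "SZA", "LAM"])
row1 = "\t".join(["d1", "t1"] + ["1"] * 21 + ["0", "0", "0"])
row2 = "\t".join(["d2", "t2"] + ["3"] * 21 + ["0", "0", "0"])
sample = io.StringIO("\n".join([header, row1, row2]) + "\n")

next(sample)  # skip over the header row
means = column_means(sample)
print("[" + ", ".join("{:.6f}".format(m) for m in means) + "]")
```

Here zip(*rows) works as intended because rows holds lists of floats, not a formatted string; the six-decimal formatting is applied only when printing.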
