Pandas: interpolate data with units

Question

Hi everyone,

I've been reading Stack Overflow for a couple of years, and it has helped me so much that I never had to register before :)

But today I'm stuck on a problem using Python with Pandas and Quantities (it could be unum or pint as well). I'll do my best to write a clear post, but since it's my first one, I apologize if something is confusing, and I will try to correct any mistake you find :)

I want to import data from a source and build a Pandas DataFrame as follows:

import pandas as pd
import quantities as pq

depth = [0.0,1.1,2.0] * pq.m
depth2 = [0,1,1.1,1.5,2] * pq.m

s1 = pd.DataFrame(
        {'depth' : [x for x in depth]},
        index = depth)

This gives:

S1=
     depth
0.0  0.0 m
1.1  1.1 m
2.0  2.0 m


Now I want to extend the data to the depth2 values (obviously there is no point in interpolating depth over depth, but it's a test before things get more complicated):

s2 = s1.reindex(depth2)

This gives:

S2=
      depth
0.0   0.0 m
1.0   NaN
1.1   1.1 m
1.5   NaN
2.0   2.0 m

So far, no problem.

But when I try to interpolate the missing values with:

s2['depth'].interpolate(method='values')

I get the following error:

C:\Python27\lib\site-packages\numpy\lib\function_base.pyc in interp(x, xp, fp, left, right)
   1067         return compiled_interp([x], xp, fp, left, right).item()
   1068     else:
-> 1069         return compiled_interp(x, xp, fp, left, right)
  1070 
  1071 
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

I understand that NumPy's interpolation does not work on object arrays.

But if I interpolate the missing values after dropping the units, it works:

s3 = s2['depth'].astype(float).interpolate(method='values')

This gives:

s3 = 
0.0   0
1.0   1
1.1   1.1
1.5   1.5
2.0   2
Name: depth, dtype: object


How can I get the units back in the 'depth' column?

I can't find any trick to put the unit back...

Any help will be greatly appreciated. Thanks

Recommended answer

OK, I found a solution. It might not be the best one, but it works just fine for my problem:

import pandas as pd
import quantities as pq

def extendAndInterpolate(input, newIndex):
    """Extend a pandas DataFrame to newIndex and interpolate the missing values."""
    output = pd.concat([input, pd.DataFrame(index=newIndex)], axis=1)

    for col in output.columns:
        # (1) Try to retrieve the unit of the current column
        try:
            # if it succeeds, store the unit
            unit = 1 * output[col].iloc[0].units
        except AttributeError:
            # if it fails, the column holds plain values or strings,
            # so use a neutral factor of 1
            unit = 1

        # (2) Check the type of value
        if isinstance(output[col].iloc[0], str):
            # strings cannot be interpolated: forward-fill the missing cells
            value = output[col].ffill()
        else:
            # numeric values: to be able to interpolate, you need to
            #   - (a) drop the unit with astype(float)
            #   - (b) interpolate the values
            #   - (c) attach the unit again
            value = [x * unit for x in output[col].astype(float).interpolate(method='values')]

        # (3) Store the interpolated values back in the extended table
        output[col] = pd.Series(value, index=output.index)

    # Return the output DataFrame
    return output

Then:

depth = [0.0,1.1,2.0] * pq.m
depth2 = [0,1,1.1,1.5,2] * pq.m

s1 = pd.DataFrame(
        {'depth' : [x for x in depth]},
        index = depth)

s2 = extendAndInterpolate(s1, depth2)

Result:

s1
     depth
0.0  0.0 m
1.1  1.1 m
2.0  2.0 m

s2     
     depth
0.0  0.0 m
1.0  1.0 m
1.1  1.1 m
1.5  1.5 m
2.0  2.0 m
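
For readers who want to try the strip / interpolate / reattach idea without installing a units package, here is a minimal sketch. It assumes the magnitudes are stored as plain floats and the unit is carried alongside as a simple label; the function name `interpolate_magnitudes` and the `unit_label` argument are hypothetical and not part of pandas or quantities.

```python
import pandas as pd


def interpolate_magnitudes(series, new_index, unit_label):
    """Reindex a float Series onto new_index and interpolate the gaps.

    Assumption: the unit is tracked separately as a plain label
    (unit_label) instead of being stored inside each cell.
    """
    # Reindexing introduces NaN wherever new_index has no matching label.
    extended = series.reindex(new_index).astype(float)
    # method="values" interpolates against the numeric index values,
    # not against the row positions.
    filled = extended.interpolate(method="values")
    return filled, unit_label


depth = pd.Series([0.0, 1.1, 2.0], index=[0.0, 1.1, 2.0], name="depth")
values, unit = interpolate_magnitudes(depth, [0.0, 1.0, 1.1, 1.5, 2.0], "m")

# Reattach the unit only when presenting the values.
labelled = ["{:g} {}".format(v, unit) for v in values]
```

Keeping the column as floats avoids the object-dtype error from the question; the trade-off is that the unit is no longer checked by a units library, so it is up to the caller to keep the label consistent.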

Thanks for your help.
