Interpolate time series, select y value from x


Question

I have been searching for an answer to this for a while, and have gotten close but keep running into errors. There are a lot of similar questions that almost answer this, but I haven't been able to solve it. Any help or a point in the right direction is appreciated.

I have a graph showing temperature as a mostly non-linear function of depth, with the x and y values drawn from a pandas data frame.

import matplotlib.pyplot as plt

x = (22.81,  22.81,  22.78,  22.71,  22.55,  22.54,  22.51,  22.37)
y = (5, 16, 23, 34, 61, 68, 77, 86)

# Plot details
plt.figure(figsize=(10, 7))
plt.title("Temperature as a Function of Depth")
plt.xlabel("Temperature")
plt.ylabel("Depth")
plt.gca().invert_yaxis()  # depth increases downward
plt.plot(x, y, linestyle='--', marker='o', color='b')
plt.show()

Which gives me an image somewhat like this one (note the flipped y axis since I'm talking about depth):

I would like to find the y value at a specific x value of 22.61, which is not one of the original temperature values in the dataset. I've tried the following steps:

np.interp(22.61, x1, y1)

Which gives me a value that I know to be incorrect, as does

s = pd.Series([5,16,23,34,np.nan,61,68,77,86], index=[22.81,22.81,22.78,22.71,22.61,22.55,22.54,22.51,22.37])
s.interpolate(method='index')

where I am trying to just set up a frame and force the interpolation. I also tried

line = plt.plot(x,y)
xvalues = line[0].get_xdata()
yvalues = line[0].get_ydata()
idx = np.where(xvalues==xvalues[3]) ## 3 is the position
yvalues[idx]

but this returns y values for a specific, already-listed x value, rather than an interpolated one.

I hope this is clear enough. I'm brand new to data science, and to stackoverflow, so if I need to rephrase the question please let me know.

Recommended answer

You may indeed use the numpy.interp function. As the documentation states:

The x-coordinates of the data points, must be increasing [...]

So you need to sort both arrays by the x values before using this function.

# Sort arrays
xs = np.sort(x)
ys = np.array(y)[np.argsort(x)]

# x coordinate
x0 = 22.61
# interpolated y coordinate
y0 = np.interp(x0, xs, ys)
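
The sorting step above can also be sketched as a small helper that computes the sort order once and applies it to both arrays. The name `interp_unsorted` is hypothetical (it is not part of NumPy), and the sketch assumes plain linear interpolation between neighbouring points is what is wanted.

```python
import numpy as np

def interp_unsorted(x0, x, y):
    """Linearly interpolate y at x0 when x is not in increasing order.

    Hypothetical helper for illustration; it only wraps np.interp.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")  # indices that sort x ascending
    return float(np.interp(x0, x[order], y[order]))

# Data from the question: temperature (x) falls as depth (y) grows
x = (22.81, 22.81, 22.78, 22.71, 22.55, 22.54, 22.51, 22.37)
y = (5, 16, 23, 34, 61, 68, 77, 86)

y0 = interp_unsorted(22.61, x, y)
print(y0)  # lies between the depths 34 and 61 that bracket 22.61
```

Reusing one `order` array keeps each x value paired with its own y value, which is the same pairing the `np.sort` / `np.argsort` lines above produce.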


Complete Code:

import numpy as np
import matplotlib.pyplot as plt

x = (22.81,  22.81,  22.78,  22.71,  22.55,  22.54,  22.51,  22.37)
y = (5, 16, 23, 34, 61, 68, 77, 86)

# Sort arrays
xs = np.sort(x)
ys = np.array(y)[np.argsort(x)]

# x coordinate
x0 = 22.61
# interpolated y coordinate
y0 = np.interp(x0, xs, ys)

# Plot details
plt.figure(figsize=(10, 7))
plt.title("Temperature as a Function of Depth")
plt.xlabel("Temperature")
plt.ylabel("Depth")
plt.gca().invert_yaxis()  # depth increases downward
plt.plot(x, y, linestyle='--', marker='o', color='b')
# Mark the interpolated point in a contrasting color
plt.plot(x0, y0, marker="o", color="C3")
plt.show()
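
For the pandas attempt in the question, the same idea can be sketched by sorting the index before calling `interpolate(method='index')`. This is a minimal sketch under the assumption that linear interpolation against the index values is acceptable; the variable names `s`, `filled`, and `y0_pd` are illustrative only.

```python
import numpy as np
import pandas as pd

x = [22.81, 22.81, 22.78, 22.71, 22.55, 22.54, 22.51, 22.37]
y = [5, 16, 23, 34, 61, 68, 77, 86]
x0 = 22.61  # temperature at which the depth is wanted

# Append the target temperature with an unknown (NaN) depth
s = pd.Series(y + [np.nan], index=x + [x0], dtype=float)

# Sort by temperature, then fill the gap linearly using the index values
filled = s.sort_index().interpolate(method="index")

y0_pd = filled.loc[x0]
print(y0_pd)
```

Sorting first makes the result independent of the order in which the rows were entered.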
