Scatter plot kernel smoothing: ksmooth() does not smooth my data at all


Question

I want to smooth my explanatory variable, something like the speed data of a vehicle, and then use the smoothed values. I searched a lot and found nothing that directly answers my question.

I know how to calculate a kernel density estimate (density() or KernSmooth::bkde()), but I don't know how to then calculate the smoothed values of speed.

Thanks to @ZheyuanLi, I am able to better explain what I have and what I want to do, so I have re-edited my question as below.

I have some speed measurements of a vehicle over time, stored as a data frame vehicle:

         t       speed
1        0   0.0000000
2        1   0.0000000
3        2   0.0000000
4        3   0.0000000
5        4   0.0000000
.        .           .
.        .           .
1031  1030   4.8772222
1032  1031   4.4525000
1033  1032   3.2261111
1034  1033   1.8011111
1035  1034   0.2997222
1036  1035   0.2997222

Here is a scatter plot of speed against t:

I want to smooth speed against t, and I want to use kernel smoothing for this purpose. Following @Zheyuan's advice, I should use ksmooth():

fit <- ksmooth(vehicle$t, vehicle$speed)

However, I found that the smoothed values are exactly the same as my original data:

sum(abs(fit$y - vehicle$speed))  # 0

Why is this happening? Thanks!

Recommended answer

Answer to the old question

You need to distinguish kernel density estimation from kernel smoothing.

Density estimation works with a single variable only. It estimates how the values of that variable are spread out over its domain. For example, if we have 1000 normal samples:

x <- rnorm(1000, 0, 1)

We can assess its distribution with a kernel density estimator:

k <- density(x)
plot(k); rug(x)

The rug marks on the x-axis show the locations of your x values, while the curve measures the density of those marks.

Kernel smoothing is actually a regression problem, or scatter plot smoothing problem. You need two variables: a response variable y and an explanatory variable x. Let's just use the x we have above as the explanatory variable. For the response variable y, we generate some toy values from

y <- sin(x) + rnorm(1000, 0, 0.2)

Given the scatter plot between y and x, we want to find a smooth function to approximate those scattered dots.

The Nadaraya-Watson kernel regression estimate, available through the R function ksmooth(), will help you:

s <- ksmooth(x, y, kernel = "normal")
plot(x,y, main = "kernel smoother")
lines(s, lwd = 2, col = 2)
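
To make the idea behind the Nadaraya-Watson estimate concrete, here is a minimal hand-written sketch: at each evaluation point it takes a kernel-weighted average of y. The names nw_smooth, x0 and h are hypothetical and exist only for this illustration. Here h is the standard deviation of the Gaussian kernel, which is not the same scaling as the bandwidth argument of ksmooth() (ksmooth() scales its kernels so that their quartiles are at +/- 0.25 * bandwidth), so the numbers are not expected to match ksmooth() exactly.

```r
# Nadaraya-Watson estimate at the points x0, using a Gaussian kernel.
# `nw_smooth`, `x0` and `h` are hypothetical names for this sketch.
nw_smooth <- function(x, y, x0, h) {
  sapply(x0, function(p) {
    w <- dnorm((x - p) / h)   # kernel weights: larger for x close to p
    sum(w * y) / sum(w)       # locally weighted average of y
  })
}

# Same toy data as above (the seed is added only for reproducibility).
set.seed(1)
x <- rnorm(1000, 0, 1)
y <- sin(x) + rnorm(1000, 0, 0.2)

x0 <- seq(-2, 2, by = 0.5)
fhat <- nw_smooth(x, y, x0, h = 0.2)

# Compare the estimate with the function used to generate y.
print(round(cbind(x0, fhat, truth = sin(x0)), 2))
```

A larger h averages over more neighbours and gives a smoother but more biased curve; a very small h reproduces the noise.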

If you want to interpret everything in terms of prediction:


  • kernel density estimation: given x, predict the density of x; that is, we have an estimate of the probability P(grid[n] < x < grid[n+1]), where grid is some grid points;
  • kernel smoothing: given x, predict y; that is, we have an estimate of the function f(x), which approximates y.
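
The two kinds of "prediction" in the list above can be put side by side in a short sketch. It reuses the toy data from earlier. The interval (0, 1), the bandwidth 0.5 and the evaluation points are arbitrary choices made for illustration, and the probability is only a rough Riemann-sum approximation on the grid that density() returns.

```r
set.seed(1)
x <- rnorm(1000, 0, 1)
y <- sin(x) + rnorm(1000, 0, 0.2)

# Kernel density estimation: input is x only, output is a density curve
# evaluated on an equally spaced grid (k$x, k$y).
k <- density(x)
dx <- diff(k$x)[1]                     # grid spacing
in_cell <- k$x >= 0 & k$x < 1
p_hat <- sum(k$y[in_cell]) * dx        # rough estimate of P(0 < x < 1)
print(p_hat)
print(pnorm(1) - pnorm(0))             # value under the N(0, 1) that generated x

# Kernel smoothing: input is (x, y), output is an estimate of f at chosen x.
s <- ksmooth(x, y, kernel = "normal", bandwidth = 0.5,
             x.points = c(-1, 0, 1))
print(s$y)                             # estimates of f(-1), f(0), f(1)
print(sin(c(-1, 0, 1)))                # the function used to generate y
```

In neither case does the output contain a "smoothed x": the density estimate describes how x is distributed, and the smoother describes how y depends on x.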

In both cases, you get no smoothed value of the explanatory variable x. So your question, "I want to smooth my explanatory variable", makes no sense.

Do you actually have a time series?

"Speed of a vehicle" sounds like you are monitoring the speed over time t. If so, make a scatter plot of speed against t, and use ksmooth().

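As a sketch of what this could look like for speed against t: the real vehicle data are not shown in full, so the data frame below is a synthetic stand-in and every number in it is made up. One thing to watch is the bandwidth. ksmooth() defaults to kernel = "box" and bandwidth = 0.5, and with t spaced 1 apart each window then contains a single observation, so nothing is averaged. The bandwidth of 20 used below is an arbitrary choice for illustration, not a recommendation.

```r
# Synthetic stand-in for the `vehicle` data frame (assumption: the real
# measurements are taken at t = 0, 1, 2, ..., 1035 as in the question).
set.seed(1)
t <- 0:1035
speed <- pmax(0, 10 * sin(t / 100)^2 + rnorm(length(t), 0, 0.5))
vehicle <- data.frame(t = t, speed = speed)

# Default call: box kernel, bandwidth 0.5. Each window holds only the
# observation at that t, so the fitted values are just the data.
fit0 <- ksmooth(vehicle$t, vehicle$speed)
print(sum(abs(fit0$y - vehicle$speed)))

# A wider window averages over neighbouring time points.
fit <- ksmooth(vehicle$t, vehicle$speed, kernel = "normal",
               bandwidth = 20, x.points = vehicle$t)

# The smoothed curve moves much less from one time step to the next.
print(sum(abs(diff(vehicle$speed))))
print(sum(abs(diff(fit$y))))
```

The result fit is a list with components x and y, so it can be passed to lines() on top of a scatter plot in the same way as s above.
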
Other smoothing approaches such as loess() and smooth.spline() are not kernel smoothers, but you can compare them.
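
A minimal sketch of such a comparison, on the toy data from above. The evaluation grid, the ksmooth() bandwidth of 0.5 and the loess() span of 0.3 are arbitrary choices for illustration; smooth.spline() is left at its defaults, under which the amount of smoothing is chosen by generalized cross-validation.

```r
set.seed(1)
x <- rnorm(1000, 0, 1)
y <- sin(x) + rnorm(1000, 0, 0.2)
grid <- seq(-2, 2, by = 0.5)           # points at which to compare the fits

# Kernel smoother (Nadaraya-Watson).
f_kernel <- ksmooth(x, y, kernel = "normal", bandwidth = 0.5,
                    x.points = grid)$y

# Local regression.
f_loess <- predict(loess(y ~ x, span = 0.3),
                   newdata = data.frame(x = grid))

# Smoothing spline.
f_spline <- predict(smooth.spline(x, y), grid)$y

print(round(cbind(grid, f_kernel, f_loess, f_spline,
                  truth = sin(grid)), 2))
```

All three return an estimate of f(x) at the requested points, so they can be compared directly or drawn on the same scatter plot with lines().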
