How to find all the turning points on a kernel density curve when window width varies


Question

I want to partition a data series using kernel density. Here is my plan:

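  1. Calculate the density of the series using a kernel density function (such as density()) with different window widths.
  2. On each kernel density curve, one per window width, find all the turning points (both minima and maxima) and use them to partition the data.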

So I need to know where those turning points are in the original data series. I read some material such as https://stats.stackexchange.com/questions/30750/finding-local-extrema-of-a-density-function-using-splines, but I do not really understand the method. In that method, d$x[tp$tppos] does not appear to be an index into the original data. So how can I find the positions of all the turning points in the original data, based on the kernel density curve?

Another related question is: how to find all the minimal/maximal points?

An example of the data series:

a <- c(21.11606, 15.22204, 16.27281, 15.22204, 15.22204, 21.11606, 19.32840, 15.22204, 20.25594, 15.22204, 14.28352, 15.22195, 19.32840, 19.32840, 15.22204, 14.28352, 21.11606, 21.19069, 15.22204, 25.26564, 15.22204, 19.32840, 21.11606, 15.22204, 15.22204, 19.32840, 15.22204, 19.32840, 15.22204, 15.22204, 21.13656, 15.22204, 15.22204, 19.32840, 15.22204, 17.98954, 15.22204, 15.22204, 15.22204, 15.22204, 15.22204, 19.32840, 15.22204, 14.28352, 15.22204, 19.32840, 15.22204, 19.32840, 25.42281, 21.19069)

Recommended answer

When you compute the density of a with Da = density(a), the result contains y values evaluated on a grid of many x values; that is what the plot is drawn from. Note that Da$x is an evenly spaced grid (512 points by default) over the range of the data, so indices into Da$x are not indices into a. To find the "turning points", you need to find the places where the derivative changes sign. Since the x values in Da$x are increasing, each difference Da$y[i] - Da$y[i-1] has the same sign as the slope of the curve between grid points i-1 and i. You can find where these differences change sign by finding where the product of consecutive differences is negative. Putting this all together, we get:

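Da = density(a)
# Differences between consecutive density values; their sign approximates the slope
DeltaY = diff(Da$y)
# A negative product of neighbouring differences marks a sign change, i.e. a turning point
Turns = which(DeltaY[-1] * DeltaY[-length(DeltaY)] < 0) + 1

# Plot the density and mark the turning points in red
plot(Da, xlab="", ylab="", main="")
points(Da$x[Turns], Da$y[Turns], pch=16, col="red")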
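
The question also asks how to tell minima from maxima and how to get back to positions in the original series. The answer does not cover this, so the following is only a rough sketch of one possible approach. It assumes that the local minima of the density are used as cut points and that each observation is assigned to the interval it falls into. The data x and the names maxima, minima, cuts, segment and positions are illustrative and do not come from the original post.

```r
# Hypothetical illustration data (not the series from the question):
# two clusters, so the density has at least one interior minimum.
set.seed(42)
x <- c(rnorm(60, mean = 15, sd = 0.5), rnorm(40, mean = 20, sd = 0.8))

d  <- density(x)
dy <- diff(d$y)

# Grid indices where consecutive differences change sign.
idx <- which(dy[-1] * dy[-length(dy)] < 0) + 1

# Rising before the point means a local maximum, falling means a local minimum.
maxima <- idx[dy[idx - 1] > 0]
minima <- idx[dy[idx - 1] < 0]

# d$x holds locations on the data scale, not indices into x.
# Assumption: use the minima as cut points to assign observations to segments.
cuts    <- d$x[minima]
segment <- findInterval(x, cuts) + 1

# Positions in the original series that fall into each segment.
positions <- split(seq_along(x), segment)
```

Here segment[i] is the partition label of x[i], and positions lists, for each segment, the indices of the original series that belong to it. Cutting at the minima is a design choice; cutting at every turning point would work the same way with d$x[idx] as the cut points.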

You can get different "window widths" (bandwidths) using the adjust parameter of density(). Be aware, however, that as you make adjust smaller, the density curve will develop many more maxima and minima.
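
To see how the number of turning points depends on the window width, the same search can be repeated over several values of adjust. This is a minimal sketch under stated assumptions: the data x, the helper name turning_points and the chosen adjust values are all illustrative and not part of the original answer.

```r
# Hypothetical illustration data (not the series from the question).
set.seed(42)
x <- c(rnorm(60, mean = 15, sd = 0.5), rnorm(40, mean = 20, sd = 0.8))

# Illustrative helper: grid indices where the slope of the density changes sign.
turning_points <- function(d) {
  dy <- diff(d$y)
  which(dy[-1] * dy[-length(dy)] < 0) + 1
}

# Bandwidth multipliers to try; the values are arbitrary examples.
adjust_values <- c(0.25, 0.5, 1, 2)

results <- lapply(adjust_values, function(adj) {
  d  <- density(x, adjust = adj)
  tp <- turning_points(d)
  data.frame(adjust = rep(adj, length(tp)),
             bw     = rep(d$bw, length(tp)),  # bandwidth actually used
             x      = d$x[tp],                # location on the data scale
             y      = d$y[tp])                # density value at the turning point
})
names(results) <- adjust_values

# Number of turning points found for each multiplier.
sapply(results, nrow)
```

Each element of results holds the turning points for one window width, so the partitions obtained at different widths can be compared side by side.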
