根据不同R包中的GPD计算退货水平 [英] Calculation of return levels based on a GPD in different R packages

查看:89
本文介绍了根据不同R包中的GPD计算退货水平的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在对气象数据进行极值分析,以精确到以mm/d为单位的降水数据.我正在使用阈值超额方法通过最大似然法来估计广义Pareto分布的参数.

I am performing an extreme value analysis for meteorological data, to be precise for precipitation data available in mm/d. I am using a threshold excess approach for estimating the parameters of a generalized Pareto distribution with a maximum likelihood method.

目的是计算每日降水的几个回归水平(即2、5、10、20、50、100年的事件).

The aim is to calculate several return levels (i.e. the 2, 5, 10, 20, 50, 100 year event) for daily precipitation.

尽管R代码可以正常工作,但是我想知道为什么在基于具有不同封装的GPD拟合的分位数计算返回水平时得到明显不同的结果.即使每个包装中GPD的估计参数几乎相同,但分位数也相差很大.

While the R code works fine, I am wondering why I get clearly different results when calculating return levels based on the quantiles of the fitted GPD with different packages. Even though the estimated parameters of the GPD are almost identical in each package, the quantiles differ a lot.

我使用的软件包是: ismev,extRemes,evir和POT.

The packages I used are: ismev, extRemes, evir and POT.

我猜想GPD参数的不同估计是由于不同的计算程序造成的,但是我不明白为什么分位数的计算会因不同的软件包而有如此大的差异.

I guess that the different estimates for the parameters of the GPD are due to different calculation routines, but I do not understand why the calculation of the quantiles differs that much depending on the different packages.

虽然lmom,evir和POT返回相同的量化值,但是从extRemes软件包获得的返回周期不同于其他结果.

while lmom, evir and POT return the same quanatile values, the return period derived from the extRemes package differs from the other results.

# packages
library(ismev)
library(extRemes)
library(evir)
library(POT)
library(lmom)

th <- 50

# sample data:
potvalues <- c(
  58.5,44.2,49.6,59.3,48.3,60.9,94.5,47.1,45.3,57.6,48.2,46.2,44.2,50.6,42.1,52.7,80.9,
  58.5,51.3,48.4,51.7,71.9,60.1,64.4,43.5,55.5,49.3,58.2,47.5,43.7,45.2,52.8,42.2,46.4,
  96.1,47.5,50.1,42.4,60.9,72.6,51.6,59.4,80.5,63.7,59.9,45.0,66.7,47.6,53.3,43.1,51.0,
  46.2,53.6,59.8,51.7,46.7,42.6,44.5,45.0,50.0,44.0,89.9,44.2,47.8,53.3,43.0,55.7,44.6,
  44.6,54.9,45.1,43.9,78.7,45.5,64.0,42.7,47.4,57.0,105.4,64.3,43.2,50.4,80.2,49.9,71.6,
  47.4,44.1,47.6,55.2,44.4,78.6,50.8,42.4,47.1,43.5,51.4)

#------------------------------------------------------------------------------------------#

# MLE Fitting of GPD - package extRemes

# fit gpd
pot.ext <- fevd(potvalues, method = "MLE", type="GP", threshold=th)

# return levels:
rl.extremes <-  return.level(pot.ext, conf = 0.05,
                             return.period= c(2,5,10,20,50,100))
rl.extremes <- as.numeric(rl.extremes)

#------------------------------------------------------------------------------------------#

# MLE Fitting of GPD - package ismev

pot.gpd <- gpd.fit(potvalues, threshold=th)

s1 <- quagpa(f=.99, para=c(pot.gpd$threshold, pot.gpd$mle[1],-pot.gpd$mle[2])) # 100
s2 <- quagpa(f=.98, para=c(pot.gpd$threshold, pot.gpd$mle[1],-pot.gpd$mle[2])) #  50
s3 <- quagpa(f=.95, para=c(pot.gpd$threshold, pot.gpd$mle[1],-pot.gpd$mle[2])) #  20
s4 <- quagpa(f=.90, para=c(pot.gpd$threshold, pot.gpd$mle[1],-pot.gpd$mle[2])) #  10
s5 <- quagpa(f=.80, para=c(pot.gpd$threshold, pot.gpd$mle[1],-pot.gpd$mle[2])) #   5
s6 <- quagpa(f=.50, para=c(pot.gpd$threshold, pot.gpd$mle[1],-pot.gpd$mle[2])) #   2

rl.ismev <- c(s6, s5, s4, s3, s2, s1)

#------------------------------------------------------------------------------------------#

# MLE Fitting of GPD - package evir

# fit gpd
gpd.evir <- gpd(potvalues, threshold=th)

# plot
evirplot <- plot(gpd.evir)
1 # Excess Distribution
0 # exit

x100 <- gpd.q(pp=.99, x=evirplot) # 100
x050 <- gpd.q(pp=.98, x=evirplot) #  50
x020 <- gpd.q(pp=.95, x=evirplot) #  20
x010 <- gpd.q(pp=.90, x=evirplot) #  10
x005 <- gpd.q(pp=.80, x=evirplot) #   5
x002 <- gpd.q(pp=.50, x=evirplot) #   2

rl.evir <- t(rbind(x002,x005,x010,x020,x050,x100))
rl.evir <- as.numeric(rl.evir[2,])

#------------------------------------------------------------------------------------------#

# MLE Fitting of GPD - package POT

gpd.pot <- fitgpd(potvalues, threshold=th)
quant = c(0.50, 0.80, 0.90, 0.95, 0.98, 0.99)
rtp <- c(2,5,10,20,50,100)

retvec <- vector()
for (i in quant){
  x <- POT::qgpd(i, loc = th, scale = as.numeric(gpd.pot$param[1]),
            shape = as.numeric(gpd.pot$param[2]))
  retvec <- c(retvec,x)
}

rl.pot <- retvec

#------------------------------------------------------------------------------------------#
# comparison of results - return periods
result <- cbind(rl.extremes,rl.ismev, rl.evir, rl.pot)
round(result, 2)

#------------------------------------------------------------------------------------------#
# comparison of estimated parameters
param.extremes <- pot.ext$results$par # extremes
param.ismev <- pot.gpd$mle # ismev
param.evir <- c(gpd.evir$par.ests[2],gpd.evir$par.ests[1])  # evir
param.pot <- gpd.pot$param # POT

parameters <- cbind(param.extremes, param.ismev , param.evir, param.pot)
round(parameters, 4)

#------------------------------------------------------------------------------------------#

推荐答案

针对此问题的解决方案的说明例如在Coles的书中(极值统计建模简介,第4.3.3章).虽然GEV的回报水平可以直接从其分位数中得出,但是在计算GEV的回报水平时,必须考虑所谓的超标率(即,每年的事件数或一个事件分别超过阈值的可能性). GP在峰值超出阈值范围内.

The solution for this problem is described e.g. in Coles book (An Introduction to Statistical Modeling of Extreme Values, Chapter 4.3.3). While the return levels for a GEV can be derived rather directly from its quantiles, the so called exceedance rate (i.e. number of events per year or the likelihood, that an event exceeds the threshold respectively) has to be considered when calculating return levels for a GP within the scope of a peak over threshold appoach.

N年回报水平定义为

因此,仅计算GP分布的分位数而不考虑超出率时,就无法获得有意义的回报水平结果. extRemes软件包考虑了超出率,如果未指定,则POT和evir软件包中每年事件数的默认值设置为1.

Thus it does not work to obtain meaningful results for return levels when simply calculating the quantiles for the GP distribution without considering the exceedance rate. The extRemes package takes the exceedance rate into account, while the default value for the number of events per year in the POT and evir packages is set to 1 if unspecified.

这篇关于根据不同R包中的GPD计算退货水平的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆