Fast loan rate calculation for a big number of loans

Problem description

I have a big data set (around 200k rows) where each row is a loan. I have the loan amount, the number of payments, and the loan payment. I'm trying to get the loan rate. R doesn't have a function for calculating this (at least base R doesn't have it, and I couldn't find one). It isn't hard to write both an NPV and an IRR function:

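# Net present value of the cash flows cf at per-period rate i
Npv <- function(i, cf, t = seq(from = 0, by = 1, along.with = cf)) sum(cf / (1 + i)^t)
# Internal rate of return: the rate at which the NPV is zero
Irr <- function(cf) { uniroot(Npv, c(0, 100000), cf = cf)$root }

Then you can do

# amt is negated (assuming it is stored as a positive number) so that the
# cash flows change sign and uniroot can bracket a root
rate <- Irr(c(-amt, rep(pmt, times = n)))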

The problem comes when you try to calculate the rate for a lot of payments. Because uniroot is not vectorized, and because rep takes a surprising amount of time, you end up with a slow calculation. You can make it faster if you do some math and figure out that you are looking for the root of the following equation:

zerome <- function(r) amt/pmt-(1-1/(1+r)^n)/r

and then use that as the input to uniroot. On my PC, this takes around 20 seconds to run for my 200k-row data set.
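A minimal sketch of that per-loan approach, for illustration only. The helper name `loan_rate`, the search interval `c(1e-6, 1)`, and the three example loans are assumptions and are not from the question; `zerome` is rewritten here to take its inputs as arguments rather than reading them from the global environment.

```r
# Annuity factor subtracted from amt/pmt; the root in r is the per-period rate.
zerome <- function(r, amt, pmt, n) amt / pmt - (1 - 1 / (1 + r)^n) / r

# Hypothetical helper: solve one loan with uniroot.
# The interval (0.0001% to 100% per period) is an assumption.
loan_rate <- function(amt, pmt, n) {
  uniroot(zerome, interval = c(1e-6, 1), amt = amt, pmt = pmt, n = n,
          tol = 1e-10)$root
}

# Made-up example: three loans priced at a known rate of 1% per period,
# using the standard annuity payment formula.
n   <- c(12, 36, 60)
amt <- c(1000, 5000, 20000)
pmt <- amt * 0.01 / (1 - 1.01^(-n))

# One uniroot call per loan; this loop is what gets slow at 200k rows.
rates <- mapply(loan_rate, amt, pmt, n)
```

Each element of `rates` should come back close to 0.01, which is a quick sanity check on the equation before scaling up.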

The problem is that I'm trying to do some optimization, and this is a step of the optimization, so I'm trying to speed it up even more.

I've tried vectorization, but because uniroot is not vectorized, I can't go further that way. Is there any root finding method that is vectorized?

Thanks

Recommended answer

Instead of using a root finder, you could use a linear interpolator. You will have to create one interpolator for each value of n (the number of remaining payments). Each interpolator will map (1-1/(1+r)^n)/r to r. Of course you will have to build a grid fine enough so it will return r to an acceptable precision level. The nice thing with this approach is that linear interpolators are fast and vectorized: you can find the rates for all loans with the same number of remaining payments (n) in a single call to the corresponding interpolator.

Now some code that proves it is a viable solution:

First, we create interpolators, one for each possible value of n:

n.max <- 360L  # 30 years

one.interpolator <- function(n) {
    r <- seq(from = 0.0001, to = 0.1500, by = 0.0001)
    y <- (1-1/(1+r)^n)/r
    approxfun(y, r)
}

interpolators <- lapply(seq_len(n.max), one.interpolator)

Note that I used a precision of 1/100 of a percent (1bp).
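One way to check whether that grid is fine enough is to feed the interpolator rates that fall halfway between grid points, where linear interpolation is at its weakest, and look at the worst error. This is only a sketch: the choice of `n = 120` and the names `r.grid`, `r.true`, `r.hat`, and `max.err` are assumptions made for the illustration.

```r
# Same grid as the interpolators: 1bp steps from 0.01% to 15% per period.
n <- 120L
r.grid <- seq(from = 0.0001, to = 0.1500, by = 0.0001)
f <- approxfun((1 - 1 / (1 + r.grid)^n) / r.grid, r.grid)

# True rates placed halfway between consecutive grid points.
r.true <- r.grid[-length(r.grid)] + 0.00005

# Push each true rate through the annuity factor, then invert by interpolation.
r.hat <- f((1 - 1 / (1 + r.true)^n) / r.true)

# Worst-case absolute error over the whole grid.
max.err <- max(abs(r.hat - r.true))
```

Because the annuity factor is monotonic in r, each interpolated rate stays between its two neighbouring grid points, so `max.err` cannot exceed half a grid step. Repeating the check for other values of n, or with a finer grid, shows how the spacing trades memory for precision.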

Then we create some fake data:

n.loans <- 200000L
n     <- sample(n.max, n.loans, replace = TRUE)
amt   <- 1000 * sample(100:500, n.loans, replace = TRUE)
pmt   <- amt / (n * (1 - runif(n.loans)))
loans <- data.frame(n, amt, pmt)

Finally, we solve for r:

library(plyr)
system.time(ddply(loans, "n", transform, r = interpolators[[n[1]]](amt / pmt)))
#    user  system elapsed 
#   2.684   0.423   3.084
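If plyr is not available, the same grouping can be sketched in base R. The helper name `rates_by_group`, the smaller `n.max` of 60, and the four example loans are assumptions made for this illustration; the interpolator itself is the one defined earlier in the answer.

```r
one.interpolator <- function(n) {
  r <- seq(from = 0.0001, to = 0.1500, by = 0.0001)
  approxfun((1 - 1 / (1 + r)^n) / r, r)
}
interpolators <- lapply(seq_len(60L), one.interpolator)

# Hypothetical helper: one vectorized interpolator call per distinct n.
rates_by_group <- function(loans, interpolators) {
  r <- numeric(nrow(loans))
  idx <- split(seq_len(nrow(loans)), loans$n)   # row indices grouped by n
  for (k in names(idx)) {
    rows <- idx[[k]]
    r[rows] <- interpolators[[as.integer(k)]](loans$amt[rows] / loans$pmt[rows])
  }
  r
}

# Made-up loans priced at a known rate of 1% per period.
loans <- data.frame(n = c(12L, 60L, 12L, 36L), amt = c(1000, 20000, 3000, 5000))
loans$pmt <- loans$amt * 0.01 / (1 - 1.01^(-loans$n))

loans$r <- rates_by_group(loans, interpolators)
```

Unlike the ddply call, this version keeps the rows in their original order, which can be convenient when the rates feed back into an optimization step.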

It's fast. Note that some of the output rates are NA but it is because my random inputs made no sense and would have returned rates outside of the [0 ~ 15%] grid I selected. Your real data won't have that problem.
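As a different technique from the interpolation above, and not part of the original answer, the question's request for a vectorized root finder could also be met by running bisection on all loans at once. The sketch below assumes every rate lies between `lo` and `hi`; the function name `rate_bisect`, the bounds, and the iteration count are all hypothetical choices.

```r
# Hypothetical helper: bisection applied to every loan simultaneously.
# Loans whose true rate is outside [lo, hi] simply converge to a bound.
rate_bisect <- function(amt, pmt, n, lo = 1e-6, hi = 1, iters = 40L) {
  target <- amt / pmt
  lo <- rep(lo, length(target))
  hi <- rep(hi, length(target))
  for (i in seq_len(iters)) {
    mid <- (lo + hi) / 2
    # The annuity factor falls as r rises, so a factor above the target
    # means the midpoint rate is still too low.
    too.low <- (1 - 1 / (1 + mid)^n) / mid > target
    lo <- ifelse(too.low, mid, lo)
    hi <- ifelse(too.low, hi, mid)
  }
  (lo + hi) / 2
}

# Made-up loans priced at a known rate of 1% per period.
n   <- c(12, 36, 60, 360)
amt <- c(1000, 5000, 20000, 300000)
pmt <- amt * 0.01 / (1 - 1.01^(-n))

rates <- rate_bisect(amt, pmt, n)
```

Each pass halves the bracket for every loan, so 40 passes shrink an interval of width 1 to below 1e-12. This needs no precomputed grid, at the cost of evaluating the annuity factor 40 times per loan.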
