Accessing an R user-defined function in Python
Question
I need to do principal component regression (PCR) with cross-validation, and I could not find a Python package that does this. I wrote my own PCR class, but when tested against R's pls package it performs significantly worse and is much slower on high-dimensional data (~50000 features). I am still not sure why, but that is another question. Because all of my other code is in Python, and in the interest of saving time, I decided the best way might be to write an R function that uses the pls package. Here is the function:
R_pls <-function(X_train,y_train,X_test){
library(pls)
X<-as.matrix(X_train)
y<-as.matrix(y_train)
tdata<-data.frame(y,X=I(X))
REGmodel <- pcr(y~X,scale=FALSE,data=tdata,validation="CV")
B<-RMSEP(REGmodel)
C<-B[[1]]
q<-length(C)
degs<-c(1:q)
allvals<-C[degs%%2==0]
allvals<-allvals[-1]
comps<-which.min(allvals)
xt<-as.matrix(X_test)
ndata<-data.frame(X=I(xt))
ypred_test<-as.data.frame(predict(REGmodel,ncomp=comps,newdata=ndata,se.fit=TRUE))
ntdata<-data.frame(X=I(X))
ypred_train<-as.data.frame(predict(REGmodel,ncomp=comps,newdata=ntdata,se.fit=TRUE))
data_out=list(ypred_test=ypred_test,ypred_train=ypred_train)
return(data_out)
}
I have found a good amount of information on how to access R's built-in functions from Python, but nothing that covers a user-defined function. So I tried the following:
import rpy2.robjects as ro
prs=ro('R_pls')
where R_pls is the R function above. This produces
TypeError: 'module' object is not callable.
Any idea how I might get this to work? I am also open to suggestions if there is a better method.
Thanks
Recommended answer
Consider importing the arbitrary R user-defined function as a package with rpy2's SignatureTranslatedAnonymousPackage (STAP):
from rpy2.robjects.numpy2ri import numpy2ri
from rpy2.robjects.packages import STAP
# for rpy2 < 2.6.1
# from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage as STAP
r_fct_string = """
R_pls <- function(X_train, y_train, X_test){
library(pls)
X <- as.matrix(X_train)
y <- as.matrix(y_train)
xt <- as.matrix(X_test)
tdata <- data.frame(y,X=I(X))
REGmodel <- pls::pcr(y~X,scale=FALSE,data=tdata,validation="CV")
B <- RMSEP(REGmodel)
C <- B[[1]]
q <- length(C)
degs <- c(1:q)
allvals <- C[degs%%2==0]
allvals <- allvals[-1]
comps <- which.min(allvals)
ndata <- data.frame(X=I(xt))
ypred_test <- as.data.frame(predict(REGmodel,ncomp=comps,newdata=ndata,se.fit=TRUE))
ntdata <- data.frame(X=I(X))
ypred_train <- as.data.frame(predict(REGmodel,ncomp=comps,newdata=ntdata,se.fit=TRUE))
data_out <- list(ypred_test=ypred_test, ypred_train=ypred_train)
return(data_out)
}
"""
r_pkg = STAP(r_fct_string, "r_pkg")
# CONVERT PYTHON NUMPY MATRICES TO R OBJECTS
# map over a single list so that each array is converted on its own
r_X_train, r_y_train, r_X_test = map(numpy2ri, [py_X_train, py_y_train, py_X_test])
# PASS R OBJECTS INTO FUNCTION (WILL NEED TO EXTRACT DFs FROM RESULT)
p_res = r_pkg.R_pls(r_X_train, r_y_train, r_X_test)
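The result of the call is an R named list, so the two prediction data frames still have to be pulled out of it. A minimal sketch of one way to do that is shown here, assuming the returned object exposes a `names` attribute and iterates over its elements, as rpy2's ListVector does. `r_list_to_dict` and `FakeRList` are hypothetical names introduced for illustration; the stand-in class exists only so the sketch runs without R installed.

```python
def r_list_to_dict(r_list):
    """Turn a named R list (e.g. an rpy2 ListVector) into a plain dict."""
    # Pair each element name with the element at the same position.
    return dict(zip(r_list.names, r_list))


class FakeRList:
    """Hypothetical stand-in for an rpy2 ListVector (assumption, not rpy2 API)."""

    def __init__(self, **items):
        self.names = list(items)
        self._values = list(items.values())

    def __iter__(self):
        return iter(self._values)


# Stand-in for the value returned by r_pkg.R_pls(...)
p_res_demo = FakeRList(ypred_test=[1.0, 2.0], ypred_train=[0.5, 1.5, 2.5])
preds = r_list_to_dict(p_res_demo)
print(sorted(preds))
```

With a real rpy2 result, a single element can also be fetched by name with `p_res.rx2('ypred_test')`.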
Alternatively, if the function is saved in a separate .R script, you can source it, as @agstudy shows, and then call it like any Python function.
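import rpy2.robjects as ro
from rpy2.robjects.numpy2ri import numpy2ri
# Source the script so that R_pls is defined in R's global environment
ro.r('''source('my_R_pls_func.r')''')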
r_pls = ro.globalenv['R_pls']
# CONVERT PYTHON NUMPY MATRICES TO R OBJECTS
r_X_train, r_y_train, r_X_test = map(numpy2ri, [py_X_train, py_y_train, py_X_test])
# PASS R OBJECTS INTO FUNCTION (WILL NEED TO EXTRACT DFs FROM RESULT)
p_res = r_pls(r_X_train, r_y_train, r_X_test)