Cor function in R producing errors
Problem description
I've been trying to write a function that takes a directory of data files and a threshold for complete cases, and calculates the correlation between sulfate and nitrate for monitor locations where the number of completely observed cases (on all variables) is greater than the threshold. The function should return a vector of correlations for the monitors that meet the threshold requirement. If no monitors meet the threshold requirement, the function should return a numeric vector of length 0. The code produces multiple errors, so I am not listing them all here.
The data files for the code are here: https://d396qusza40orc.cloudfront.net/rprog%2Fdata%2Fspecdata.zip
Code
corr<-function(directory, threshold=0){
files.list=list.files(directory, full.names=TRUE, pattern=".csv")
comp.sum<-numeric()
num<-numeric()
for(i in 1:332){
data<-read.csv(files.list[i])
data.cor<-na.omit(data[,2:3])
comp.sum<-sum(data.cor)
if
{
comp.sum>threshold
cor.var<-cor(data.cor, use="all.obs")
}
else
{
num
}
}
cor.var
}
Recommended answer
I modified the function a bit to do what you want. This assumes that sulfate and nitrate are always in columns 2 and 3, and that the directory contains no other CSV files (if another file had numbers in those columns, a correlation coefficient would be calculated for unrelated data).
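corr <- function(directory, threshold = 0) {
  # Full paths of the CSV files in the directory
  files.list <- list.files(directory, full.names = TRUE, pattern = ".csv")
  # One slot per file, initialised to 0
  cors <- rep(0, length(files.list))
  # seq_along() also copes with an empty directory, unlike 1:length()
  for (i in seq_along(files.list)) {
    data <- read.csv(files.list[i], header = TRUE)
    # Keep only rows where both columns 2 and 3 (sulfate, nitrate) are present
    data.cor <- na.omit(data[, 2:3])
    nobs <- nrow(data.cor)
    if (nobs > threshold) {
      cors[i] <- cor(data.cor[, 1], data.cor[, 2])
    } else {
      # Below the threshold: leave a 0 for this monitor
      cors[i] <- 0
    }
  }
  return(cors)
}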
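The question asks for a vector holding only the correlations of monitors that meet the threshold, and a numeric vector of length 0 when none do, whereas the function above stores a 0 for every monitor below the threshold. A minimal sketch of one way to get that behaviour is shown here. The name corr_filtered is hypothetical, and the sketch assumes each file has columns named sulfate and nitrate, so it selects them by name rather than by position.

```r
# Hypothetical variant of corr(): returns correlations only for monitors
# whose number of complete rows exceeds the threshold.
# Assumption: every CSV has columns named "sulfate" and "nitrate".
corr_filtered <- function(directory, threshold = 0) {
  # "\\.csv$" matches only names ending in .csv
  # (in a bare ".csv" pattern the dot matches any character)
  files.list <- list.files(directory, full.names = TRUE, pattern = "\\.csv$")
  cors <- numeric(0)  # stays length 0 if no monitor qualifies
  for (f in files.list) {
    data <- read.csv(f, header = TRUE)
    # Drop rows where either pollutant is missing
    complete <- na.omit(data[, c("sulfate", "nitrate")])
    # Compare the number of complete rows, not the sum of the values
    if (nrow(complete) > threshold) {
      cors <- c(cors, cor(complete$sulfate, complete$nitrate))
    }
  }
  cors
}
```

Assuming the archive is extracted to a directory called specdata, a call such as corr_filtered("specdata", 150) would give one correlation per monitor with more than 150 complete rows. Growing the vector with c() inside the loop is fine for a few hundred files; for much larger directories, preallocating and filtering afterwards may be preferable.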