Calculating Euclidean Distance for Large Data Sets
Problem description
I have to calculate the Euclidean distance between train and test data. The total length of the train data is 1389 and of the test data is 364. It is the data from the handwritten ZIP codes on envelopes from U.S. postal mail, downloaded from the website of "The Elements of Statistical Learning".
I am a beginner and have only just read the data into R. I don't know how to start calculating the distances between the train and test data. Can anyone give me an idea of how to write a loop for this data?
I would appreciate it.
Recommended answer
For Euclidean distances, I like using rdist from the fields package. One advantage over dist from the stats package is that it can take two matrices as input:
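# Simulated stand-ins for the real data: 1389 train rows and 364 test rows, 2 columns each
train.data <- matrix(runif(1389 * 2), ncol = 2)
test.data <- matrix(runif(364 * 2), ncol = 2)
library(fields)
# Entry [i, j] is the distance between train row i and test row j
distances <- rdist(train.data, test.data)
dim(distances)
# [1] 1389  364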
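Since the question asks about writing a loop, here is a minimal base R sketch of the same computation without the fields package. The function names euclid_loop and euclid_cross are hypothetical helpers made up for illustration, and the random matrices are stand-ins for the real ZIP code data. The first version loops over the test rows; the second uses the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b to avoid the loop.

```r
# Hypothetical helper: loop over test rows, one column of distances per row.
euclid_loop <- function(train, test) {
  stopifnot(ncol(train) == ncol(test))
  out <- matrix(NA_real_, nrow = nrow(train), ncol = nrow(test))
  for (j in seq_len(nrow(test))) {
    # Subtract test row j from every train row (sweep defaults to "-")
    diffs <- sweep(train, 2, test[j, ])
    # Root of the summed squared differences, one value per train row
    out[, j] <- sqrt(rowSums(diffs^2))
  }
  out
}

# Hypothetical helper: same result without an explicit loop.
euclid_cross <- function(train, test) {
  stopifnot(ncol(train) == ncol(test))
  # ||a_i||^2 + ||b_j||^2 for every pair (i, j), minus 2 * a_i . b_j
  d2 <- outer(rowSums(train^2), rowSums(test^2), "+") - 2 * tcrossprod(train, test)
  # Rounding can leave tiny negative values; clamp them before the root
  sqrt(pmax(d2, 0))
}

# Simulated data with the sizes from the question (assumed to be row counts)
set.seed(1)
train.data <- matrix(runif(1389 * 2), ncol = 2)
test.data <- matrix(runif(364 * 2), ncol = 2)

d_loop <- euclid_loop(train.data, test.data)
d_cross <- euclid_cross(train.data, test.data)

# Both versions should have one row per train point, one column per test point
stopifnot(identical(dim(d_loop), c(1389L, 364L)))
stopifnot(isTRUE(all.equal(d_loop, d_cross)))
```

The loop version is easier to follow for a beginner, while the matrix version tends to be faster when the data has many columns, as the real ZIP code data would. Either function should accept matrices with any number of columns, as long as train and test have the same number.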