Calculating Euclidean Distance for Large Datasets

Question

I have to calculate the Euclidean distance between training and test data. The training data has 1389 rows and the test data has 364. The data are the handwritten ZIP codes on envelopes from U.S. postal mail, downloaded from the website of "The Elements of Statistical Learning".

I am a beginner and have only read the data into R so far. I don't know how to start calculating the distances between the training and test data. Can anyone give me an idea of how to write a loop for this data?

I would appreciate any help.

Recommended answer

For Euclidean distances, I like using rdist from the fields package. One advantage over dist from the stats package is that it can take two matrices as input:

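# Simulated stand-ins for the real data: one row per observation
train.data <- matrix(runif(1389*2), ncol = 2)
test.data  <- matrix(runif(364*2),  ncol = 2)

library(fields)
# Entry [i, j] is the distance between train row i and test row j
distances <- rdist(train.data, test.data)
dim(distances)
# [1] 1389  364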
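
If installing fields is not an option, the same train-by-test matrix can be built in base R. The following is only a minimal sketch: `cross_dist` is a hypothetical helper name, not a function from any package, and it uses the same simulated matrices as above rather than the real ZIP code data. It relies on the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y, so no explicit loop over rows is needed.

```r
# Hypothetical helper (not part of any package): pairwise Euclidean distances
# between the rows of two numeric matrices with the same number of columns.
cross_dist <- function(a, b) {
  stopifnot(is.matrix(a), is.matrix(b), ncol(a) == ncol(b))
  # Squared distances via ||x - y||^2 = ||x||^2 + ||y||^2 - 2 * x.y
  sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  # Rounding error can leave tiny negative values; clamp before the square root
  sqrt(pmax(sq, 0))
}

set.seed(1)  # assumption: seed chosen only to make the sketch reproducible
train.data <- matrix(runif(1389 * 2), ncol = 2)
test.data  <- matrix(runif(364 * 2),  ncol = 2)

distances <- cross_dist(train.data, test.data)
dim(distances)

# Spot check of one entry against the direct definition
direct <- sqrt(sum((train.data[3, ] - test.data[5, ])^2))
all.equal(distances[3, 5], direct)
```

With the real data, any non-feature column (for example a digit label, if one is present) would presumably need to be dropped before computing distances, and both matrices must have the same number of columns.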
