Best programming language to implement the DBSCAN algorithm querying a MongoDB database?
Problem description
I have to implement the DBSCAN algorithm, starting from this pseudocode:
DBSCAN(D, eps, MinPts)
    C = 0
    for each unvisited point P in dataset D
        mark P as visited
        NeighborPts = regionQuery(P, eps)
        if sizeof(NeighborPts) < MinPts
            mark P as NOISE
        else
            C = next cluster
            expandCluster(P, NeighborPts, C, eps, MinPts)

expandCluster(P, NeighborPts, C, eps, MinPts)
    add P to cluster C
    for each point P' in NeighborPts
        if P' is not visited
            mark P' as visited
            NeighborPts' = regionQuery(P', eps)
            if sizeof(NeighborPts') >= MinPts
                NeighborPts = NeighborPts joined with NeighborPts'
        if P' is not yet member of any cluster
            add P' to cluster C

regionQuery(P, eps)
    return all points within P's eps-neighborhood
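For illustration, the pseudocode can be sketched in Python roughly as follows. This is a minimal sketch, not a tuned implementation: the names dbscan, make_in_memory_region_query and NOISE are hypothetical, points are assumed to be hashable coordinate tuples, and the region query is passed in as a callable so that an in-memory stand-in can be swapped for a database-backed one.

```python
from math import dist  # Python 3.8+

NOISE = -1  # label used for noise points (an assumption of this sketch)


def dbscan(points, eps, min_pts, region_query):
    """Return {point: cluster_label}; NOISE marks points left as noise."""
    labels = {}
    visited = set()
    cluster = 0
    for p in points:
        if p in visited:
            continue
        visited.add(p)
        neighbors = list(region_query(p, eps))
        if len(neighbors) < min_pts:
            labels[p] = NOISE
            continue
        # p is a core point: open a new cluster and expand it.
        cluster += 1
        labels[p] = cluster
        queue = list(neighbors)
        queued = set(queue)  # avoids appending the same point twice
        i = 0
        while i < len(queue):
            q = queue[i]
            i += 1
            if q not in visited:
                visited.add(q)
                q_neighbors = list(region_query(q, eps))
                if len(q_neighbors) >= min_pts:
                    # "NeighborPts joined with NeighborPts'"
                    for r in q_neighbors:
                        if r not in queued:
                            queued.add(r)
                            queue.append(r)
            # Points previously marked as noise may become border points.
            if labels.get(q, NOISE) == NOISE:
                labels[q] = cluster
    return labels


def make_in_memory_region_query(points):
    """Brute-force stand-in for regionQuery, using Euclidean distance."""
    def region_query(p, eps):
        return [q for q in points if dist(p, q) <= eps]
    return region_query
```

With a MongoDB-backed regionQuery, almost all of the running time is likely to be spent in the calls to region_query rather than in this loop, so the cost of each query probably matters more than the language the loop is written in.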
My code has to run on an Amazon EC2 instance with 64-bit Ubuntu Linux.
The function regionQuery queries a MongoDB database to obtain all points within P's eps-neighborhood.
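One way such a regionQuery might look is sketched below. The question does not say how the points are stored, so this assumes each document holds a 2D point as a legacy coordinate pair [x, y] in a field named loc (both the field name and the 2D planar layout are assumptions), and it uses MongoDB's $geoWithin operator with $center to select points inside a circle of radius eps. The helper names are hypothetical.

```python
def eps_neighborhood_filter(point, eps, field="loc"):
    """Build a MongoDB filter for documents within a planar circle of
    radius eps around point. Assumes [x, y] legacy coordinate pairs."""
    x, y = point
    return {field: {"$geoWithin": {"$center": [[x, y], eps]}}}


def region_query(collection, point, eps, field="loc"):
    """Return the eps-neighborhood of point as a list of (x, y) tuples.

    collection is expected to behave like a pymongo Collection, i.e. to
    offer find(filter) returning an iterable of documents.
    """
    cursor = collection.find(eps_neighborhood_filter(point, eps, field))
    return [tuple(doc[field]) for doc in cursor]


# Usage with pymongo (connection string, database and collection names
# are placeholders):
#   from pymongo import MongoClient
#   coll = MongoClient("mongodb://localhost:27017")["mydb"]["points"]
#   neighbors = region_query(coll, (0.0, 0.0), 1.5)
```

A 2d index on the coordinate field is optional for this kind of query but should speed it up considerably on a large collection.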
So, in your opinion, which programming language would give the best performance for this implementation: C, PHP, or Java (I doubt the latter)?
Recommended answer
I assume that you have a lot of points and need results fast; otherwise you can use almost anything.
This looks like a map-reduce job to me.
The map part would be the "for each unvisited point" loop, and it should emit a data structure containing the neighbors, the candidate clusters, and whatever else is needed. If a point is classified as noise, it should emit nothing.
Cluster expansion would go into the reduce part, and possibly the finalize part. The language of choice would then be JavaScript, and everything would happen inside MongoDB.
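The split described in this answer can be illustrated in plain Python. This is only a sketch of the idea, not MongoDB's mapReduce API and not the answer author's code: map_point and reduce_clusters are hypothetical names, the neighborhood search is brute force over an in-memory list, and cluster expansion is replaced by a union-find merge of core points whose neighborhoods contain each other. One difference from the sequential pseudocode is that a border point reachable from two clusters ends up in both.

```python
from math import dist  # Python 3.8+


def map_point(p, points, eps, min_pts):
    """Map step: emit (point, neighbors) for core points only.
    Points with too few neighbors emit nothing."""
    neighbors = [q for q in points if dist(p, q) <= eps]
    if len(neighbors) >= min_pts:
        yield p, neighbors


def reduce_clusters(emitted):
    """Reduce step: merge core points that lie in each other's
    neighborhoods, then attach every emitted neighbor to its cluster."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    core = {p for p, _ in emitted}
    for p, neighbors in emitted:
        for q in neighbors:
            if q in core:
                union(p, q)

    clusters = {}
    for p, neighbors in emitted:
        members = clusters.setdefault(find(p), set())
        members.add(p)
        members.update(neighbors)  # includes border points
    return list(clusters.values())


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    emitted = [pair for p in pts for pair in map_point(p, pts, 1.5, 3)]
    for members in reduce_clusters(emitted):
        print(sorted(members))
```

Points that never appear in any emitted record are the noise points, which matches the "emit nothing" rule for the map step.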