Iterate through rows of numpy array to find mode
Question
I'm trying to create a decision tree classifier function that builds an ensemble of decision trees and makes the final prediction by majority vote across all the trees. My approach is to build a matrix that holds each decision tree's predictions in a separate column, and then, for every row (one row per data point), find the modal value to use as the final prediction for that data point.
So far my function is:
import numpy as np
from sklearn import tree

def majority_classify(x_train, y_train, x_test, y_test, num_samples):
    n = x_train.shape[0]
    c = len(np.unique(y_train))
    votes = np.zeros((n, c))
    # One column of predictions per tree
    predictions_train = np.empty((n, num_samples))
    predictions_test = np.empty((x_test.shape[0], num_samples))
    for i in range(num_samples):
        # Randomly sample 'n' points (with replacement) from the train set
        indices = np.random.choice(np.arange(0, n), size=n)
        x_train_sample = x_train[indices, :]
        y_train_sample = y_train[indices]
        dt_major = tree.DecisionTreeClassifier(max_depth=2)
        # Fit on the bootstrap sample rather than the full train set
        model_major = dt_major.fit(x_train_sample, y_train_sample)
        predictions_train[:, i] = model_major.predict(x_train)
    for r in predictions_train:
        predict_train = mode(r)[0][0]
However, I'm having trouble figuring out how to iterate through each row and find the mode. Any suggestions?
Thanks!
Recommended answer
- Use np.unique with the return_counts parameter.
- Use argmax on the counts array to pick the corresponding value from the unique array.
- Use np.apply_along_axis to apply a custom mode function along the chosen axis.
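import numpy as np

def mode(a):
    # Count each distinct value, then return the most frequent one
    u, c = np.unique(a, return_counts=True)
    return u[c.argmax()]

a = np.array([
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [2, 5, 6],
    [4, 1, 7],
    [5, 4, 8],
    [6, 6, 3]
])

# Axis 0 takes the mode of each column; pass 1 instead to take the mode of each row
np.apply_along_axis(mode, 0, a)

array([2, 4, 3])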
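Since the question asks for one vote per data point (one row per point, one column per tree), the same idea can be applied along axis 1. The following is only a minimal sketch: the `predictions` matrix is made-up illustration data rather than real tree output, and `row_mode` is a hypothetical name for the same helper defined above.

```python
import numpy as np

def row_mode(row):
    """Return the most frequent value in a 1-D array.

    np.unique returns sorted values, and argmax returns the first maximum,
    so the smallest value wins when there is a tie.
    """
    values, counts = np.unique(row, return_counts=True)
    return values[counts.argmax()]

# Hypothetical prediction matrix: 4 data points (rows) x 3 trees (columns).
predictions = np.array([
    [0, 1, 1],
    [2, 2, 0],
    [1, 1, 1],
    [0, 2, 0],
])

# axis=1 applies row_mode to each row, giving one final label per data point.
majority = np.apply_along_axis(row_mode, 1, predictions)
print(majority)  # [1 2 1 0]
```

Note that ties are broken in favour of the smallest label, which may or may not be what you want for a classifier; using an odd number of trees avoids ties in the two-class case.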