Iterate through rows of numpy array to find mode


Question

I'm trying to create a decision tree classifier function that builds an ensemble of decision trees and makes the final prediction by majority vote across all the trees. My approach is to build a matrix that holds each decision tree's predictions in a separate column, and then, for every row (corresponding to one data point), find the modal value to make the final prediction for that data point.

So far my function is:

import numpy as np
from sklearn import tree

def majority_classify(x_train, y_train, x_test, y_test, num_samples):
    n = x_train.shape[0]
    c = len(np.unique(y_train))

    votes = np.zeros((n, c))
    # One column of predictions per tree
    predictions_train = np.empty((n, num_samples))
    predictions_test = np.empty((x_test.shape[0], num_samples))

    for i in range(num_samples):
        # Randomly sample 'n' points from the train set, with replacement
        indices = np.random.choice(np.arange(0, n), size=n)

        x_train_sample = x_train[indices, :]
        y_train_sample = y_train[indices]

        dt_major = tree.DecisionTreeClassifier(max_depth=2)
        # Fit on the bootstrap sample rather than the full training set
        model_major = dt_major.fit(x_train_sample, y_train_sample)

        predictions_train[:, i] = model_major.predict(x_train)

    # 'mode' is presumably scipy.stats.mode; this loop overwrites
    # predict_train on every row instead of collecting the results
    for r in predictions_train:
        predict_train = mode(r)[0][0]

However, what I'm having trouble with is figuring out how to iterate through each row and find the mode. Any suggestions?

Thanks!

Recommended answer

  • Use np.unique with the return_counts parameter.
  • Use argmax on the counts array to pick the corresponding value from the unique array.
  • Use np.apply_along_axis to apply a custom mode function along an axis.
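    import numpy as np

    def mode(a):
        # Count each distinct value, then return the most frequent one
        u, c = np.unique(a, return_counts=True)
        return u[c.argmax()]

    a = np.array([
        [1, 2, 3],
        [2, 3, 4],
        [3, 4, 5],
        [2, 5, 6],
        [4, 1, 7],
        [5, 4, 8],
        [6, 6, 3]
    ])

    # axis=0 computes the mode of each column
    np.apply_along_axis(mode, 0, a)

    array([2, 4, 3])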
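
The example above takes the mode along axis 0, so it returns one value per column. Since the question asks for one vote per data point (one per row), the same helper can be applied along axis 1 instead. Below is a minimal sketch, assuming a small made-up vote matrix named predictions (hypothetical data, not taken from the question) with one row per data point and one column per tree:

```python
import numpy as np

def mode(a):
    # Most frequent value in a 1-D array. np.unique returns sorted values,
    # so a tie is resolved in favor of the smallest value.
    u, c = np.unique(a, return_counts=True)
    return u[c.argmax()]

# Hypothetical vote matrix: 4 data points (rows) x 5 trees (columns)
predictions = np.array([
    [0, 1, 1, 0, 1],
    [2, 2, 0, 2, 1],
    [1, 1, 1, 1, 1],
    [0, 2, 2, 0, 1],
])

# axis=1 passes each row to mode(), giving one majority vote per data point
final_predictions = np.apply_along_axis(mode, 1, predictions)
print(final_predictions)  # [1 2 1 0]
```

Note the last row: classes 0 and 2 both receive two votes, and this helper returns 0 because the tie goes to the smallest label. If ties matter for the ensemble, using an odd number of trees or a different tie-breaking rule may be worth considering.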
      
