How to use H2O on a feature-hashed matrix in R

Views: 85 · Published: 2020/11/22 1:08:17 · Tags: r, h2o

Problem description

I am working on a moderately sized data set (train_data). It has more than 124 variables and 50,00,000 (5 million) observations. For the categorical variables, I applied feature hashing using the hashed.model.matrix function in R.

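## feature hashing
library(FeatureHashing)  # provides hashed.model.matrix

b <- 2 ^ 22      # hash size (number of columns in the hashed matrix)
f <- ~ . - 1     # use all variables, without an intercept
X_train <- hashed.model.matrix(f, train_data, hash.size = b)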

As a result, I get a large dgCMatrix (a sparse matrix), X_train, as output. How can I use the H2O wrapper on this matrix and run the different algorithms available in H2O? Does the H2O wrapper accept a sparse matrix (dgCMatrix)? Any link to or example of such usage would be helpful. Thanks in advance.

I am hoping for steps along the following lines once X_train has been imported into the H2O environment:

library(h2o)

# initialize connection to H2O server, using all available cores
h2o.init(nthreads = -1)
train.hex <- h2o.uploadFile('./X_train', destination_frame = 'train')

# list of features for training (every column except the response)
feature.names <- setdiff(names(train.hex), 'outcome')

# train random forest model, use ntrees = 500
drf <- h2o.randomForest(x = feature.names, y = 'outcome', training_frame = train.hex, ntrees = 500)

Recommended answer

You could save your sparse matrix in SVMLight sparse format and then upload it with:

train.hex <- h2o.uploadFile('./X_train', parse_type = "SVMLight", destination_frame='train')
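The answer does not show how to write the file itself. Below is a minimal sketch of one way to do it using only the Matrix package. The helper write_svmlight, the label vector y and the demo file path are hypothetical names introduced for illustration; they are not part of h2o or Matrix. It builds every line in memory, so for a matrix with millions of rows you would likely want to write in chunks or use a dedicated writer.

```r
library(Matrix)

# Hypothetical helper (not part of h2o or Matrix): writes a sparse matrix X and
# a label vector y to `path` in SVMLight format, one row per line:
#   label index:value index:value ...
write_svmlight <- function(X, y, path) {
  stopifnot(nrow(X) == length(y))
  trip <- summary(X)                      # triplets (i, j, x) of the stored entries
  trip <- trip[order(trip$i, trip$j), ]   # increasing feature index within each row
  pairs <- paste0(trip$j, ":", trip$x)    # 1-based feature indices

  # Collapse the index:value pairs of each row into one string; rows without
  # any stored entry keep an empty feature list.
  feats <- rep("", nrow(X))
  per_row <- tapply(pairs, trip$i, paste, collapse = " ")
  feats[as.integer(names(per_row))] <- as.character(per_row)

  writeLines(trimws(paste(y, feats)), path)
}

# Tiny demo: 3 rows, 3 hashed features, the second row has no non-zero entry.
X_demo <- sparseMatrix(i = c(1, 1, 3), j = c(1, 3, 2), x = c(1, 2.5, 4),
                       dims = c(3, 3))
y_demo <- c(1, 0, 1)
out_path <- tempfile(fileext = ".svmlight")
write_svmlight(X_demo, y_demo, out_path)
```

With the real data, the call would be something like write_svmlight(X_train, train_data$outcome, './X_train'), assuming the response is stored in a column named outcome as in the question. After the upload, inspect names(train.hex) to see how H2O named the label and feature columns before passing them to a model.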

The SVMLight sparse format is also detected automatically by h2o.importFile(). Unlike h2o.uploadFile(), which pushes the file from the client, h2o.importFile() is a parallelized reader: the server pulls the data from a location specified by the client, so the path must be readable from the H2O server.

train.hex <- h2o.importFile('./X_train', destination_frame='train')
