Empty list returned when accessing AutoML leader via @leader@model


Question

Running h2o.automl() returns a single model in the leaderboard; however, trying to access the actual model via @leader@model produces the following error:

Error in is.H2OFrame(x) : trying to get slot "metrics" from an object of a basic class ("NULL") with no slots

Likewise, calling h2o.predict() on the leader model fails with this error message:

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, : ERROR MESSAGE: Object 'dummy' not found in function: predict for argument: model

The model was run in the same session, using h2o v3.20.0.2 in R.

Recommended answer

I think what's happening is that you're not able to finish training even a single model in one hour, so when you try to collect the leader model, AutoML grabs an incomplete model and you get an error. You don't have very many rows, but you have a very large number of columns.

Since it's hard to predict how long model training will take, I'd use the max_models argument instead of limiting by time. Because AutoML stops as soon as it reaches either max_models or max_runtime_secs, whichever comes first, I'd set max_runtime_secs to a very large number (e.g. 999999999) and then set max_models = 10 or whatever number you like.
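As a rough sketch of that suggestion, the call might look like the following. The file path and the response column name are hypothetical placeholders, not taken from the question; only max_models and max_runtime_secs come from the advice above.

```r
library(h2o)
h2o.init()

# Hypothetical inputs: replace the path and the response name with your own.
train <- h2o.importFile("path/to/train.csv")
y <- "response"
x <- setdiff(names(train), y)

# Bound the run by model count; the time limit is set high enough
# that it should never be the criterion that stops AutoML.
aml <- h2o.automl(x = x, y = y, training_frame = train,
                  max_models = 10,
                  max_runtime_secs = 999999999,
                  seed = 1)

# Inspect the leaderboard before touching the leader model.
print(aml@leaderboard)
leader <- aml@leader
```

Printing the leaderboard first makes it easier to confirm that at least one model finished training before calling h2o.predict() on aml@leader.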

Second, since you have very wide data, I'd recommend turning off the Random Forest and GBM models and keeping the GLM and Deep Learning models. To do that, set exclude_algos = c("DRF", "GBM"). Training tree-based models on 120k columns will take a really long time.
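A minimal sketch of an AutoML run with the tree-based algorithms excluded could look like this. As before, the file path and response column name are hypothetical placeholders.

```r
library(h2o)
h2o.init()

# Hypothetical inputs: replace the path and the response name with your own.
train <- h2o.importFile("path/to/train.csv")
y <- "response"
x <- setdiff(names(train), y)

# Skip Random Forest (DRF) and GBM, which are slow on very wide data.
aml <- h2o.automl(x = x, y = y, training_frame = train,
                  max_models = 10,
                  max_runtime_secs = 999999999,
                  exclude_algos = c("DRF", "GBM"),
                  seed = 1)

print(aml@leaderboard)
```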

Another good option to consider is to first apply PCA or GLRM to your data to reduce the dimensionality to <500 columns; you can then include the tree-based models in the AutoML run.
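One way the PCA route might be sketched is shown below. The choice of k = 100, the standardizing transform, and pca_method = "Randomized" are assumptions made for illustration (the answer only says to get below 500 columns), and the file path and response name are hypothetical. Note that k cannot exceed the number of rows or columns in the data.

```r
library(h2o)
h2o.init()

# Hypothetical inputs: replace the path and the response name with your own.
train <- h2o.importFile("path/to/train.csv")
y <- "response"
x <- setdiff(names(train), y)

# Project the predictors onto k principal components.
# k, transform and pca_method are illustrative choices, not from the answer.
pca <- h2o.prcomp(training_frame = train, x = x, k = 100,
                  transform = "STANDARDIZE",
                  pca_method = "Randomized",
                  seed = 1)

# h2o.predict() on a PCA model returns the component scores (PC1, PC2, ...).
pcs <- h2o.predict(pca, train)
reduced <- h2o.cbind(pcs, train[, y])

# With far fewer columns, the tree-based models can stay enabled.
aml <- h2o.automl(x = names(pcs), y = y, training_frame = reduced,
                  max_models = 10,
                  max_runtime_secs = 999999999,
                  seed = 1)

print(aml@leaderboard)
```

Any new data has to go through the same PCA model (h2o.predict(pca, newdata)) before it is passed to the AutoML leader for prediction.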
