Printing all output from sklearn GridSearchCV to file?
Question
I am running a long grid search using sklearn and I want to log all (emphasis on all) console output to a file. Running from the terminal using > and changing stdout to an open file (which is the accepted answer here) both work, but only partially. Anything called by print does get saved to the file, but not everything shown on the console is saved. In particular, for this console output:
Fitting 5 folds for each of 128 candidates, totalling 640 fits
[Parallel(n_jobs=4)]: Done 42 tasks | elapsed: 2.7s
[Parallel(n_jobs=4)]: Done 192 tasks | elapsed: 12.3s
[Parallel(n_jobs=4)]: Done 442 tasks | elapsed: 35.1s
[Parallel(n_jobs=4)]: Done 640 out of 640 | elapsed: 55.7s finished
The first line does get saved to the file, but the logging from [Parallel(n_jobs=4)] is not. Instead, the file contains:
Fitting 5 folds for each of 128 candidates, totalling 640 fits
{'estimator__max_depth': 5, 'estimator__min_samples_leaf': 4, 'estimator__min_samples_split': 8}
...
...
The second line is simply me printing the best parameters obtained; everything from [Parallel(n_jobs=4)] is lost. Does anyone know how to get that saved to the file as well?
Recommended answer
From the source of the joblib package, which sklearn uses internally for parallelization:
def _print(self, msg, msg_args):
    """Display the message on stout or stderr depending on verbosity"""
    # XXX: Not using the logger framework: need to
    # learn to use logger better.
    if not self.verbose:
        return
    if self.verbose < 50:
        writer = sys.stderr.write
    else:
        writer = sys.stdout.write
    msg = msg % msg_args
    writer('[%s]: %s\n' % (self, msg))
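To see why redirecting only stdout misses these lines, here is a minimal sketch that mimics the stream choice in the method above. The emit function is a hypothetical stand-in written for illustration, not part of joblib, and the messages are made up; only the verbosity threshold of 50 is taken from the quoted source.

```python
import contextlib
import io
import sys


def emit(verbose, msg):
    """Hypothetical stand-in mirroring the stream choice in _print above."""
    if not verbose:
        return
    # Below 50 the message goes to stderr, otherwise to stdout.
    stream = sys.stderr if verbose < 50 else sys.stdout
    stream.write('[demo]: %s\n' % msg)


out, err = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
    print('Fitting 5 folds ...')      # print() writes to stdout
    emit(1, 'Done 42 tasks')          # verbose=1 is below 50, so stderr
    emit(50, 'Done 640 out of 640')   # verbose=50 goes to stdout

captured_stdout = out.getvalue()
captured_stderr = err.getvalue()
```

In this sketch the verbose=1 message ends up only in captured_stderr, which matches the symptom in the question: a stdout-only redirect keeps the print output and drops the progress lines.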
So with verbose=1, as the OP was using, redirecting stderr ought to capture the missing lines. But redirecting stderr alone will not capture stdout. One can merge the two streams using the Tee class from this answer:
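import sys

# Tee comes from the linked answer and must be defined or imported first.
logfile = open('test.txt', 'w')
original_stderr = sys.stderr
original_stdout = sys.stdout
# Send stdout to both the console and the log file, then point stderr at it too.
sys.stdout = Tee(sys.stdout, logfile)
sys.stderr = sys.stdout
.
.
[code to log]
.
.
# Restore the original streams before closing the log file.
sys.stdout = original_stdout
sys.stderr = original_stderr
logfile.close()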
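The snippet above relies on a Tee class that is not shown here. As a rough, self-contained sketch of the same idea, the following assumes a Tee that simply forwards each write to several file-like objects, and wraps the redirect in a context manager so the streams are restored even if the logged code raises. Both Tee as written here and the log_console_to helper are illustrative assumptions, not the code from the linked answer.

```python
import contextlib
import sys


class Tee:
    """Hypothetical helper: forward every write to several file-like objects."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, data):
        for stream in self.streams:
            stream.write(data)
        return len(data)

    def flush(self):
        for stream in self.streams:
            stream.flush()


@contextlib.contextmanager
def log_console_to(path):
    """Mirror stdout and stderr to the file at `path` while the block runs."""
    original_stdout, original_stderr = sys.stdout, sys.stderr
    with open(path, 'w') as logfile:
        tee = Tee(original_stdout, logfile)
        sys.stdout = sys.stderr = tee
        try:
            yield tee
        finally:
            # Restore the real streams even if the logged code raises.
            sys.stdout, sys.stderr = original_stdout, original_stderr
```

The grid search would then be wrapped as "with log_console_to('test.txt'): search.fit(X, y)", where search, X and y are placeholders for the OP's own objects. This only captures text written through Python's sys.stdout and sys.stderr in the current process; anything written directly to the underlying file descriptors, or by a separate process, would not pass through Tee.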