Tensorflow crash with CUDNN_STATUS_ALLOC_FAILED
Problem description
I've been searching the web for hours with no results, so I figured I'd ask here.
I'm trying to make a self-driving car following Sentdex's tutorial, but when I run the model I get a bunch of fatal errors. I've searched all over the internet for a solution, and many people seem to have the same problem. However, none of the solutions I've found (including this Stack post) work for me.
Here is my software:
- Tensorflow: 1.5, GPU version
- CUDA: 9.0, with patches
- cuDNN: 7
- Windows 10 Pro
- Python 3.6
Hardware:
- Nvidia 1070 Ti, with the latest drivers
- Intel i5 7600K
Here is the crash log:
2018-02-04 16:29:33.606903: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_blas.cc:444] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2018-02-04 16:29:33.608872: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_blas.cc:444] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2018-02-04 16:29:33.609308: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_blas.cc:444] failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
2018-02-04 16:29:35.145249: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_dnn.cc:385] could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2018-02-04 16:29:35.145563: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_dnn.cc:352] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
2018-02-04 16:29:35.149896: F C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\kernels\conv_ops.cc:717] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)
Here is my code:
import tensorflow as tf
import numpy as np
import cv2
import time
from PIL import ImageGrab
from getkeys import key_check
from alexnet import alexnet
import os
from sendKeys import PressKey, ReleaseKey, W,A,S,D,Sp
import random
WIDTH = 80
HEIGHT = 60
LR = 1e-3
EPOCHS = 10
MODEL_NAME = 'DiRT-AI-Driver-{}-{}-{}-epochs.model'.format(LR, 'alexnetv2', EPOCHS)
def straight():
    PressKey(W)
    ReleaseKey(A)
    ReleaseKey(S)
    ReleaseKey(D)
    ReleaseKey(Sp)

def left():
    PressKey(A)
    ReleaseKey(W)
    ReleaseKey(S)
    ReleaseKey(D)
    ReleaseKey(Sp)

def right():
    PressKey(D)
    ReleaseKey(A)
    ReleaseKey(S)
    ReleaseKey(W)
    ReleaseKey(Sp)

def brake():
    PressKey(S)
    ReleaseKey(A)
    ReleaseKey(W)
    ReleaseKey(D)
    ReleaseKey(Sp)

def handbrake():
    PressKey(Sp)
    ReleaseKey(A)
    ReleaseKey(S)
    ReleaseKey(D)
    ReleaseKey(W)

model = alexnet(WIDTH, HEIGHT, LR)
model.load(MODEL_NAME)

def main():
    last_time = time.time()
    # Count down so there is time to switch focus to the game window.
    for i in list(range(4))[::-1]:
        print(i+1)
        time.sleep(1)

    paused = False
    while(True):
        if not paused:
            # Grab the game window, convert to grayscale and shrink to the network input size.
            screen = np.array(ImageGrab.grab(bbox=(0,40,1024,768)))
            screen = cv2.cvtColor(screen,cv2.COLOR_BGR2GRAY)
            screen = cv2.resize(screen,(80,60))
            print('Loop took {} seconds'.format(time.time()-last_time))
            last_time = time.time()
            print('took time')
            # This is the call that crashes.
            prediction = model.predict([screen.reshape(WIDTH,HEIGHT,1)])[0]
            print('predicted')
            moves = list(np.around(prediction))
            print('got moves')
            print(moves,prediction)
            if moves == [1,0,0,0,0]:
                straight()
            elif moves == [0,1,0,0,0]:
                left()
            elif moves == [0,0,1,0,0]:
                brake()
            elif moves == [0,0,0,1,0]:
                right()
            elif moves == [0,0,0,0,1]:
                handbrake()

        # 'T' toggles pause; all keys are released when pausing.
        keys = key_check()
        if 'T' in keys:
            if paused:
                paused = False  # was misspelled 'pased', so unpausing never took effect
                time.sleep(1)
            else:
                paused = True
                ReleaseKey(W)
                ReleaseKey(A)
                ReleaseKey(S)
                ReleaseKey(D)
                ReleaseKey(Sp)
                time.sleep(1)

main()
I've found that the line that crashes Python and triggers the first three errors is this one:
prediction = model.predict([screen.reshape(WIDTH,HEIGHT,1)])[0]
When running the code, CPU usage goes up to a whopping 100%, suggesting that something is seriously off. GPU usage goes to about 40-50%.
I've tried Tensorflow 1.2 and 1.3, as well as CUDA 8, to no avail. When installing CUDA I don't install the drivers included with it, since they are too old for my GPU. I tried different cuDNN versions too, and that didn't help either.
Recommended answer
In my case, the issue happened because another Python console with tensorflow imported was running. Closing it solved the problem.
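If it isn't obvious which process is holding the GPU, something like the following might help track it down. This is only a rough sketch: it assumes `nvidia-smi` is on the PATH and that the installed version accepts the `--query-compute-apps` fields used here. The helper names `parse_compute_apps` and `gpu_processes` are made up for illustration. On some Windows driver modes the per-process memory column may come back as unavailable, so it is kept as plain text.

```python
import csv
import io
import subprocess

# Assumption: nvidia-smi is installed with the driver and reachable on PATH.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader",
]

def parse_compute_apps(csv_text):
    """Turn nvidia-smi CSV rows into (pid, process_name, used_memory) tuples."""
    rows = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != 3:
            continue  # skip blank or unexpected lines
        pid, name, used = (field.strip() for field in row)
        rows.append((int(pid), name, used))
    return rows

def gpu_processes():
    """Run nvidia-smi and return the compute processes it reports."""
    result = subprocess.run(
        QUERY,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode; works on Python 3.6
        check=True,
    )
    return parse_compute_apps(result.stdout)
```

Calling `gpu_processes()` before starting the script should list any leftover `python.exe` that still has the GPU open; closing that process is the fix described above.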
I'm on Windows 10, and the main errors were:
failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED
could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
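As background on why a second console matters: by default, Tensorflow 1.x tries to reserve nearly all GPU memory for each process that initializes the GPU, so a second process can fail to allocate its cuBLAS/cuDNN handles. If closing the other process isn't an option, a possible mitigation is to ask Tensorflow to grow its allocation on demand. The snippet below is a minimal sketch assuming the TF 1.x API (`tf.ConfigProto`, `tf.GPUOptions`); the helper name `make_gpu_config` is hypothetical, and how the config reaches the session depends on the wrapper library that builds the model, which isn't shown in the question.

```python
def make_gpu_config(allow_growth=True, memory_fraction=None):
    """Build a TF 1.x session config that does not reserve all GPU memory up front."""
    # Imported lazily so this helper can be defined without TensorFlow loaded.
    # Assumption: TensorFlow 1.x, where ConfigProto/GPUOptions live at top level.
    import tensorflow as tf

    gpu_options = tf.GPUOptions(allow_growth=allow_growth)
    if memory_fraction is not None:
        # Optionally cap this process at a fraction of total GPU memory (0.0-1.0).
        gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return tf.ConfigProto(gpu_options=gpu_options)

# Example use with a plain TF 1.x session (not run here):
#   sess = tf.Session(config=make_gpu_config(memory_fraction=0.5))
```

This doesn't remove the underlying conflict; two processes can still run out of memory between them. It only makes each one less greedy at startup.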