Tensorflow Basic Example Error: CUBLAS_STATUS_NOT_INITIALIZED

Problem description

Hello, I am trying to install and run TensorFlow 1.0.

I am following this guide: https://www.tensorflow.org/get_started/mnist/beginners

However, when I run the file mnist_softmax.py I get the following errors.

python3 mnist_softmax.py
Extracting /tmp/tensorflow/mnist/input_data/train-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/train-labels-idx1-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-images-idx3-ubyte.gz
Extracting /tmp/tensorflow/mnist/input_data/t10k-labels-idx1-ubyte.gz
2017-05-03 14:25:28.243213: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-03 14:25:28.243234: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-03 14:25:28.243238: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-05-03 14:25:28.243241: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-03 14:25:28.243244: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-05-03 14:25:28.436478: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.582
pciBusID 0000:02:00.0
Total memory: 10.91GiB
Free memory: 349.06MiB
2017-05-03 14:25:28.436501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0 
2017-05-03 14:25:28.436505: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y 
2017-05-03 14:25:28.436510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0)
2017-05-03 14:25:30.507057: E tensorflow/stream_executor/cuda/cuda_blas.cc:365] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2017-05-03 14:25:30.507091: W tensorflow/stream_executor/stream.cc:1550] attempting to perform BLAS operation using StreamExecutor without BLAS support
Traceback (most recent call last):
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1039, in _do_call
    return fn(*args)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1021, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(100, 784), b.shape=(784, 10), m=100, n=10, k=784
     [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_Placeholder_0/_9, Variable/read)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mnist_softmax.py", line 79, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "mnist_softmax.py", line 66, in main
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(100, 784), b.shape=(784, 10), m=100, n=10, k=784
     [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_Placeholder_0/_9, Variable/read)]]

Caused by op 'MatMul', defined at:
  File "mnist_softmax.py", line 79, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "mnist_softmax.py", line 43, in main
    y = tf.matmul(x, W) + b
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 1801, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1263, in _mat_mul
    transpose_b=transpose_b, name=name)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(100, 784), b.shape=(784, 10), m=100, n=10, k=784
     [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_Placeholder_0/_9, Variable/read)]]

I am not sure why I am getting this error. I also cannot run the matrixMulCUBLAS CUDA sample; it fails with the following error.

./matrixMulCUBLAS
[Matrix Multiply CUBLAS] - Starting...
GPU Device 0: "GeForce GTX 1080 Ti" with compute capability 6.1

MatrixA(640,480), MatrixB(480,320), MatrixC(640,320)
CUDA error at matrixMulCUBLAS.cpp:277 code=1(CUBLAS_STATUS_NOT_INITIALIZED) "cublasCreate(&handle)" 

All of the CUDA samples work unless they use cuBLAS; I am not sure whether this is related to my TensorFlow error.

Recommended answer

@FernandoMM I got my script to run where I was getting the same error. In my case, external displays connected to the GPU were eating up all of the GPU RAM. I disconnected all displays and restarted Python (in my case I was using a Jupyter server) and it worked. It looks like you only have 'Free memory: 349.06MiB', so maybe freeing up some memory will work for you as well. I am still not sure why this worked for me or how it relates to the error received, so maybe someone else can enlighten us :).
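A related option, separate from the answer above, is a workaround commonly suggested for this error in TensorFlow 1.x: stop TensorFlow from grabbing (nearly) all GPU memory up front so that cuBLAS can still initialize when little memory is free. The sketch below assumes you change the session creation in mnist_softmax.py accordingly; it is an illustration, not the answerer's method.

import tensorflow as tf

# Sketch for TF 1.x: create a session that does not pre-allocate the whole GPU.
config = tf.ConfigProto()
# Allocate GPU memory on demand instead of reserving almost the entire card.
config.gpu_options.allow_growth = True
# Alternatively, cap the fraction of GPU memory TensorFlow may use:
# config.gpu_options.per_process_gpu_memory_fraction = 0.3

# mnist_softmax.py uses an InteractiveSession; pass the config to it.
sess = tf.InteractiveSession(config=config)

Note that this only helps when TensorFlow itself is holding the memory; memory consumed by other processes (such as a display server driving external monitors) still has to be freed outside of TensorFlow, as the answer describes.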
