Dive Into TensorFlow, Part III: GTX 1080+Ubuntu16.04+CUDA8.0+cuDNN5.0+TensorFlow

This is the third article in the series "Dive Into TensorFlow". Here is an index of all the articles in the series published to date:

Part I: Getting Started with TensorFlow
Part II: Basic Concepts
Part III: Ubuntu16.04+GTX 1080+CUDA8.0+cuDNN5.0+TensorFlow (this article)

Deep Learning PC with GTX 1080

I recently got a deep learning PC with a GTX 1080. Here are its details:

CPU Intel Core i7-6800K 15M Broadwell-E 6-Core 3.4 GHz LGA 2011-v3 140W
Motherboard Asus X99-A/USB 3.1 ATX LGA2011-3
Video Card GIGABYTE GeForce GTX 1080 G1 Gaming GV-N1080G1 GAMING-8GD Video Card
Storage SAMSUNG SM951 256GB SSD + WD Blue 4TB Desktop Hard Disk Drive
Memory Kingston HyperX Fury 64GB (4 x 16G) DDR4 2400 RAM HX424C15FBK4/64
……

The operating system is 64-bit Ubuntu 16.04. Once the system is ready, the first step is to install the NVIDIA driver 367.27 for the GTX 1080:

sudo add-apt-repository ppa:graphics-drivers/ppa

This prints the following notice:

Fresh drivers from upstream, currently shipping Nvidia.

## Current Status

We currently recommend: `nvidia-361`, Nvidia’s current long lived branch.
For GeForce 8 and 9 series GPUs use `nvidia-340`
For GeForce 6 and 7 series GPUs use `nvidia-304`

## What we’re working on right now:

– Normal driver updates
– Investigating how to bring this goodness to distro on a cadence.

## WARNINGS:

This PPA is currently in testing, you should be experienced with packaging before you dive in here. Give us a few days to sort out the kinks.

Ignore it and continue installing the GTX 1080 driver:

sudo apt-get update
sudo apt-get install nvidia-367
sudo apt-get install mesa-common-dev
sudo apt-get install freeglut3-dev

Then restart the PC so that the NVIDIA driver is loaded.
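
After the reboot, it can be worth confirming that the 367 driver is the one actually loaded before moving on. The following is a minimal Python sketch of such a check, not part of the original setup. It assumes `nvidia-smi` is on the PATH and relies on its `--query-gpu` option; the helper names `parse_driver_line` and `query_driver` are made up for this illustration.

```python
import subprocess

def parse_driver_line(line):
    """Split one CSV line such as '367.27, GeForce GTX 1080' into (major, name)."""
    version, name = [field.strip() for field in line.split(",", 1)]
    return int(version.split(".")[0]), name

def query_driver():
    """Ask nvidia-smi for the driver version and GPU name; None if unavailable."""
    cmd = ["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv,noheader"]
    try:
        out = subprocess.check_output(cmd).decode()
    except (OSError, subprocess.CalledProcessError):
        return None  # nvidia-smi missing, or the driver is not loaded
    lines = out.splitlines()
    if not lines:
        return None
    return parse_driver_line(lines[0])

if __name__ == "__main__":
    info = query_driver()
    if info is None:
        print("nvidia-smi did not run; the driver may not be loaded")
    else:
        print("driver major version %d on %s" % info)
```

On the machine described here, a major version of 367 would indicate that the PPA driver is in use.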

Download and install CUDA

We choose the new CUDA 8, which supports the NVIDIA GTX 1080:

New in CUDA 8

Pascal Architecture Support
Out of box performance improvements on Tesla P100, supports GeForce GTX 1080
Simplify programming using Unified memory on Pascal including support for large datasets, concurrent data access and atomics*
Optimize Unified Memory performance using new data migration APIs*
Faster Deep Learning using optimized cuBLAS routines for native FP16 computation
Developer Tools
Quickly identify latent system-level bottlenecks using the new critical path analysis feature
Improve productivity with up to 2x faster NVCC compilation speed
Tune OpenACC applications and overall host code using new profiling extensions
Libraries
Accelerate graph analytics algorithms with nvGRAPH
New cuBLAS matrix multiply optimizations for matrices with sizes smaller than 512 and for batched operation

Here is the CUDA 8 download link: https://developer.nvidia.com/cuda-release-candidate-download. You need to register for, or log into, the NVIDIA developer program to download it. We choose the Ubuntu 16.04 runfile install method:


After downloading the 1.4 GB cuda_8.0.27_linux.run file, run:

sudo sh cuda_8.0.27_linux.run --tmpdir=/opt/temp/

The --tmpdir parameter is only needed if you hit the following error:

Not enough space on parition mounted at /.
Need 5091561472 bytes.

Disk space check has failed. Installation cannot continue.
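
If you would rather know in advance whether the check will pass, you can compare the free space against the figure in that message. This is a rough sketch using only the Python standard library; `free_bytes` and `has_room` are hypothetical helper names, and the byte count is simply copied from the error text above.

```python
import os

REQUIRED_BYTES = 5091561472  # figure taken from the installer's error message

def free_bytes(path):
    """Bytes available to unprivileged users on the filesystem holding path."""
    stats = os.statvfs(path)
    return stats.f_bavail * stats.f_frsize

def has_room(path, required=REQUIRED_BYTES):
    """True if the filesystem holding path has at least `required` bytes free."""
    return free_bytes(path) >= required

if __name__ == "__main__":
    # Add your own candidate for --tmpdir to this list, e.g. the /opt/temp used above.
    for candidate in ["/", "/tmp"]:
        print(candidate, "ok" if has_room(candidate) else "too small")
```

The installer runs under sudo, so the number it sees may be slightly larger than this unprivileged estimate.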

While cuda_8.0.27_linux.run is running, it asks a series of yes/no questions. It is very important to answer "n" to this one:

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 361.62?

If you answer yes, the 367 driver installed for the GTX 1080 will be overwritten by the older 361.62 driver. The full session looks like this:

Logging to /opt/temp//cuda_install_6583.log
Using more to view the EULA.
End User License Agreement
————————–

Preface
——-

The following contains specific license terms and conditions
for four separate NVIDIA products. By accepting this
agreement, you agree to comply with all the terms and
conditions applicable to the specific product(s) included
herein.

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 361.62?
(y)es/(n)o/(q)uit: n

Install the CUDA 8.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
[ default is /usr/local/cuda-8.0 ]:

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 8.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
[ default is /home/textminer ]:

Installing the CUDA Toolkit in /usr/local/cuda-8.0 …
Installing the CUDA Samples in /home/textminer …
Copying samples to /home/textminer/NVIDIA_CUDA-8.0_Samples now…
Finished copying samples.

===========
= Summary =
===========

Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-8.0
Samples: Installed in /home/textminer

Please make sure that
– PATH includes /usr/local/cuda-8.0/bin
– LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing with the name of this run file:
sudo .run -silent -driver

Logfile is /opt/temp//cuda_install_6583.log

Then set up the development environment by adding the CUDA directories to the PATH and LD_LIBRARY_PATH variables. Append the same two lines to the end of your .bashrc file so that they persist across sessions:

export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
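
The `${PATH:+:${PATH}}` part of those lines may look cryptic. It expands to a colon followed by the old value only when the variable is already set and non-empty, so an unset LD_LIBRARY_PATH does not end up with a trailing colon (an empty entry is typically treated as the current directory). Here is a small Python sketch of the same rule; `prepend_path` is a made-up name used only for illustration.

```python
def prepend_path(new_dir, current):
    """Mimic NEW${VAR:+:${VAR}}: append ':' plus the old value only if it is non-empty."""
    return new_dir + (":" + current if current else "")

# Variable previously unset or empty: no trailing colon is produced.
print(prepend_path("/usr/local/cuda-8.0/lib64", ""))
# Variable already populated: the CUDA directory goes in front.
print(prepend_path("/usr/local/cuda-8.0/bin", "/usr/bin:/bin"))
```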

Finally, we can test CUDA 8 on our Ubuntu 16.04 system with the GeForce GTX 1080:

nvidia-smi

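Then build the deviceQuery sample from inside the CUDA samples directory (/home/textminer/NVIDIA_CUDA-8.0_Samples in the installation above):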

cd 1_Utilities/deviceQuery
make

"/usr/local/cuda-8.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o deviceQuery.o -c deviceQuery.cpp
"/usr/local/cuda-8.0"/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o deviceQuery deviceQuery.o
mkdir -p ../../bin/x86_64/linux/release
cp deviceQuery ../../bin/x86_64/linux/release

Run ./deviceQuery:

./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: “GeForce GTX 1080”
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 8112 MBytes (8506179584 bytes)
(20) Multiprocessors, (128) CUDA Cores/MP: 2560 CUDA Cores
GPU Max Clock rate: 1835 MHz (1.84 GHz)
Memory Clock rate: 5005 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1080
Result = PASS
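
A few of the numbers in this report can be cross-checked with simple arithmetic, which is a handy sanity test when comparing against what other tools print. The sketch below only redoes the unit conversions; the constants are copied from the deviceQuery output above, and the function names are made up for illustration.

```python
TOTAL_BYTES = 8506179584   # "Total amount of global memory" in bytes
MULTIPROCESSORS = 20       # "(20) Multiprocessors"
CORES_PER_MP = 128         # "(128) CUDA Cores/MP"

def to_mib(n_bytes):
    """Convert bytes to units of 2**20 bytes (what deviceQuery labels MBytes)."""
    return n_bytes / float(2 ** 20)

def to_gib(n_bytes):
    """Convert bytes to units of 2**30 bytes (GiB)."""
    return n_bytes / float(2 ** 30)

print(int(to_mib(TOTAL_BYTES)))        # 8112, matching the MBytes figure
print(round(to_gib(TOTAL_BYTES), 2))   # 7.92
print(MULTIPROCESSORS * CORES_PER_MP)  # 2560 CUDA cores
```

The 7.92 figure is the same "Total memory: 7.92GiB" that TensorFlow reports for this card later in the article.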

Test the nbody sample:

cd ../../5_Simulations/nbody/
make

Run:

./nbody -benchmark -numbodies=256000 -device=0

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
gpuDeviceInit() CUDA Device [0]: “GeForce GTX 1080
> Compute 6.1 CUDA device: [GeForce GTX 1080]
number of bodies = 256000
256000 bodies, total time for 10 iterations: 2291.469 ms
= 286.000 billion interactions per second
= 5719.998 single-precision GFLOP/s at 20 flops per interaction
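
The throughput figures follow directly from the body count and the elapsed time. As a worked check, and assuming the benchmark counts one interaction for every ordered pair of bodies in each iteration (N x N per step), the numbers can be reproduced like this; `nbody_rates` is a hypothetical helper written for this article.

```python
def nbody_rates(num_bodies, iterations, total_ms, flops_per_interaction=20):
    """Return (billion interactions per second, GFLOP/s) for an all-pairs run."""
    interactions = float(num_bodies) ** 2 * iterations
    per_second = interactions / (total_ms / 1000.0)
    return per_second / 1e9, per_second * flops_per_interaction / 1e9

ginter, gflops = nbody_rates(256000, 10, 2291.469)
print("%.1f billion interactions/s, %.0f GFLOP/s" % (ginter, gflops))
# prints: 286.0 billion interactions/s, 5720 GFLOP/s
```

The last digits differ slightly from the sample's own output because the elapsed time shown there is already rounded to the millisecond.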

Download and install cuDNN

About cuDNN:

The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN is part of the NVIDIA Deep Learning SDK.

Deep learning researchers and framework developers worldwide rely on cuDNN for high-performance GPU acceleration. It allows them to focus on training neural networks and developing software applications rather than spending time on low-level GPU performance tuning. cuDNN accelerates widely used deep learning frameworks, including Caffe, TensorFlow, Theano, Torch, and CNTK. See supported frameworks for more details.

We choose cuDNN v5, which supports CUDA 8.0. The official download link is: https://developer.nvidia.com/rdp/cudnn-download


Installing cuDNN is simple: extract the archive, then copy the header and libraries into the CUDA directory:

tar -zxvf cudnn-8.0-linux-x64-v5.0-ga.tgz

cuda/include/cudnn.h
cuda/lib64/libcudnn.so
cuda/lib64/libcudnn.so.5
cuda/lib64/libcudnn.so.5.0.5
cuda/lib64/libcudnn_static.a


sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
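
To confirm which cuDNN version ended up in the CUDA directory, you can read the version macros from the copied header. This is a sketch under the assumption that cudnn.h defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` as plain integer macros, which I believe is the case for the v5 header used here; `cudnn_version` is a made-up helper name.

```python
import re

def cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None if absent."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+%s\s+(\d+)" % macro, header_text, re.M)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)

if __name__ == "__main__":
    path = "/usr/local/cuda/include/cudnn.h"  # destination used by the cp command above
    try:
        with open(path) as handle:
            print(cudnn_version(handle.read()))
    except IOError:
        print("could not read " + path)
```

For the archive extracted above, which contains libcudnn.so.5.0.5, the expected result would be (5, 0, 5).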

Install TensorFlow Python Dependency

The Python version is Python2.7:

sudo apt-get install python-pip
sudo apt-get install python-numpy swig python-dev python-wheel

Install Google Build Tool Bazel

Bazel is a build tool that builds code quickly and reliably. It is used to build the majority of Google’s software, and thus it has been designed to handle build problems present in Google’s development environment, including:

A massive, shared code repository, in which all software is built from source. Bazel has been built for speed, using both caching and parallelism to achieve this. Bazel is critical to Google’s ability to continue to scale its software development practices as the company grows.

An emphasis on automated testing and releases. Bazel has been built for correctness and reproducibility, meaning that a build performed on a continuous build machine or in a release pipeline will generate bitwise-identical outputs to those generated on a developer’s machine.

Language and platform diversity. Bazel’s architecture is general enough to support many different programming languages within Google, and can be used to build both client and server software targeting multiple architectures from the same underlying codebase.

Get the Bazel 0.3.0 install script from the Bazel GitHub releases page:

wget https://github.com/bazelbuild/bazel/releases/download/0.3.0/bazel-0.3.0-installer-linux-x86_64.sh

Then run:

chmod +x bazel-0.3.0-installer-linux-x86_64.sh
./bazel-0.3.0-installer-linux-x86_64.sh --user

If you hit this error:

Java not found, please install the corresponding package
See http://bazel.io/docs/install.html for more information on

Install the Java JDK in Ubuntu 16.04 by apt-get:

sudo apt-get update
sudo apt-get install default-jre
sudo apt-get install default-jdk

Then run the following command again:

./bazel-0.3.0-installer-linux-x86_64.sh --user

Bazel installer
—————

# Release 0.3.0 (2016-06-10)

Baseline: a9301fa

Cherry picks:
+ ff30a73: Turn --legacy_external_runfiles back on by default
+ aeee3b8: Fix delete[] warning on fsevents.cc

Incompatible changes:

- The --cwarn command line option is not supported anymore. Use
  --copt instead.

New features:

- On OSX, --watchfs now uses FsEvents to be notified of changes
  from the filesystem (previously, this flag had no effect on OS X).
- add support for the '-=', '*=', '/=', and'%=' operators to
  skylark. Notably, we do not support '|=' because the semantics
  of skylark sets are sufficiently different from python sets.

Important changes:

- Use singular form when appropriate in blaze's test result summary
  message.
- Added supported for Android NDK revision 11
- --objc_generate_debug_symbols is now deprecated.
- swift_library now generates an Objective-C header for its @objc
  interfaces.
- new_objc_provider can now set the USES_SWIFT flag.
- objc_framework now supports dynamic frameworks.
- Symlinks in zip files are now unzipped correctly by http_archive,
  download_and_extract, etc.
- swift_library is now able to import framework rules such as
  objc_framework.
- Adds "jre_deps" attribute to j2objc_library.
- Release apple_binary rule, for creating multi-architecture
  ("fat") objc/cc binaries and libraries, targeting ios platforms.
- Aspects documentation added.
- The --ues_isystem_for_includes command line option is not
  supported anymore.
- global function 'provider' is removed from .bzl files. Providers
  can only be accessed through fields in a 'target' object.

## Build informations
– [Build log](http://ci.bazel.io/job/Bazel/JAVA_VERSION=1.8,PLATFORM_NAME=linux-x86_64/595/)
– [Commit](https://github.com/bazelbuild/bazel/commit/e671d29)
Uncompressing……Extracting Bazel installation…
.

Bazel is now installed!

Make sure you have “/home/textminer/bin” in your path. You can also activate bash
completion by adding the following line to your ~/.bashrc:
source /home/textminer/.bazel/bin/bazel-complete.bash

See http://bazel.io/docs/getting-started.html to start a new project!

Then add the following lines to ~/.bashrc:

source /home/textminer/.bazel/bin/bazel-complete.bash
export PATH=$PATH:/home/textminer/.bazel/bin

and execute:

source ~/.bashrc

Bazel is now installed.
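
If you want a scripted check that the new PATH entry works, one option is to run `bazel version` and read its "Build label" line. The sketch below is an assumption-laden illustration rather than an official procedure: it expects `bazel` to be reachable from the current shell, and the helper names are made up.

```python
import subprocess

def parse_build_label(version_output):
    """Return the text after 'Build label:' in `bazel version` output, or None."""
    for line in version_output.splitlines():
        if line.startswith("Build label:"):
            return line.split(":", 1)[1].strip()
    return None

def installed_bazel():
    """Run `bazel version` and return its build label; None if bazel is unavailable."""
    try:
        out = subprocess.check_output(["bazel", "version"]).decode()
    except (OSError, subprocess.CalledProcessError):
        return None  # bazel is not on the PATH yet
    return parse_build_label(out)

if __name__ == "__main__":
    print(installed_bazel())
```

With the installer used above, the label should read 0.3.0.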

Build and install the TensorFlow GPU version from source code

Get the latest TensorFlow code from the TensorFlow GitHub repository:

git clone https://github.com/tensorflow/tensorflow

then:

cd tensorflow
./configure

If you hit this error:

ERROR: It appears that the development version of libcurl is not available. Please install the libcurl3-dev package.

Install libcurl3-dev with apt-get:

sudo apt-get install libcurl3 libcurl3-dev

Then run configure again:

./configure

Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] y
Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc nvcc should use as the host compiler. [Default is /usr/bin/gcc]:
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]:
Please specify the location where CUDA toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Leave empty to use system default]:
Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: “3.5,5.2”]:
Setting up Cuda include
Setting up Cuda lib64
Setting up Cuda bin
Setting up Cuda nvvm
Setting up CUPTI include
Setting up CUPTI lib64
Configuration finished

TensorFlow can now be built with Bazel:

bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

If you hit this error:

configure: error: zlib not installed
Target //tensorflow/cc:tutorials_example_trainer failed to build

Install zlib1g-dev:

sudo apt-get install zlib1g-dev

Then run the Bazel build again and take a coffee break while it finishes (about 15 minutes on this machine):

……
Target //tensorflow/cc:tutorials_example_trainer up-to-date:
bazel-bin/tensorflow/cc/tutorials_example_trainer
INFO: Elapsed time: 897.845s, Critical Path: 533.72s

Execute the TensorFlow tutorial example trainer to exercise the GTX 1080:

bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.835
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.65GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
000003/000006 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
000006/000007 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
000009/000006 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
000009/000004 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
000000/000005 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
000000/000004 lambda = 1.841570 x = [0.669396 0.742906] y = [3.493999 -0.669396]
……

Everything is OK. Now test TensorFlow in IPython:

import tensorflow as tf

ImportError: cannot import name pywrap_tensorflow

The import fails because the Python package has not been built and installed yet. Build the TensorFlow pip package with Bazel and install it:


bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip install /tmp/tensorflow_pkg/tensorflow-0.9.0-py2-none-any.whl

Requirement already satisfied (use --upgrade to upgrade): setuptools in /usr/lib/python2.7/dist-packages (from protobuf==3.0.0b2->tensorflow==0.9.0)
Installing collected packages: six, funcsigs, pbr, mock, protobuf, tensorflow
Successfully installed funcsigs-1.0.2 mock-2.0.0 pbr-1.10.0 protobuf-3.0.0b2 six-1.10.0 tensorflow-0.9.0

Now test TensorFlow in IPython again:

Python 2.7.12 (default, Jul  1 2016, 15:12:24) 
Type "copyright", "credits" or "license" for more information.
 
IPython 2.4.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
 
In [1]: import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
 
In [2]: import numpy as np
 
In [3]: x_data = np.random.rand(100).astype(np.float32)
 
In [4]: y_data = x_data * 0.1 + 0.3
 
In [5]: W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
 
In [6]: b = tf.Variable(tf.zeros([1]))
 
In [7]: y = W * x_data + b
 
In [8]: loss = tf.reduce_mean(tf.square(y - y_data))
 
In [9]: optimizer = tf.train.GradientDescentOptimizer(0.5)
 
In [10]: train = optimizer.minimize(loss)
 
In [11]: init = tf.initialize_all_variables()
 
In [12]: sess = tf.Session()
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.835
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.65GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
 
In [13]: sess.run(init)
 
In [14]: for step in range(201):
   ....:     sess.run(train)
   ....:     if step % 20 == 0:
   ....:         print(step, sess.run(W), sess.run(b))
   ....:         
(0, array([-0.10331395], dtype=float32), array([ 0.62236434], dtype=float32))
(20, array([ 0.03067014], dtype=float32), array([ 0.3403711], dtype=float32))
(40, array([ 0.08353967], dtype=float32), array([ 0.30958495], dtype=float32))
(60, array([ 0.09609199], dtype=float32), array([ 0.30227566], dtype=float32))
(80, array([ 0.09907217], dtype=float32), array([ 0.3005403], dtype=float32))
(100, array([ 0.09977971], dtype=float32), array([ 0.30012828], dtype=float32))
(120, array([ 0.0999477], dtype=float32), array([ 0.30003047], dtype=float32))
(140, array([ 0.0999876], dtype=float32), array([ 0.30000722], dtype=float32))
(160, array([ 0.09999706], dtype=float32), array([ 0.30000171], dtype=float32))
(180, array([ 0.09999929], dtype=float32), array([ 0.30000043], dtype=float32))
(200, array([ 0.09999985], dtype=float32), array([ 0.3000001], dtype=float32))
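
To see what the session above is doing numerically, here is the same fit written out by hand in plain Python, with no TensorFlow or GPU involved. It is only a sketch of full-batch gradient descent on the mean squared error using the same learning rate (0.5) and step count (201); it is not how TensorFlow implements its optimizer internally, and the random data will differ from the run shown above.

```python
import random

def fit_line(xs, ys, learning_rate=0.5, steps=201, seed=0):
    """Fit y = w * x + b by full-batch gradient descent on the mean squared error."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1.0, 1.0), 0.0   # same initial ranges as W and b above
    n = float(len(xs))
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n  # d(loss)/dw
        grad_b = 2.0 * sum(errors) / n                             # d(loss)/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

rng = random.Random(42)
xs = [rng.random() for _ in range(100)]   # 100 inputs in [0, 1), like x_data
ys = [0.1 * x + 0.3 for x in xs]          # targets on the line y = 0.1x + 0.3
w, b = fit_line(xs, ys)
print(w, b)
```

The printed values should land very close to 0.1 and 0.3, mirroring the convergence seen in the TensorFlow output.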

Now enjoy TensorFlow on your Ubuntu 16.04 system with the GeForce GTX 1080.


Posted by TextMiner


Comments

Dive Into TensorFlow, Part III: GTX 1080+Ubuntu16.04+CUDA8.0+cuDNN5.0+TensorFlow — 25 Comments

    • If you run into compiler problems while installing cuda make sure to have gcc installed:

      sudo apt-get install gcc-4.8

  1. bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
    ——————-^

    There is an error in your example.

    • Sorry, please delete my spammy comments, but somehow the commands are not displayed correctly on your website. It should be "--config=cuda" with two plain hyphens, not "–config=cuda" with a single dash.

  2. Also if you get an error message about the gcc version, you can force it by doing:

    sudo emacs -nw /usr/local/cuda/include/host_config.h

    And then comment out line 115:

    //#error -- unsupported GNU version!

  3. when I run ‘./configure’, terminal shows that:

    INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
    .
    ERROR: /home/ei/github/test/tensorflow/tensorflow/contrib/session_bundle/BUILD:237:1: no such target ‘//tensorflow/core:meta_graph_portable_proto’: target ‘meta_graph_portable_proto’ not declared in package ‘tensorflow/core’ defined by /home/ei/github/test/tensorflow/tensorflow/core/BUILD and referenced by ‘//tensorflow/contrib/session_bundle:signature_lite’.
    ERROR: /home/ei/github/test/tensorflow/tensorflow/core/BUILD:689:1: no such package ‘base’: BUILD file not found on package path and referenced by ‘//tensorflow/core:ios_tensorflow_test_lib’.
    ERROR: Evaluation of query “deps((//… union @bazel_tools//tools/jdk:toolchain))” failed: errors were encountered while computing transitive closure.
    Configuration finished

    Can you help me with this?

  4. Hello,

    My machine is similar as yours :

    GTX 1080, i7 6800k, 32 GB RAM, Kingoston SSD etc.

    Thanks you very much for tutorial. I successfully installed latest Linux drivers NVIDIA-Linux-x86_64-367.44.run, latest cuDNN cudnn-8.0-linux-x64-v5.1.tgz (5.1.5), and CUDA 8.0rc. Tensorflow seems to be using those libraries just right :

    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so.8.0 locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so.5.1.5 locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so.8.0 locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
    I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so.8.0 local

    And then selecting GPU :

    Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:05:00.0)

    Tensorflow version is 0.9.0

    When I run any training code (CNN, LSTM, GRU etc.) I can’t get higher GPU Utilizations than 50%, mostly it just stays at ~32%. Do you face similar issues ? Is it really an issue ? Could you please share your results and/or give any hints ?

    • Here’s simple code to test it as I did :

      python -m tensorflow.models.image.mnist.convolutiona

      Just wanted to repeat performance test again of GTX 1080 and got “Ran out of memory” error. Something is clearly not right with either Tensorflow, Cuda or GTX itself.

      What do you think ?

      • Got it running after killing other Tensorflow sessions, shouldn’t new process of overtake GPU memory when others are not running ? Ok, maybe it’s jupyters fault.

        Anyway with code above I finally got results, utilization is around 60%. It’s not that bad but still not 100% utilization which I’m expecting.

        When training LSTM GPU utilization is half of that.

        P.S. In both cases Used Dedicated Memory is 98% if that changes anything.

        Maybe I need to modify my Tesnorflow code ? I already tried

        config = tf.ConfigProto()
        config.gpu_options.allow_growth=True

        with tf.Session(config=config) as sess:
        ….

        For memory growth.

  5. If you see the message "error while loading shared libraries: libcudart.so.8.0" when executing the TensorFlow tutorial sample,
    add "--genrule_strategy=standalone" to the build command to solve the problem, like this:
    bazel build -c opt --config=cuda --genrule_strategy=standalone //tensorflow/cc:tutorials_example_trainer

  6. Hi, I could not get the output for showing running gtx1080, instead;

    farahana@nufa:~/tensorflow$ bazel-bin/tensorflow/cc/tutorials_example_trainer –use_gpu
    I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
    I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
    I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
    I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
    I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
    Unknown flag: –use_gpu

  7. One of the best tutorials for CUDA+cuDNN+TensorFlow.

    Only one update I would love to share:

    if anyone encounters

    “ERROR: com.google.devtools.build.lib.packages.BuildFileContainsErrorsException: error loading package ”: Encountered error while reading extension file ‘cuda/build_defs.bzl’: no such package ‘@local_config_cuda//cuda’: Traceback (most recent call last):”

    (for more detail, please refer to https://github.com/tensorflow/tensorflow/issues/5319)

    Using Bazel 0.3.2 resolves this issue on my side.

  8. What you tell me about of the warning at the end of cuda installation:

    “***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
    To install the driver using this installer, run the following command, replacing with the name of this run file:
    sudo .run -silent -driver”

    Can you just ignore it? It seems an important warning, or not?

  9. try testing Cuda

    ~/NVIDIA_CUDA-8.0_Samples$ nVidia-smi
    /NVIDIA_CUDA-8.0_Samples$: No such file or directory

    i dont know why

  10. Hi I am using a customized tensor flow, I am selecting only CPU and using bazel 3.0.2, I tried with bazel 0.4.5 but it gave error: tensorflow.bzl:528:19: name ‘DATA_CFG’ is not defined, so i shifted to 0.3.2 and this error was gone, however i am getting following error when i execute
    1. ./configure
    2. bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

    ERROR: /home/madhu/INTF/intelTensorFlow/tensorflow/core/BUILD:853:1: undeclared inclusion(s) in rule ‘//tensorflow/core:lib_internal’:
    this rule is missing dependency declarations for the following files included by ‘tensorflow/core/lib/png/png_io.cc’:
    ‘/home/madhu/.cache/bazel/_bazel_madhu/b74641c52b46ee79e2d01b22cdf8e7fa/external/zlib_archive/zlib.h’
    ‘/home/madhu/.cache/bazel/_bazel_madhu/b74641c52b46ee79e2d01b22cdf8e7fa/external/zlib_archive/zconf.h’.
    Target //tensorflow/tools/pip_package:build_pip_package failed to build

    CAN YOU PLEASE HELP ME OUT??

    Installed latest zlib1g-dev and my PATH variable is correctly pointing to gcc includes.

  11. This is some hard stuff but this tutorial makes it so easy. This is easily one of the best, most thorough, tutorials I’ve ever seen.
