Skip to content

How to built tensorflow from source, on Xeon processor, ubuntu 19.x, tf 2.x with GPU support.

Notifications You must be signed in to change notification settings

kazzastic/Tensorflow-BuildFromSource

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 

Repository files navigation

Tensorflow-BuildFromSource

How to built tensorflow from source, on Xeon processor, ubuntu 19.x, tf 2.x with GPU support.

Even though all the required information that one might need when building tensorflow from source can be obtained from https://www.tensorflow.org/install/source but I just want to highlight the current situation of tensorflow on ubuntu and xeon processor.

Configuration

These are the configuration which will be able to re produce results most accurately

  • Ubuntu 19.10
  • CPU: Intel Xeon E5620 (8) @ 2.395GHz
  • GPU: NVIDIA GeForce GTX 1050 Ti
  • NVIDIA-SMI 440.64.00
  • CUDA 10.1
  • nvcc cuDNN 7.6

GPU softwares

CUDA Toolkit 10.1 Download

Here you can see how to install the CUDA 10.1, even though this is outdated but you would want to install this once since TF 2.X supports this version.

$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin

$ sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600

$ wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb

$ sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb

$ sudo apt-key add /var/cuda-repo-10-1-local-10.1.243-418.87.00/7fa2af80.pub

$ sudo apt-get update

$ sudo apt-get -y install cuda

now, if you had already tried installing an nvidia toolkit before then you might run into this minor problem which might give you following error.

$ sudo apt-get install cuda
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies.
 cuda : Depends: cuda-10-0 (>= 10.0.130) but it is not going to be installed
E: Unable to correct problems, you have held broken packages

this is normal and can be solved by the following commands,

$ sudo add-apt-repository -r ppa:graphics-drivers/ppa

$ sudo apt remove nvidia-*

$ sudo apt autoremove

this should solved the problem by removing previouly installed toolkits of any version, and now proceed where you left from meaning,

$ sudo apt install cuda

once done installing, modify the ~/.profile file and add the following lines, first gedit, nano or vim open the file using following command,

$ nano ~/.profile

then add these lines,

# set PATH for cuda 10.1 installation
if [ -d "/usr/local/cuda-10.1/bin/" ]; then
    export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
    export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
fi

then compile this once, just to be sure!

$ source ~/.profile

now restart the OS once or use this,

$ sudo reboot now

Now check the installation using these commmands,

$ nvidia-smi
Sun Apr  5 18:34:20 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: 440.64.00    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  On   | 00000000:0F:00.0  On |                  N/A |
|  0%   46C    P0    N/A /  72W |    341MiB /  4038MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1113      G   /usr/lib/xorg/Xorg                            14MiB |
|    0      1705      G   /usr/lib/xorg/Xorg                            85MiB |
|    0      1912      G   /usr/bin/gnome-shell                         101MiB |
|    0      7262      G   ...AAAAAAAAAAAAAAgAAAAAAAAA --shared-files    88MiB |
+-----------------------------------------------------------------------------+

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

if you get any error here reach out to github/stackoverflow community.

If you run into any gcc or gnu version error till now, or in any further installation then install these versions of gcc and gnu,

$ sudo apt-get install -y software-properties-common

$ sudo add-apt-repository ppa:ubuntu-toolchain-r/test

$ sudo apt update

$ sudo apt install g++-7 -y

Set it up so the symbolic links gcc, g++ point to the newer version:

$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 60 \
                         --slave /usr/bin/g++ g++ /usr/bin/g++-7 

$ sudo update-alternatives --config gcc

$ gcc --version

$ g++ --version

cuDNN v7.6.5 for CUDA 10.1

You can find the file here, you might have to sign up to download it. Download the given packages

  • cuDNN Runtime Library for Ubuntu18.04 (Deb)
  • cuDNN Developer Library for Ubuntu18.04 (Deb)
  • cuDNN Code Samples and User Guide for Ubuntu18.04 (Deb) These are for Ubuntu 18.04 but they work like a charm even on Ubuntu 19.10
# see files we downloaded
$ ll | grep cudnn

$ sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.1_amd64.deb
 
$ sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
 
$ sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb

$ apt search libcudnn7

navigate the packages and install each as above. We then test the CUDNN installation with:

$ cp -r /usr/src/cudnn_samples_v7/ ~/test/
$ cd ~/test/mnistCUDNN/
$ make clean && make
$ $ ./mnistCUDNN
# ...
# Test passed!

We need to know where CUDDN7 is installed for tensorflow build configuration

$ whereis cudnn

cudnn: /usr/include/cudnn.h

$ ll /usr/include/ | grep cudnn

lrwxrwxrwx  1 root root     26 Sep 29 22:06 cudnn.h -> /etc/alternatives/libcudnn

$ ll /etc/alternatives/libcudnn

libcudnn        libcudnn_so     libcudnn_stlib
  
$ ll /etc/alternatives/ | grep cudnn
lrwxrwxrwx   1 root root    40 Sep 29 22:06 libcudnn -> /usr/include/x86_64-linux-gnu/cudnn_v7.h
lrwxrwxrwx   1 root root    39 Sep 29 22:06 libcudnn_so -> /usr/lib/x86_64-linux-gnu/libcudnn.so.7
lrwxrwxrwx   1 root root    46 Sep 29 22:06 libcudnn_stlib -> /usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a

$ dpkg-query -L libcudnn7
/.
/usr
/usr/lib
/usr/lib/x86_64-linux-gnu
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.3.1
/usr/share
/usr/share/doc
/usr/share/doc/libcudnn7
/usr/share/doc/libcudnn7/changelog.Debian.gz
/usr/share/doc/libcudnn7/copyright
/usr/share/lintian
/usr/share/lintian/overrides
/usr/share/lintian/overrides/libcudnn7
/usr/lib/x86_64-linux-gnu/libcudnn.so.7

$ dpkg-query -L libcudnn7-dev 
/.
/usr
/usr/include
/usr/include/x86_64-linux-gnu
/usr/include/x86_64-linux-gnu/cudnn_v7.h
/usr/lib
/usr/lib/x86_64-linux-gnu
/usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a
/usr/share
/usr/share/doc
/usr/share/doc/libcudnn7-dev
/usr/share/doc/libcudnn7-dev/changelog.Debian.gz
/usr/share/doc/libcudnn7-dev/copyright
/usr/share/lintian
/usr/share/lintian/overrides
/usr/share/lintian/overrides/libcudnn7-dev

We decide to use /etc/alternatives/ for the tensorflow configuration.

Install Bazel

Release Version of Bazel

The most suitable for our build can be checked here further more navigate to correct release version of bazel from here I used bazel version 2.0.0 download the bazel-2.0.0-installer-linux-x86_64.sh

$ cd where-bazel-is-downloaded

$ chmod +x bazel-2.0.0-installer-linux-x86_64.sh

$ ./bazel-2.0.0-installer-linux-x86_64.sh --user

this would install the bazel. Now setup the enviroment,

export PATH="$PATH:$HOME/bin"

add these lines to your ./bashrc file, you can open this file with these commands,

$ nano ~/.bashrc

NVIDIA Collective Communications Library (NCCL)

From here download Download NCCL v2.6.4, for CUDA 10.1, March 26,2020. When enabling CUDA tensorflow asks us to specify the nccl library location. We go at this nvidia webpage to install it. We download the network installer, so that it is more straightforward to get updates.

$ sudo dpkg-i nccl-repo-ubuntu1804-2.6.4-ga-cuda10.2_1-1_amd64.deb 

$ sudo apt update

$ apt search libnccl
Full Text Search... Done
libnccl-dev/unknown 2.3.5-2+cuda10.0 amd64
  NVIDIA Collectives Communication Library (NCCL) Development Files

libnccl2/unknown 2.3.5-2+cuda10.0 amd64
  NVIDIA Collectives Communication Library (NCCL) Runtime

$ sudo apt install libnccl2 libnccl-dev

# See where the libraries got installed
$ dpkg-query -L libnccl-dev libnccl2
/.
/usr
/usr/include
/usr/include/nccl.h
/usr/lib
/usr/lib/x86_64-linux-gnu
/usr/lib/x86_64-linux-gnu/libnccl_static.a
/usr/share
/usr/share/doc
/usr/share/doc/libnccl-dev
/usr/share/doc/libnccl-dev/changelog.Debian.gz
/usr/share/doc/libnccl-dev/copyright
/usr/lib/x86_64-linux-gnu/libnccl.so

/.
/usr
/usr/lib
/usr/lib/x86_64-linux-gnu
/usr/lib/x86_64-linux-gnu/libnccl.so.2.3.5
/usr/share
/usr/share/doc
/usr/share/doc/libnccl2
/usr/share/doc/libnccl2/changelog.Debian.gz
/usr/share/doc/libnccl2/copyright
/usr/lib/x86_64-linux-gnu/libnccl.so.2

We decide to create symbolic links from within the cuda installation to where they are:

$ cd /usr/local/cuda/

$ sudo mkdir lib

$ cd lib

$ sudo ln -s /usr/lib/x86_64-linux-gnu/libnccl.so.2 libnccl.so.2

$ cd ../include

$ sudo ln -s /usr/include/nccl.h nccl.h

Building Tensorflow

Clone Tensorflow

$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow
$ git checkout v2.2.0-rc2

Configuration

$ pip install -U pip six numpy wheel setuptools mock 'future>=0.17.1'
$ pip install -U keras_applications --no-deps
$ pip install -U keras_preprocessing --no-deps

now the main part,

$ ./configure
You have bazel 2.0.0 installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3.7

Found possible Python library paths:
  /usr/local/lib/python3.7/dist-packages
  /usr/lib/python3.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python3.7/dist-packages]

Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Amazon AWS Platform support? [Y/n]: n
Amazon AWS Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
Apache Kafka Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: Y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10.1]: 10.1

Please specify the location where CUDA 10.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.6]: 

Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: developer.nvidia.com/cuda-gpus
Please note that each additional compute capability significantly increases your
build time and binary size. [Default is: 3.5,7.0] 5.2,6.1,7.0

Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
    --config=mkl            # Build with MKL support.
Configuration finished
    --config=monolithic     # Config for mostly static monolithic build.

After this you're almost there.

Build

Now using bazel you will build the package,

bazel build //tensorflow/tools/pip_package:build_pip_package

If your PC is freezing and hanging due to this build, use this command to build package rather than that,

bazel build --local_ram_resources=2048 --jobs=1 //tensorflow/tools/pip_package:build_pip_package

This other command will take wayyyy more time but it will get the job done without freezing the PC at any point at all!

Build the package

$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Install the package

$ pip install /tmp/tensorflow_pkg/tensorflow-version-tags.whl

These links helped me the most, https://github.com/Iolaum/CompileTF/blob/master/README.md

https://gist.github.com/kmhofmann/e368a2ebba05f807fa1a90b3bf9a1e03

https://www.tensorflow.org/install/source

https://github.com/bazelbuild/bazel/releases

tensorflow/tensorflow#31804

About

How to built tensorflow from source, on Xeon processor, ubuntu 19.x, tf 2.x with GPU support.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published